
Chandrasing Patil


ReST: The Need to Talk

Originally published on HackerNoon, this post is now available on dev.to.

In the history of human existence, communication serves as the indispensable thread that weaves connections, understanding, and knowledge. Picture this: in a corner of the comic world, on November 7, 1989, Calvin from Calvin and Hobbes engages in a conversation with his dad over the phone. This seemingly ordinary scene encapsulates a universal truth: the need to talk.

Calvin needs help

No one is omniscient; Calvin, for one, didn't know 11 + 7. The complexities of life, the myriad of experiences, and the wealth of information that surrounds us demand an incessant quest for understanding. We need to talk to others, at times, to seek answers, perspectives, or simply to share our thoughts. However, this act of communication is not haphazard; it follows a set of unspoken rules, guidelines, and etiquette that transform it into a nuanced art. Communication is, in essence, the bridge that connects minds, allowing ideas to flow, emotions to be shared, and knowledge to be transferred. Yet, mastering this art is no small feat. The intricacies of language, the subtleties of tone, and all those non-verbal cues contribute to the richness of human interaction.

Now, as a coder, I find a parallel allure in translating this intricate communication into the realm of machines talking to machines. In the vast landscape of programming, where precision and clarity are paramount, communication takes a different form.

API

API stands for "Application Programming Interface". Though it sounds too technical, it is just a term for machines talking to machines. As with any communication, it comes in varied kinds: SOAP (Simple Object Access Protocol), ReST (Representational State Transfer) and RPC (Remote Procedure Call), to name a few.

It's worth noting that communication in the digital realm isn't confined to APIs alone. There's also the intriguing realm of event-driven communication – an asynchronous data exchange that doesn't always wear the API label prominently. Yet, it plays a crucial role in orchestrating seamless interactions between different components, making systems agile and responsive.

ReST

I have been dealing with web APIs, flavors of ReST, for almost two decades now. I say flavors of ReST because ReST is not as rigid as protocols such as SOAP, simply because it is not a protocol but rather a set of guidelines, an architectural style.

That sets ReST apart from the others in a very distinct way. ReST makes machine-to-machine communication more humane, simple and natural. Detailing a ReST API is more philosophical than technical, because one always starts with nouns when designing a ReST API.

What's in the Name?

Seriously, what's in the name? Why are we so obsessed with names and naming things? I blogged about it a decade ago. It was, and still is, my opinion that names are nothing more than an abstraction over details. Without names, conversation would be wordy and verbose, and still not very clear. We name things and, to make things clearer, we add more discriminating names. The more nouns there are, the richer the vocabulary. Eskimos have twenty-odd names for different kinds of ice. Names are not limited to things or places; they also cover many actions.

Actions

But there is a stark difference between names and actions. If names and actions are sets, in the mathematical sense, you would notice that the cardinality (the number of items in the set) of names increases over time, whereas the same is not true for actions. Cut, as an action, has the same name whether you are cutting paper or cutting vegetables. Actions are thus polymorphic in nature: the same action can be tied to multiple nouns.

HTTP

Enough with philosophy; let's now focus on technicalities. ReST came along with HTTP, and the two go hand in glove. There are a few misunderstandings about HTTP, mostly about its usage. Most folks look at HTTP as a transport protocol. They are not to blame, though: you can download files with HTTP, so one can easily fall for it. The confusion also stems from what the abbreviation literally stands for:

Hyper-Text Transfer Protocol

When you perform an action using HTTP, it usually involves some sort of document shuffle: you are either sending a document to, or receiving one from, a server. The word you really want to focus on is Hyper-Text. Since the substance of a sentence is usually in the middle, many unfortunately latched onto Transfer rather than Hyper-Text, and then used Transport as a synonym for Transfer, although the two are different terms.

The fact is that HTTP is an application protocol.

It works on top of real transport protocols: TCP (connection-oriented) for HTTP/1.1 and HTTP/2, or QUIC, which runs over UDP (connectionless), in the case of HTTP/3.

HTTP Verbs (Methods)

To shuffle documents, HTTP defines only a handful of verbs. This post limits itself to GET, PUT, DELETE and POST:

  • GET is to fetch a document
  • PUT is to place a document
  • DELETE is to remove a document
  • And let’s not talk about POST just yet, let's circle back to it later.
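The document-shuffling verbs can be sketched against an in-memory store. This is only an illustration: `DocumentStore` and its method names are hypothetical, a Python dict stands in for the server, and the status codes are the ones HTTP servers conventionally return for these outcomes.

```python
from typing import Dict, Optional, Tuple


class DocumentStore:
    """Hypothetical in-memory stand-in for a server holding documents by path."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}

    def get(self, path: str) -> Tuple[int, Optional[str]]:
        # GET fetches a document, or reports that there is none.
        if path in self._docs:
            return 200, self._docs[path]
        return 404, None

    def put(self, path: str, document: str) -> Tuple[int, Optional[str]]:
        # PUT places a document at the path the client chose.
        created = path not in self._docs
        self._docs[path] = document
        return (201 if created else 200), document

    def delete(self, path: str) -> Tuple[int, Optional[str]]:
        # DELETE removes the document if it exists.
        if self._docs.pop(path, None) is None:
            return 404, None
        return 204, None
```

Note that the same three verbs work unchanged whether the path names an order, a photo or anything else, which is the polymorphism of actions described earlier.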

A Dialogue

Typical HTTP dialogue

The image above depicts a simple dialogue between a user and some web application. Each message in this dialogue is some sort of document being shuffled between the parties. Let's not worry about what's shuffled for now. It traces a workflow for the web application.

Media Types

The data shuttled between the server and the user follows an agreed-upon format so that both parties can make sense of it. This format is identified by a media type. Let's look at a few examples.

  • text/plain
  • text/html
  • image/jpeg
  • application/xml
  • application/vnd.example+xml

These look familiar, except maybe the last one, and I am not referring to the formatting.

Let’s read the last media type

  • application
    The media is for consumption by an application, not by humans

  • vnd
    This is a vendor-defined custom media type

  • example
    This is the name the vendor gave to the media type

  • +xml
    This media type is based on the already known xml media type
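The reading above can be expressed as a small parser. This is a rough sketch, not a complete media type parser: `MediaType` and `parse_media_type` are hypothetical names, and parameters such as `; charset=utf-8` are deliberately not handled.

```python
from typing import NamedTuple, Optional


class MediaType(NamedTuple):
    top_level: str          # e.g. "application"
    vendor: bool            # True when the subtype starts with "vnd."
    name: str               # e.g. "example"
    suffix: Optional[str]   # e.g. "xml", or None when there is no "+suffix"


def parse_media_type(value: str) -> MediaType:
    top_level, _, subtype = value.strip().lower().partition("/")
    name, plus, suffix = subtype.rpartition("+")
    if not plus:
        # No "+suffix" present: rpartition leaves the whole subtype in `suffix`.
        name, suffix = suffix, None
    vendor = name.startswith("vnd.")
    if vendor:
        name = name[len("vnd."):]
    return MediaType(top_level, vendor, name, suffix)
```

For example, `parse_media_type("application/vnd.example+xml")` yields the four parts listed above, while `parse_media_type("text/plain")` has neither a vendor prefix nor a suffix.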

Hyper-Media Types

We have seen media types; now what does Hyper mean? A user uses some application which talks to the server on the user's behalf. The applications employed by users are referred to as user-agents. The web browser is one example of a user-agent. These user-agents understand and make sense of the media which flows to and from the server.

A Nudge

An application on the server must have an agenda, as the conversation needs closure: it must end either successfully or unsuccessfully. As with any dialogue, the user-agent and the application on the server take turns, with each next step following the selection made by the user-agent on behalf of the user. The web application is simply nudging the user towards closure of the ongoing interaction.

Consider the following interaction, which depicts one request and its reply.

Request

POST /orders
Host: example.org
Content-type: application/vnd.example+xml
Accept: application/vnd.example+xml

<order xmlns="http://schemas.example.org/">
  <item qty="2">ITEM-1</item>
</order>

Reply

HTTP/1.1 201 Order created
Location: http://example.org/orders/1234
Content-Type:  application/vnd.example+xml

<order xmlns="http://schemas.example.org/">
  <item qty="2">ITEM-1</item>

  <next xmlns="http://schemas.example.org/state-machine"
    rel="payment"
    uri="http://example.org/orders/1234/payment"
  />

  <next xmlns="http://schemas.example.org/state-machine"
    rel="self"
    uri="http://example.org/orders/1234"
  />
</order>

Let's see what a browser, the user-agent, is sending as a request

  • Request Method
    One of HTTP verbs. In above request it is POST

  • Request URI
    URI of the resource. In above request it is /orders

  • Request Headers
    Accept lets the server know which media type the user-agent is interested in

  • Content-Type
    Tells the server what format the data being sent in the request is in

  • Request Body
    And finally the data which is being sent to the server.

Let's break down the response received from the server

  • User-agent coordination
    It talks about the consequence of the request

    • HTTP/1.1 201 Order created
      201 is a status code telling the user-agent that the request has been executed successfully and that, as a consequence, a resource has been created. "Resource" may sound too technical, but it simply means that a new object has been created on the server.
    • Location: http://example.org/orders/1234
      Location is a response header. It tells the user-agent where the newly created resource lives, so the user-agent can follow the link if it is interested.
    • Content-Type: application/vnd.example+xml
      It simply tells the user-agent that the data being sent to it uses the application/vnd.example+xml media type.
  • The media
    The rest of the bytes sent by the server are simply the actual data.

Now, since the media type in this case is a custom one, it is up to the user-agent to make sense of it. In the example, the user-agent may be smart enough to parse the response data and

  • Proceed to payments
  • Or simply interact with the newly created order resource.

In either case the user-agent may use various HTTP verbs to interact with these resources.

The server is simply nudging the user, via the user-agent, to finish the workflow of placing an order.
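How a user-agent might discover those next steps can be sketched with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`next`, `rel`, `uri`) and the namespace come from the example reply above; the function name `next_links` is hypothetical, and a real user-agent would need error handling that this sketch leaves out.

```python
import xml.etree.ElementTree as ET
from typing import Dict

STATE_MACHINE_NS = "http://schemas.example.org/state-machine"

RESPONSE_BODY = """
<order xmlns="http://schemas.example.org/">
  <item qty="2">ITEM-1</item>
  <next xmlns="http://schemas.example.org/state-machine"
    rel="payment"
    uri="http://example.org/orders/1234/payment"
  />
  <next xmlns="http://schemas.example.org/state-machine"
    rel="self"
    uri="http://example.org/orders/1234"
  />
</order>
"""


def next_links(body: str) -> Dict[str, str]:
    """Map each link relation offered by the server to its URI."""
    root = ET.fromstring(body)
    links: Dict[str, str] = {}
    # ElementTree spells namespaced tags as "{namespace}localname".
    for link in root.iter(f"{{{STATE_MACHINE_NS}}}next"):
        links[link.attrib["rel"]] = link.attrib["uri"]
    return links
```

The user-agent never hard-codes the payment URI; it only needs to know the relation name `payment` and follows whatever URI the server hands out.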

Since the decision taken by the user-agent has more than one possible outcome, there are a few possible interactions.

Say the user wants to cancel the order

Request

DELETE /orders/1234
Host: example.org
Accept: application/vnd.example+xml

Reply

HTTP/1.1 200 Okay
Content-Type:  application/vnd.example+xml


<order xmlns="http://schemas.example.org/">
  <item qty="2">ITEM-1</item>
  <status>Cancelled</status>
</order>

Or the user wants to proceed with payment

Request

POST /orders/1234/payment
Host: example.org
Content-Type:  application/vnd.example+xml
Accept: application/vnd.example+xml


<payment orderId="1234" xmlns="http://schemas.example.org/"></payment>

Reply

HTTP/1.1 201 Receipt generated
Location: http://example.org/orders/1234/receipt
Content-Type:  application/vnd.example+xml


<order xmlns="http://schemas.example.org/">
  <item qty="2">ITEM-1</item>

  <status>Not Ready</status>

  <next xmlns="http://schemas.example.org/state-machine"
    rel="self"
    uri="http://example.org/orders/1234"
  />
</order>

In all these interactions the user is presented with enough coordination data, and the media itself, to be able to complete the workflow successfully.

If the user-agent followed the payment link and the payment was processed successfully, a receipt is generated as a side effect. The coordination data says so with the 201 status and gives the URI of the receipt in the Location header. The client can now poll the self URI to see if the order is ready.

Polling may sound stupid, but it is extremely effective when combined with caches. Caches lessen the burden on the server because, more often than not, the order is simply not ready immediately: it takes time to prepare an order.
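The effect of a cache on polling can be sketched as follows. This is an assumption-laden illustration: `CachedFetcher` is a hypothetical name, freshness is a single fixed `max_age` rather than anything read from real HTTP caching headers, and the clock is injected so the behaviour can be checked without waiting.

```python
from typing import Callable, Dict, Tuple


class CachedFetcher:
    """Hypothetical cache that serves a stored response while it is still fresh."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        max_age: float,
        clock: Callable[[], float],
    ) -> None:
        self._fetch = fetch      # the expensive call to the origin server
        self._max_age = max_age  # seconds a stored response stays fresh
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, uri: str) -> str:
        now = self._clock()
        entry = self._entries.get(uri)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]  # fresh: the origin server is not contacted
        body = self._fetch(uri)
        self._entries[uri] = (now, body)
        return body
```

With a five-second freshness window, a client polling once per second reaches the origin server only once every five polls; the rest are answered from the cache.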

Curious case of POST

The important point is that the client has some part of the application state cached, either locally or along the chain.

Now, about POST, which was introduced in a subtly different way. I am treating it after the other verbs. Creation is mostly associated with POST, when in fact a GET followed by a PUT can do the job.

For example, creating a new order can be a series of operations such as

  • GET the last order
  • Create a new order locally and PUT the new order back

Then why POST?

The answer is quite obvious; I am merely making it explicit now. One important thing to note is that this is a multi-user application.

Let's say the new order-id is a function of the number of orders in the collection + 1. Since it is a multi-user application, more than one user may want to create a new order simultaneously. All these users ask for (GET) the last order-id, and all end up with the same one. When these users PUT their new orders, the last person to do so wins. Users learn of failure or success only by GETting the order by order-id and verifying it against the locally created order. Failed users simply must redo the same sequence till they succeed. It is better if users just let the server know of their intention to create a new order: the almighty server, having a holistic view of the application state, is in a better position to process the requests. GET, PUT, DELETE or any sequence of them is not sufficient to express such intent, hence POST. No wonder it sounds and feels alien.
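The lost update can be reproduced with a deterministic interleaving. This is a single-threaded sketch under stated assumptions: `OrderCollection` and the two helper functions are hypothetical, the order-id rule is the "count + 1" rule from the paragraph above, and a real server would additionally need a lock or transaction around `post`.

```python
from typing import Dict, Tuple


class OrderCollection:
    """Hypothetical server-side collection of orders keyed by order-id."""

    def __init__(self) -> None:
        self._orders: Dict[int, str] = {}

    def count(self) -> int:                          # what a GET would reveal
        return len(self._orders)

    def put(self, order_id: int, order: str) -> None:  # client picks the id
        self._orders[order_id] = order

    def post(self, order: str) -> int:               # server picks the id
        order_id = self.count() + 1
        self._orders[order_id] = order
        return order_id

    def get(self, order_id: int) -> str:
        return self._orders[order_id]


def create_with_get_then_put(orders: OrderCollection) -> Tuple[int, int]:
    # Both users GET before either of them PUTs, so both compute the same id.
    alice_id = orders.count() + 1
    bob_id = orders.count() + 1
    orders.put(alice_id, "alice's order")
    orders.put(bob_id, "bob's order")  # silently overwrites alice's order
    return alice_id, bob_id


def create_with_post(orders: OrderCollection) -> Tuple[int, int]:
    # Each user only states the intent; the server assigns distinct ids.
    return orders.post("alice's order"), orders.post("bob's order")
```

In the GET-then-PUT variant both users end up with order-id 1 and only the last write survives, whereas the POST variant leaves two distinct orders on the server.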

Cache Consistency

Caches introduce a consistency problem, though. Say the client has been told to cache a response for a few seconds, and meanwhile the order becomes ready. The client, based on the cached response, decides to cancel the order. In this case client and server state have diverged. To make it safe to cancel, the client can add a condition in the form of an If-Match header. In real HTTP the If-Match value is an entity tag (ETag) taken from an earlier response; the request below uses the order status as a simplified stand-in. If the condition does not match, the server simply refuses to process the request, telling the client that the precondition to honor the request has failed. In the unlikely event that the client did not send an If-Match header, the server can still report the divergence as 409 Conflict.

Request

DELETE /orders/1234
Host: example.org
If-Match: status=Not Ready
Accept: application/vnd.example+xml

Reply

HTTP/1.1 412 Precondition Failed
Content-Type:  application/vnd.example+xml


<order xmlns="http://schemas.example.org/">
  <item qty="2">ITEM-1</item>

  <status>Ready</status>

  <next xmlns="http://schemas.example.org/state-machine"
    rel="self"
    uri="http://example.org/orders/1234"
  />
</order>
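On the server side, the precondition check might look roughly like this. It is a sketch under assumptions: entity tags are derived here by hashing the representation, which is one common choice and not something the post prescribes; `etag_for` and `conditional_delete` are hypothetical names; and only a single tag or `*` is accepted, although the real header may carry a list.

```python
import hashlib
from typing import Dict, Tuple


def etag_for(representation: str) -> str:
    """Derive a quoted entity tag from the bytes of a representation."""
    digest = hashlib.sha256(representation.encode("utf-8")).hexdigest()[:16]
    return f'"{digest}"'


def conditional_delete(
    orders: Dict[str, str], path: str, if_match: str
) -> Tuple[int, str]:
    """Delete the order only if the client's view of it is still current."""
    current = orders.get(path)
    if current is None:
        return 404, ""
    if if_match != "*" and if_match != etag_for(current):
        # The client acted on a stale representation: refuse, and show it
        # the current state so it can decide again.
        return 412, current
    del orders[path]
    return 200, current
```

A client that cached the "Not Ready" representation sends the tag it received with it; once the order has moved to "Ready" the tags no longer match and the cancellation is refused with 412.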

Cancelled the order but Why?

As a business user, wouldn't you want to know why the order was cancelled? Such reasons go beyond API design and delve into business asks. If the API has implemented DELETE to cancel the order, a typical interaction may look as below.

Request

DELETE /orders/1234
Host: example.org
Accept: application/vnd.example+xml

Reply

HTTP/1.1 200 Okay
Content-Type:  application/vnd.example+xml


<order xmlns="http://schemas.example.org/">
  <item qty="2">ITEM-1</item>
  <status>Cancelled</status>
</order>

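The problem with the above interaction is that HTTP defines no semantics for a body on a DELETE request, and servers or intermediaries may ignore or reject one, so there is nowhere reliable to put the reason. How can we solve this?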

Let’s try using POST

Request

POST /orders
Host: example.org
Content-Type: application/x-www-form-urlencoded
Accept: application/vnd.example+xml

OrderId=1234&Operation=Cancel&Reason=Just+to+annoy+you

Reply

HTTP/1.1 200 Okay
Content-Type:  application/vnd.example+xml


<order xmlns="http://schemas.example.org/">
  <item qty="2">ITEM-1</item>
  <status>Cancelled</status>
</order>

This does not even look right from the get-go. It feels like we are performing a business operation specified by Operation. Operation is now a keyword that needs handling beyond the limited set of HTTP verbs. This way the API would keep adding more and more verbs, each needing special handling.

Can we do better?

Once we constrain ourselves to just the four HTTP verbs discussed here, we can definitely do better. Let's revisit the same problem with that API design constraint in mind.

Do not forget that you have a limitless supply of nouns. And we are naturally wired to produce random names all the time.

Request

POST /CancelledOrders
Host: example.org
Content-Type: application/x-www-form-urlencoded
Accept: application/vnd.example+xml

OrderId=1234&Reason=Just+to+annoy+you

Reply

HTTP/1.1 200 Okay
Content-Type:  application/vnd.example+xml

<order xmlns="http://schemas.example.org/">
  <item qty="2">ITEM-1</item>
  <status>Cancelled</status>
</order>
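Treating the cancellation as a noun means the server keeps a collection of cancellations rather than interpreting an operation keyword. A minimal sketch, assuming hypothetical names `Cancellation`, `Shop` and `post_cancelled_order` and an in-memory model of the server:

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Cancellation:
    order_id: int
    reason: str


class Shop:
    """Hypothetical server state: orders plus a collection of cancellations."""

    def __init__(self) -> None:
        self.order_status: Dict[int, str] = {}
        self.cancelled_orders: List[Cancellation] = []

    def post_cancelled_order(self, order_id: int, reason: str) -> Tuple[int, str]:
        # POST /CancelledOrders: add a new member to the collection.
        if order_id not in self.order_status:
            return 404, ""
        self.cancelled_orders.append(Cancellation(order_id, reason))
        self.order_status[order_id] = "Cancelled"
        return 200, self.order_status[order_id]
```

The reason now travels in an ordinary POST body and is stored with the cancellation, so no new verb or operation keyword had to be invented.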

Though it sounds simple to come up with names, it is very tedious. No wonder diehard proponents of ReST, such as Mark Baker or Tim Berners-Lee, suggested not adding any meaning to URIs and making them opaque.

Opaque URIs relieve us of the need to produce new nouns every now and then, at the cost of a comprehensible API. They also make documenting the API difficult, as API end-points are decided at run-time. Also, security and related features can only be applied once the opaque URIs are established.

Roy Fielding, on the other hand, is of the opinion that URIs are meaningful, and that sort of makes more sense. Opaque URIs are for lazy folks.

Representations

Now let's focus on the Re of ReST. It is not the resource itself which is sent to the user-agent. The user-agent asks for a specific type of media it can understand. Let's consider the following HTTP interaction.

Request

GET /orders/1234
Host: example.org
Accept: text/plain

Reply

If the web application cannot produce the requested media type, it can simply state this inability as

HTTP/1.1 406 Not Acceptable

Or if it does indeed support the requested media type

HTTP/1.1 200 Okay
Content-Type: text/plain


Order 1234 is not ready

If the user-agent requests a media type supported by the web application, the application can send the resource data formatted to fit that media type. It is the same resource, just formatted as different media types, and all these variations of the same resource are called representations. Since the data which flows between the user-agent and the server can be any of these representations, the style is known as Representational State Transfer, or simply ReST.
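Choosing a representation from the Accept header can be sketched as a lookup from media type to renderer. Treat this as a simplification: `RENDERERS` and `represent` are hypothetical names, the order is a plain dict, and quality values (`;q=`) are ignored in favour of the first supported type. When nothing matches, the sketch answers 406 Not Acceptable, the status HTTP defines for an Accept header the server cannot satisfy.

```python
from typing import Callable, Dict, Tuple

Renderer = Callable[[dict], str]

# One resource, several representations: each renderer formats the same order.
RENDERERS: Dict[str, Renderer] = {
    "text/plain": lambda o: f"Order {o['id']} is {o['status']}",
    "application/vnd.example+xml": lambda o: (
        f'<order xmlns="http://schemas.example.org/">'
        f'<status>{o["status"]}</status></order>'
    ),
}


def represent(order: dict, accept: str) -> Tuple[int, str, str]:
    """Return (status code, chosen media type, body) for an Accept header."""
    for candidate in (part.split(";")[0].strip() for part in accept.split(",")):
        render = RENDERERS.get(candidate)
        if render is not None:
            return 200, candidate, render(order)
    return 406, "", ""
```

The order itself never changes; only the renderer picked for it does, which is exactly the sense in which these are representations of one resource.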

Further Reading
Roy Fielding’s dissertation
