April 4, 2012

Take your REST API to the next level with HATEOAS

A pragmatic approach to understanding HATEOAS

HATEOAS, or "Hypermedia As The Engine Of Application State", is largely about discoverability in a RESTful API. It is also the part that most REST APIs miss out on. This is sad because there are a lot of benefits to reap if you take your API to the next level, especially for the developers using your API.

Some of the benefits of using HATEOAS include simpler clients, simplified authorization control on the client side, and a self-documenting API. If you are new to the concept, Martin Fowler's article on the Richardson Maturity Model is a good start. To summarize it, the model defines four levels of maturity for the RESTfulness of an API:
  1. One URI, one HTTP method
  2. Multiple resources instead of a single endpoint
  3. Use of HTTP verbs
  4. Discoverability by using hypermedia
This model can help us understand the ideas behind REST, and to fully experience the "glory of REST" an API must pass all four levels. The last level is what HATEOAS is all about, and I will spend the rest of this post discussing a somewhat pragmatic approach to bringing HATEOAS to your REST API.* Before we move on, it is also worth mentioning another good article on the topic, written by Rickard Öberg, that you might want to check out.

Discoverability

What the discoverability of your REST API means in practice is that a developer using your API should not need to refer to some documentation to place calls to random URLs in order to execute the use cases in your API, nor should she need to have the URLs hardcoded in the client application. Let us use a simple example to illustrate this.

Imagine we have a simple service for trading equities. In this service an order is a resource that we can obtain by doing a GET request to http://foo.bar/trading/equities/orders/12345. The response contains details about the specified order as well as information about what actions we can perform from where we are. Those actions are specified as links in the response. In the JSON response below there are two links available. One is the self link, which identifies the current resource. The other specifies the action we can perform on the order, namely deleting it. The delete link also specifies the HTTP method to use when deleting an order. In the API for this trading service, GET is considered the default HTTP method for links, so it is omitted in the self link.
{
   // ... various order data ...

  "links" : [ {
    "rel" : "self",
    "href" : "http://foo.bar/trading/equities/orders/12345"
  }, {
    "rel" : "delete",
    "href" : "http://foo.bar/trading/equities/orders/12345",
    "method" : "DELETE"
  } ]
}
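To make this concrete, here is a minimal sketch of how a client might pick a link out of such a response. The helper name find_link and the trimmed response body are my own for illustration and not part of any real library. The sketch assumes, as described above, that a link without a method attribute means GET.

```python
import json

# Trimmed copy of the order response above; the order data itself is left out.
ORDER_RESPONSE = """
{
  "links": [
    {"rel": "self", "href": "http://foo.bar/trading/equities/orders/12345"},
    {"rel": "delete", "href": "http://foo.bar/trading/equities/orders/12345", "method": "DELETE"}
  ]
}
"""


def find_link(resource, rel):
    """Return (method, href) for the first link with the given rel, or None."""
    for link in resource.get("links", []):
        if link.get("rel") == rel:
            # GET is the default method in this API, so the attribute may be missing.
            return link.get("method", "GET"), link["href"]
    return None


order = json.loads(ORDER_RESPONSE)
self_link = find_link(order, "self")
delete_link = find_link(order, "delete")
missing_link = find_link(order, "cancel")  # no such rel in the response
```

A nice side effect is that the absence of a link carries meaning: if the server leaves out the delete link, the client simply does not offer the delete action.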
This is pretty straightforward and not very hard to grasp, but what about when the link that the client application should follow depends on some user input? Well, we simply parameterize the link. For example, let us say there is a business requirement to present a form where the user can enter an arbitrary order id and delete that order. Then a delete link would look like this:
{
  "rel" : "delete",
  "href" : "http://foo.bar/trading/equities/orders/{orderId}",
  "method" : "DELETE"
}
Note the URI template style of the href attribute. The client will use the template and replace {orderId} with the order id entered by the user.

The same idea with URI templates is also applicable for query parameters. Consider this link for performing a search for equities:
{
  "rel" : "search",
  "href" : "http://foo.bar/trading/equities/search?q={searchQuery}"
}
The client will substitute the {searchQuery} with the query typed in by the user.
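Filling in such a template could be sketched as follows. This handles only the plain {name} placeholders used in this post and not the full URI Template syntax of RFC 6570. The function name expand is an assumption made for this example.

```python
import re
from urllib.parse import quote

# Matches simple placeholders such as {orderId} or {searchQuery}.
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def expand(template, values):
    """Replace each {name} placeholder with its percent-encoded value."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder {name!r}")
        # safe="" also encodes "/" so user input cannot change the path structure.
        return quote(str(values[name]), safe="")
    return PLACEHOLDER.sub(substitute, template)


delete_url = expand(
    "http://foo.bar/trading/equities/orders/{orderId}", {"orderId": 12345}
)
search_url = expand(
    "http://foo.bar/trading/equities/search?q={searchQuery}",
    {"searchQuery": "apple inc"},
)
```

The user input is percent-encoded before it goes into the URL, so a search for "apple inc" ends up as q=apple%20inc.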

It is important to note that the client is not constructing any URLs from scratch. It is still discovering the URLs via the provided links and the placeholders are there to tell the client where it should put the user input.

I should mention that an alternative to using link elements would be to use HTTP Link headers. However, Link headers are not as widely adopted and it would be easier for most clients and libraries to use link elements. So in the spirit of keeping things pragmatic I would recommend using link elements.

What about POSTing?

All the examples we have seen so far have been fairly simple because all information passed to the server can be put in the URL for HTTP methods like GET and DELETE. But in our imaginary trading service we also have the ability to enter new orders. An order is entered for a specific equity with id 123456 by POSTing relevant data to the URL http://foo.bar/trading/equities/123456. Since we are looking for discoverability in our API, we need a way for the client to figure out what data it should send in a POST request and how that data should be formatted so the server can understand it.

First off, we need to provide some sort of template describing the data that can be POSTed, and secondly, we need a way to tell what template goes with what link. The way we do this is by introducing the concept of commands. We extend the links with a command attribute that specifies the name of the command that should be used with that link. The command itself simply contains the template. The link for entering a new order would look like this after applying a command to it:
{
  "rel" : "add",
  "href" : "http://foo.bar/trading/equities/123456",
  "method" : "POST",
  "command" : "addOrder"
}
And the addOrder command will look like this:
{
  "commandName" : "addOrder",
  "template" : {
    // Some template
  }
}
Now that we have the command structure in place, we need to choose a format for the template that will describe the POSTed data. If our API were using HTML, this could be accomplished by returning an HTML form with all the required input fields. However, our API is using JSON, so we cannot do that.

JSON Schema to the rescue. By using a JSON Schema as our template we can describe the JSON data that should be POSTed, and the client can parse this schema on the fly and present a form that allows the user to fill in the relevant fields. The JSON Schema can also be used to describe which values are valid for fields that have limited input options. For example, an order in our trading service can only be of one of three types: Fill-and-Kill, Fill-or-Kill, and Good-'Til-Cancelled. The schema lets us specify the possible values, and the client could present these to the user as a drop-down.

To put this in context, here is how we would go about entering a new order. First we need to get to the instrument (equity) we want to trade in. This is done by performing a search or viewing a list of all instruments. From the search hit we get a link, http://foo.bar/trading/equities/123456, that we can perform a GET on to get the details of the instrument. (For simplicity we assume that an instrument has only one orderbook that you can place an order in. In reality there are several orderbooks, but that is more domain detail than we need for this example.) The GET request would then return information about the instrument and a link to the URL for entering a new order:
{
  "instrument" : {
    "instrumentKey" : {
      "mic" : "XNAS",
      "isin" : "US0378331005",
      "currency" : "USD"
    },
    "orderbookId" : 123456,
    "longName" : "Apple Inc.",
    "shortName" : "AAPL"
  },
  "links" : [ {
    "rel" : "self",
    "href" : "http://foo.bar/trading/equities/123456"
  }, {
    "rel" : "add",
    "href" : "http://foo.bar/trading/equities/123456",
    "method" : "POST",
    "command" : "addOrder"
  } ],
  "commands" : [ {
    "commandName" : "addOrder",
    "template" : {
      "type" : "object",
      "properties" : {
        "size" : {
          "type" : "number"
        },
        "openSize" : {
          "type" : "number"
        },
        "price" : {
          "type" : "number"
        },
        "marketPriceOrder" : {
          "type" : "boolean"
        },
        "side" : {
          "type" : "string",
          "enum" : [ "BUY", "SELL" ]
        },
        "type" : {
          "type" : "string",
          "enum" : [ "FaK", "FoK", "GTC" ]
        },
        "expireDate" : {
          "type" : "string"
        }
      }
    }
  } ]
}
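On the client side, pairing a link with its command and turning the template into form fields might look like the sketch below. The helper names command_for and form_fields, the widget names, and the trimmed INSTRUMENT structure are assumptions for this example. Only the type and enum keywords of the schema are looked at.

```python
# Trimmed version of the instrument response above, already parsed from JSON.
INSTRUMENT = {
    "links": [
        {"rel": "self", "href": "http://foo.bar/trading/equities/123456"},
        {"rel": "add", "href": "http://foo.bar/trading/equities/123456",
         "method": "POST", "command": "addOrder"},
    ],
    "commands": [{
        "commandName": "addOrder",
        "template": {
            "type": "object",
            "properties": {
                "price": {"type": "number"},
                "marketPriceOrder": {"type": "boolean"},
                "side": {"type": "string", "enum": ["BUY", "SELL"]},
            },
        },
    }],
}


def command_for(resource, rel):
    """Return (link, template) for the given rel; either part may be None."""
    link = next((l for l in resource.get("links", []) if l.get("rel") == rel), None)
    if link is None:
        return None, None
    templates = {c["commandName"]: c["template"] for c in resource.get("commands", [])}
    return link, templates.get(link.get("command"))


def form_fields(template):
    """Describe each property as (name, widget, choices) for a UI to render."""
    fields = []
    for name, spec in template["properties"].items():
        if "enum" in spec:
            fields.append((name, "dropdown", spec["enum"]))
        elif spec.get("type") == "boolean":
            fields.append((name, "checkbox", None))
        else:
            fields.append((name, "text", None))
    return fields


add_link, add_template = command_for(INSTRUMENT, "add")
fields = form_fields(add_template)
```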
The client would then examine the template given in the addOrder command, present a way for the user to enter necessary input and then POST that entered data to the URL specified by the add link. The POSTed data would look like this:
{
  "size" : 250.0,
  "openSize" : 250.0,
  "price" : 618.63,
  "side" : "BUY",
  "type" : "FaK",
  "expireDate" : "2012-04-10T00:00:00.000+0000"
}
So with the use of commands and templates we have now provided a way for the client to discover how to perform POST requests to the API.
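Before POSTing, a client could also check the user's input against the template. The sketch below is a hand-rolled check that only understands the type and enum keywords used in this post. It is not a JSON Schema implementation, and in a real client you would probably reach for a proper validator library. The function name check and the trimmed TEMPLATE are assumptions for this example.

```python
# Trimmed copy of the addOrder template shown earlier.
TEMPLATE = {
    "type": "object",
    "properties": {
        "size": {"type": "number"},
        "price": {"type": "number"},
        "side": {"type": "string", "enum": ["BUY", "SELL"]},
        "type": {"type": "string", "enum": ["FaK", "FoK", "GTC"]},
    },
}

TYPE_CHECKS = {
    # bool is a subclass of int in Python, so exclude it from "number".
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
    "boolean": lambda v: isinstance(v, bool),
}


def check(payload, template):
    """Return a list of problems; an empty list means the payload fits."""
    problems = []
    for name, spec in template["properties"].items():
        if name not in payload:
            continue  # the template in this post does not mark fields as required
        value = payload[name]
        if not TYPE_CHECKS[spec["type"]](value):
            problems.append(f"{name}: expected {spec['type']}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name}: must be one of {spec['enum']}")
    return problems


good = check({"size": 250.0, "price": 618.63, "side": "BUY", "type": "FaK"}, TEMPLATE)
bad = check({"size": "many", "side": "HOLD"}, TEMPLATE)
```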

One URL to rule them all

What we have not discussed so far is the starting point, or base URL, of our API. The base URL of the API should ideally be the only URL the client needs to know in advance. The rest of the API should be discovered via the links provided by the service.

The starting point of our example API is the URL http://foo.bar/trading/equities/. If we did a GET request on that URL we would get back a collection of links that describe all possible interactions. The response would look something like this:
{
  "links" : [ {
    "rel" : "self",
    "href" : "http://foo.bar/trading/equities/"
  }, {
    "rel" : "list",
    "href" : "http://foo.bar/trading/equities"
  }, {
    "rel" : "search",
    "href" : "http://foo.bar/trading/equities/search?q={searchQuery}"
  }, {
    "rel" : "orders",
    "href" : "http://foo.bar/trading/equities/orders/"
  } ]
}
Notice that the only thing that differs between the self and list links is the trailing slash on the self link's URL. This is a subtle but important difference. A GET request on a URL that ends with a slash will return the possible links for that URL. It will be the "index page" if you will. On the other hand, a URL without a trailing slash would be treated as a resource. In this case the list link would return all the equities. As you can see, the orders link also has a trailing slash so we can use that URL to continue to discover other parts of the API. This is the foundation of the discoverability in our API.
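The trailing-slash convention makes it possible for a client to walk the API from the base URL. Here is a rough sketch of that idea, using an in-memory dictionary as a stand-in for the server so the example runs without a network. The contents of the orders index and the names discover and is_index are invented for this example.

```python
BASE_URL = "http://foo.bar/trading/equities/"

# Stand-in for the server: each index URL maps to its collection of links.
FAKE_SERVER = {
    "http://foo.bar/trading/equities/": {"links": [
        {"rel": "self", "href": "http://foo.bar/trading/equities/"},
        {"rel": "list", "href": "http://foo.bar/trading/equities"},
        {"rel": "search", "href": "http://foo.bar/trading/equities/search?q={searchQuery}"},
        {"rel": "orders", "href": "http://foo.bar/trading/equities/orders/"},
    ]},
    # Hypothetical orders index; the post does not show this response.
    "http://foo.bar/trading/equities/orders/": {"links": [
        {"rel": "self", "href": "http://foo.bar/trading/equities/orders/"},
        {"rel": "list", "href": "http://foo.bar/trading/equities/orders"},
    ]},
}


def is_index(url):
    """A URL ending in a slash is an index page listing further links."""
    return url.endswith("/")


def discover(url, fetch, seen=None):
    """Walk index pages depth-first and return {index_url: [rel, ...]}."""
    seen = {} if seen is None else seen
    if url in seen:
        return seen  # already visited, e.g. via a self link
    links = fetch(url)["links"]
    seen[url] = [link["rel"] for link in links]
    for link in links:
        if is_index(link["href"]):
            discover(link["href"], fetch, seen)
    return seen


# In a real client, fetch would perform an HTTP GET and parse the JSON body.
api_map = discover(BASE_URL, FAKE_SERVER.__getitem__)
```

Only the base URL is known up front; everything else in api_map comes from following links.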

Summary

So now we have looked at some basic concepts and tools that we can use to bring HATEOAS to our REST API. To summarize the highlights:

  • Use of links to support discoverability of the API
  • URI templates for showing the client where to put user input
  • The concept of commands to convey information about posted data
  • Using "trailing slashes" for URLs that describe the API

I have tried to keep these examples as simple and close to the "real world" as possible, and hopefully you can take some of these concepts and apply them to your own APIs. Or you can use them as food for thought and expand on them to suit your needs better.



----------------------
*: Actually, your API is not a true REST API until it is on level 4. Before that, it is an HTTP-based API. This is also what Roy Fielding thinks, but that is a different discussion and we are trying to stay pragmatic here, so I will use the term REST API in this post to refer to any API that has reached at least level 3.

2 comments:

  1. Do you think we've got it right?

    We're missing the rel links for GET/POST etc. though.

    http://www.surevoip.co.uk/support/wiki/api_documentation

    Replies
    1. Hi Gavin!

      I didn't have time to look at it in detail but your API looks very thorough and promising! I also like the webhooks plans you guys have.

      Just out of curiosity, may I ask what tools/technology stack you have used to develop this? If you don't mind me asking and if you can share that kind of information.
