Openapi-specification: Proposal: Hypermedia and linking between resources

Created on 10 May 2016 · 18 Comments · Source: OAI/OpenAPI-Specification

Bitbucket's API uses HAL-style links to make things more discoverable. An example of this is how a repository object embeds a link to its list of pull requests, and a pull request object links to its comments and reviewers.

Embedding links has a number of disadvantages though.

Limitations of Custom Links in Schema Objects

As our API grows, so does the number of associated resources and by extension the number of links embedded in every response, which has a compute and bandwidth cost. The extensive list embedded in our repository object reflects this:

  "links": {
    "watchers": {
      "href": "https://api.bitbucket.org/2.0/repositories/evzijst/interruptingcow/watchers"
    },
    "branches": {
      "href": "https://api.bitbucket.org/2.0/repositories/evzijst/interruptingcow/refs/branches"
    },
    "tags": {
      "href": "https://api.bitbucket.org/2.0/repositories/evzijst/interruptingcow/refs/tags"
    },
    "commits": {
      "href": "https://api.bitbucket.org/2.0/repositories/evzijst/interruptingcow/commits"
    },
    "clone": [
      {
        "href": "https://bitbucket.org/evzijst/interruptingcow.git",
        "name": "https"
      },
      {
        "href": "ssh://git@bitbucket.org/evzijst/interruptingcow.git",
        "name": "ssh"
      }
    ],
    "self": {
      "href": "https://api.bitbucket.org/2.0/repositories/evzijst/interruptingcow"
    },
    "html": {
      "href": "https://bitbucket.org/evzijst/interruptingcow"
    },
    "avatar": {
      "href": "https://bitbucket.org/evzijst/interruptingcow/avatar/32/"
    },
    "hooks": {
      "href": "https://api.bitbucket.org/2.0/repositories/evzijst/interruptingcow/hooks"
    },
    "forks": {
      "href": "https://api.bitbucket.org/2.0/repositories/evzijst/interruptingcow/forks"
    },
    "downloads": {
      "href": "https://api.bitbucket.org/2.0/repositories/evzijst/interruptingcow/downloads"
    },
    "pullrequests": {
      "href": "https://api.bitbucket.org/2.0/repositories/evzijst/interruptingcow/pullrequests"
    }
  }

Another downside is that the purpose of these links is invisible to Swagger clients. This means that even though a repository links to its pull requests, a Swagger code generator would not automatically add a method .getPullRequests() to the generated Repository class.

Adding Hyper Linking to Open API

Support to express relationships between resources could be added to Open API in a number of ways:

1. Formalize the notation of links embedded in schema objects
2. Instead of link standardization, declare the resource relationships

Option 1 would make it possible to declare the shape and purpose of the links in the Open API schema so that code generators could take advantage of them at runtime. However, it would not reduce the overhead of embedding.

Option 2 is akin to the approach followed by the JSON Hyper-Schema draft. It would eliminate the need to embed any link information at runtime, while still allowing code generators to be aware of the resource relationships and generate appropriate functions.
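To make the code-generation benefit concrete, here is a minimal sketch in Python of how a generator might turn declared relationships into method stubs. It assumes, hypothetically, that each schema object carries a "links" list whose entries have a "rel" and an "href.operation"; the function name render_class_stub and the stub layout are illustrative only and are not part of Swagger or any existing generator.

```python
from typing import Any, Dict


def render_class_stub(name: str, schema: Dict[str, Any]) -> str:
    """Render Python source for a class with one method per declared link."""
    lines = [f"class {name.capitalize()}:"]
    for link in schema.get("links", []):
        rel = link["rel"]
        operation = link["href"]["operation"]
        lines.append(f"    def {rel}(self):")
        lines.append(f"        # A real generator would call operation {operation!r} here.")
        lines.append("        raise NotImplementedError")
    if len(lines) == 1:
        # No links declared: emit an empty class body.
        lines.append("    pass")
    return "\n".join(lines)


# Hypothetical schema fragment with one declared relationship.
repository_schema = {
    "type": "object",
    "links": [
        {"rel": "pullrequests", "href": {"operation": "getPullRequestsByRepository"}},
    ],
}

stub = render_class_stub("repository", repository_schema)
print(stub)
```

The sketch uses the link relation name directly as the method name, so it only works for relation names that are valid identifiers; a real generator would need a naming policy.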

Example

Following is a more thorough model showing the relationships between repositories, users and pull requests in Bitbucket's API:

{
  "paths": {
    "/2.0/users/{username}": {
      "parameters": [
        {
          "name": "username",
          "type": "string",
          "in": "path"
        }
      ],
      "get": {
        "operationId": "getUserByName",
        "responses": {
          "200": {
            "schema": {
              "$ref": "#/definitions/user"
            }
          }
        }
      }
    },
    "/2.0/repositories/{username}/{slug}": {
      "parameters": [
        {
          "name": "username",
          "type": "string",
          "in": "path"
        },
        {
          "name": "slug",
          "type": "string",
          "in": "path"
        }
      ],
      "get": {
        "operationId": "getRepository",
        "responses": {
          "200": {
            "schema": {
              "$ref": "#/definitions/repository"
            }
          }
        }
      }
    },
    "/2.0/repositories/{username}/{slug}/pullrequests": {
      "parameters": [
        {
          "name": "username",
          "type": "string",
          "in": "path"
        },
        {
          "name": "slug",
          "type": "string",
          "in": "path"
        },
        {
          "name": "state",
          "type": "string",
          "enum": ["open", "merged", "declined"],
          "in": "query"
        }
      ],
      "get": {
        "operationId": "getPullRequestsByRepository",
        "responses": {
          "200": {
            "schema": {
              "type": "array",
              "items": {
                "$ref": "#/definitions/pullrequest"
              }
            }
          }
        }
      }
    },
    "/2.0/repositories/{username}/{slug}/pullrequests/{pid}": {
      "parameters": [
        {
          "name": "username",
          "type": "string",
          "in": "path"
        },
        {
          "name": "slug",
          "type": "string",
          "in": "path"
        },
        {
          "name": "pid",
          "type": "string",
          "in": "path"
        }
      ],
      "get": {
        "operationId": "getPullRequestById",
        "responses": {
          "200": {
            "schema": {
              "$ref": "#/definitions/pullrequest"
            }
          }
        }
      }
    },
    "/2.0/repositories/{username}/{slug}/pullrequests/{pid}/merge": {
      "parameters": [
        {
          "name": "username",
          "type": "string",
          "in": "path"
        },
        {
          "name": "slug",
          "type": "string",
          "in": "path"
        },
        {
          "name": "pid",
          "type": "string",
          "in": "path"
        }
      ],
      "post": {
        "operationId": "mergePullRequest",
        "responses": {
          "204": {}
        }
      }
    }
  },
  "definitions": {
    "user": {
      "type": "object",
      "properties": {
        "username": {
          "type": "string"
        },
        "uuid": {
          "type": "string"
        }
      },
      "links": [
        {
          "rel": "repositories",
          "href": {
            "operation": "getRepositoriesByOwner",
            "parameters": {
              "username": "{username}"
            }
          }
        }
      ]
    },
    "repository": {
      "type": "object",
      "properties": {
        "slug": {
          "type": "string"
        },
        "owner": {
          "$ref": "#/definitions/user"
        }
      },
      "links": [
        {
          "rel": "self",
          "href": {
            "operation": "getRepository",
            "params": {
              "username": "{owner/username}",
              "slug": "{slug}"
            }
          }
        },
        {
          "rel": "pullrequests",
          "href": {
            "operation": "getPullRequestsByRepository",
            "params": {
              "username": "{owner/username}",
              "slug": "{slug}",
              "state": "open"
            }
          }
        }
      ]
    },
    "pullrequest": {
      "type": "object",
      "properties": {
        "id": {
          "type": "integer"
        },
        "title": {
          "type": "string"
        },
        "repository": {
          "$ref": "#/definitions/repository"
        },
        "author": {
          "$ref": "#/definitions/user"
        }
      },
      "links": [
        {
          "rel": "self",
          "href": {
            "operation": "getPullRequestById",
            "parameters": {
              "username": "{repository/owner/username}",
              "slug": "{repository/slug}",
              "pid": "{id}"
            }
          }
        },
        {
          "rel": "merge",
          "href": {
            "operation": "mergePullRequest",
            "parameters": {
              "username": "{repository/owner/username}",
              "slug": "{repository/slug}",
              "pid": "{id}"
            }
          }
        }
      ]
    }
  }
}
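
The parameter templates in the "links" entries above (for example "{repository/owner/username}") can be read as slash-separated paths into the response object. The following Python sketch shows one possible way a client could bind them. The helper names resolve_path, bind_link_parameters and build_url, and the sample pull request with id 42, are assumptions made for illustration; the proposal itself does not define a resolution algorithm.

```python
import re
from typing import Any, Dict


def resolve_path(resource: Dict[str, Any], path: str) -> Any:
    """Walk a slash-separated path such as 'repository/owner/username'."""
    value: Any = resource
    for key in path.split("/"):
        value = value[key]
    return value


def bind_link_parameters(resource: Dict[str, Any], link: Dict[str, Any]) -> Dict[str, str]:
    """Replace each {path} template in a link with the value found in the resource."""
    href = link["href"]
    # The example uses both "parameters" and "params"; accept either spelling.
    templates = href.get("parameters", href.get("params", {}))
    return {
        name: re.sub(
            r"\{([^{}]+)\}",
            lambda match: str(resolve_path(resource, match.group(1))),
            template,
        )
        for name, template in templates.items()
    }


def build_url(path_template: str, arguments: Dict[str, str]) -> str:
    """Fill the path template of the target operation with the bound arguments."""
    return path_template.format(**arguments)


# Hypothetical response body for a pull request.
pull_request = {
    "id": 42,
    "title": "Example",
    "repository": {"slug": "interruptingcow", "owner": {"username": "evzijst"}},
}
self_link = {
    "rel": "self",
    "href": {
        "operation": "getPullRequestById",
        "parameters": {
            "username": "{repository/owner/username}",
            "slug": "{repository/slug}",
            "pid": "{id}",
        },
    },
}

arguments = bind_link_parameters(pull_request, self_link)
url = build_url("/2.0/repositories/{username}/{slug}/pullrequests/{pid}", arguments)
print(url)  # /2.0/repositories/evzijst/interruptingcow/pullrequests/42
```

A literal value without braces, such as "open" for the state parameter, passes through unchanged.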

erikvanzijst

👍7

Most helpful comment

Nice proposal. I agree dynamic runtime links are a better option. That is central to our HATEOAS design: the API provides links (affordances) only if an operation is available to the (authenticated) client. Not allowed to delete a resource? Then the API does not return the rel : delete link. At the beginning of a paginated collection? Then omit the rel : prev link.
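
As a rough illustration of that affordance rule, the Python sketch below emits a link only when the corresponding action is available. The function names, the permission flags and the "page" query parameter are hypothetical and do not describe any particular API.

```python
from typing import Dict

Links = Dict[str, Dict[str, str]]


def resource_links(href: str, can_update: bool, can_delete: bool) -> Links:
    """Return links for a single resource, omitting actions the client may not perform."""
    links: Links = {"self": {"href": href}}
    if can_update:
        links["update"] = {"href": href}
    if can_delete:
        links["delete"] = {"href": href}
    return links


def page_links(collection_href: str, page: int, last_page: int) -> Links:
    """Return pagination links, omitting prev on the first page and next on the last."""
    links: Links = {"self": {"href": f"{collection_href}?page={page}"}}
    if page > 1:
        links["prev"] = {"href": f"{collection_href}?page={page - 1}"}
    if page < last_page:
        links["next"] = {"href": f"{collection_href}?page={page + 1}"}
    return links


read_only = resource_links("/things/1", can_update=False, can_delete=False)
first_page = page_links("/things", page=1, last_page=3)
print(sorted(read_only), sorted(first_page))
```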

I like the ability to refer to an operation by operation id. As noted in other comments, however, this alone does not allow for links to resources outside of the API.

I also suggest a top-level link-strategy annotation in OAS which tells what link representation the API uses (HAL, Collection+JSON, Atom, etc.). This is probably just a hint, but tools like Swagger UI could use it to allow navigating the resources by parsing the links in the responses and matching them to the operations. This would promote a HATEOAS understanding of hypermedia APIs, rather than low-level method+url tight coupling. See my comment on #577. (So I guess I disagree with @ePaul, because I think embedding the link relations in OAS _is_ the way to go: it reveals the intended use of the API. That is, the client should be looking for links, and the embedded href values in the links, in order to act on responses, rather than relying on hard-coded transitions based on static operation+path+parameters, which is all one can do with Swagger 2.0.)

Please elaborate on what params is: are they parameters for the link, or are they bindings that are substituted for the parameters on the target link? If the latter, I think a more appropriate name would be args or arguments (as described on SO).

Finally, some of the ideas in #445 might be useful. In our APIs, we repeat common sets of links in multiple places - for example, most simple resources have rel : self, rel : update and rel : delete -- that is a (parameterized) set of links, usually differing only in the request or response content type; collections all share next, prev, first and last links, and so on. So, as #445 proposes reusable parameter groups, I think reusable link groups may also be useful. In this case, however, the reference to other operations would be more implicit via operation+path rather than explicit via an operation id.

DavidBiesack on 11 May 2016

👍2

All 18 comments

Parent issue: #586

erikvanzijst on 10 May 2016

This is very similar to my proposal. My goal was to re-use existing constructs and introduce as few new concepts as possible.

dilipkrish on 11 May 2016

Yeah, there are clearly a lot of overlapping ideas.

However, some of the other discussions are around parsing and interpreting links embedded in response objects at runtime. The suggestion here aims to make HATEOAS/HAL redundant, slim down response documents, and keep all the necessary information in the schema.

erikvanzijst on 11 May 2016

Potential issues:

After discriminator, this is yet another extension to Swagger's SchemaObject not present in JSON Schema
No real attention has been paid to links to external URLs
It's unclear how well this could support pagination links

Pagination is of particular interest, as it is a concept notoriously ill-defined and lacking in Swagger. APIs often have custom ways of dealing with it. Two common approaches stand out:

Wrapping the response array in an envelope containing things like total size, page number, next and previous links (Bitbucket)
Returning a pure array and moving the pagination properties and links to response headers (GitHub)

In case of the former, the response contains the page number and so it might be possible to derive the pagination links from those values, but in case of the latter, the pagination properties lie beyond the schema.
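
The two styles can be contrasted with a short Python sketch. The envelope field names page, pagelen and size are assumptions about the envelope layout, and the header parser only handles the simple case where rel is the first link parameter, so treat this as an illustration rather than a complete implementation of either API.

```python
import re
from typing import Any, Dict, Optional


def next_page_from_envelope(envelope: Dict[str, Any]) -> Optional[int]:
    """Envelope style: derive the next page number from values inside the body."""
    page, pagelen, size = envelope["page"], envelope["pagelen"], envelope["size"]
    return page + 1 if page * pagelen < size else None


def next_url_from_link_header(header: str) -> Optional[str]:
    """Header style: the next link lives in a Link response header, outside the body schema."""
    for match in re.finditer(r'<([^>]+)>\s*;\s*rel="([^"]+)"', header):
        if match.group(2) == "next":
            return match.group(1)
    return None


# Hypothetical sample data for both styles.
envelope = {"page": 1, "pagelen": 10, "size": 25, "values": []}
header = (
    '<https://api.example.com/items?page=2>; rel="next", '
    '<https://api.example.com/items?page=5>; rel="last"'
)

print(next_page_from_envelope(envelope))
print(next_url_from_link_header(header))
```

In the first function everything needed is part of the response schema; in the second, the input is a header value that a schema-level link declaration cannot see.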

erikvanzijst on 11 May 2016

👍1

I guess the Bitbucket API is an extreme example, where all the links are just predictably built from the statically visible API, and not changing depending on other information (like status of resources). For cases like those I guess having the links here helps just for exploring the API, but they are not really used productively, and then putting them into the swagger API description instead might be more useful.

In general I think "get rid of HATEOAS by putting all the relevant links inside the OpenAPI schema" is not the way to go: we still need links, and a way to describe how to use them. (This means that this proposal is orthogonal to #577, not really an alternative to it.)

ePaul on 11 May 2016

I like the ability to refer to an operation by operation id. As noted in other comments, however, this alone does not allow for links to resources outside of the API.

DavidBiesack on 11 May 2016

👍2

So first off, this is great, and I think a very good proposal. I have a couple thoughts that we can cover on the TDC call:

Links may be out of place in a model definition. I'm not sure yet but it may be wise to make them part of the response definition rather than the model schema. The benefit of being in the schema is you are effectively helping make the model more actionable, and without some of the assumptions that come with other model-driven approaches. The reference to the operationId in the model will be difficult for people centralizing their models for reuse across an organization.
Do we want to handle references outside the definition? If so, we'll need a different locator than operationId. I'm not convinced we do, but if so, we'd want to provide both a full URL and a schema reference. The operationId may be troublesome as people typically use it for code generation, and often will "change it" to look nice in client libraries.
I do think this helps with pagination, because the response object can have a schema-specified cursor ID/locator and a way to navigate via links. I think it doesn't help with modeling the response object itself, but that may be out of scope for this proposal.
I would love to see if there's some way to gracefully handle access to the links. If a user can access the read methods but not write, is there some way that it can be made clear in the response object?

Overall this is very good. I'm looking forward to talking through open questions and getting it merged 👍

fehguy on 20 May 2016

👍 to getting this merged! I'd like to suggest a couple of changes, based on a similar proposal I made.

From a tooling perspective, I've seen that relying on operationId is problematic. I would rather have a reference to a path/http-method, and maybe even a media type, combination to dereference the operation.
Secondly, _assuming a link represents a reference to an operation_, I'd rather link to a path explicitly than redefine the parameters etc. inline within a link.
Lastly, I'd like links to be a top-level construct for documentation and not be tied to/nested only within models.

For example, in the current proposal:

"/2.0/repositories/{username}/{slug}/pullrequests/{pid}": {
      "parameters": [
        {
          "name": "username",
          "type": "string",
          "in": "path"
        },
        {
          "name": "slug",
          "type": "string",
          "in": "path"
        },
        {
          "name": "pid",
          "type": "string",
          "in": "path"
        }
      ],
      "get": {
        "operationId": "getPullRequestById",
        "responses": {
          "200": {
            "schema": {
              "$ref": "#/definitions/pullrequest"
            }
          }
        }
      }
    },
...
  "pullrequest": {
      "links": [ <-- Make these top level
        { 
          "rel": "self", 
          "href": {
            "operation": "getPullRequestById",
            "parameters": { //<-- eliminate redefinition of parameters ^  above ^
              "username": "{repository/owner/username}",
              "slug": "{repository/slug}",
              "pid": "{id}"
            }
          }
        },

I _propose_ that we make links first class citizens of OAS.

{
    "links": {
        "pullRequests": {
            "method": "GET",
            "contentType": "application/json",
            "$pathRef": "/2.0/repositories/{username}/{slug}/pullrequests/{pid}"
        }
    },
    "paths": {
        "/2.0/repositories/{username}/{slug}/pullrequests/{pid}": {
            "get": {
                "description": "",
                "operationId": "getPullRequestById",
                "parameters": [],
                "responses": {}
            }
        }
    },
    "definitions": {
        "pullrequest": {
            ....
            "links": {
                "self": { <-- link to global link definition
                    "$ref": "#/links/pullRequests"
                },
                "pullRequests": { <-- Inlined link definition
                    "method": "GET",
                    "contentType": "application/json",
                    "$pathRef": "/2.0/repositories/{username}/{slug}/pullrequests/{pid}"
                },
                ....
            }
        }
    }
}

Note that here the pull request node has two links: one that refers to the global link definitions and one that is inlined. One caveat with this approach is that it has _refs_ all over the place, for example $pathRef and $ref. We need to think these through to be consistent with how the rest of the specification is defined with respect to references.

The other nice thing about this approach is that we can also link to definitions of link relations that may not even be described by OAS. For example, a given link relationship might already be defined by IANA.
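
The lookup suggested in this comment, by path and HTTP method rather than by operationId, could be sketched in Python as follows. The $pathRef and method keys come from the example above; the function name dereference_link and the way a $ref is resolved against a top-level "links" object are assumptions for illustration.

```python
from typing import Any, Dict


def dereference_link(spec: Dict[str, Any], link: Dict[str, Any]) -> Dict[str, Any]:
    """Return the operation object a link points at, using path + method."""
    if "$ref" in link:
        # Take the last segment of a reference such as "#/links/pullRequests".
        name = link["$ref"].rsplit("/", 1)[-1]
        link = spec["links"][name]
    path_item = spec["paths"][link["$pathRef"]]
    return path_item[link["method"].lower()]


# Hypothetical, trimmed-down document following the shape shown above.
PULL_REQUEST_PATH = "/2.0/repositories/{username}/{slug}/pullrequests/{pid}"
spec = {
    "links": {
        "pullRequests": {
            "method": "GET",
            "contentType": "application/json",
            "$pathRef": PULL_REQUEST_PATH,
        }
    },
    "paths": {
        PULL_REQUEST_PATH: {
            "get": {"operationId": "getPullRequestById", "responses": {}}
        }
    },
}

by_reference = dereference_link(spec, {"$ref": "#/links/pullRequests"})
inline = dereference_link(spec, {"method": "GET", "$pathRef": PULL_REQUEST_PATH})
print(by_reference["operationId"])
```

Because the lookup key is the path and method, renaming an operationId for nicer client code does not break the link.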

dilipkrish on 20 May 2016

👍1

Copying @mpnally.

whitlockjc on 21 May 2016

I think one of the challenges we have here is that I am seeing two completely distinct goals being discussed.

1) I believe @erikvanzijst's original proposal is trying to define the relationships between resources so that it is no longer necessary to include the links in response bodies. This is a trade-off that enables smaller and simpler response bodies at the cost of being able to dynamically define links. When links are used just to represent relationships within a data model, static links are often sufficient.

2) My understanding is that @dilipkrish's suggestion is an attempt to describe the metadata needed to follow a link, so that when a link is included in a response body, a client has the necessary knowledge to dereference it. This provides a static description of the link interaction but doesn't necessarily describe in advance where links may be available.

It is possible that we can find a solution that will support both scenarios, but that solution may be more complex than it would be if it supported only one.

darrelmiller on 23 May 2016

For "metadata needed to follow a link", a proposal is in #577.

ePaul on 23 May 2016

👍1

@ePaul I re-read through your proposal for the notion of an interface. I realize now that your interface is effectively what I want to be able to do, I just wouldn't call it interface. I actually don't make the distinction you do between link relation type and interface. To use your example, I see sibling and partner as just a subclass of the person type. I also realize that my perspective is not very common. I also think your analogy of it being a typed function is a good one. I've used the term delegate before, as that is what is used in .NET languages to describe a function signature.

darrelmiller on 23 May 2016

@darrelmiller IMO they do complement each other. I'm certainly not opposed to having the resources describe the relationships to other links; in fact it's useful if we know that a resource always comes with certain relationships we already know about at design time.

To sum up, what I was suggesting was:

Make links a first-class construct (a peer of paths/definitions in OAS). Similar to the way we share model schemas, we should be able to share link semantics as well, and use externally defined link semantics such as those defined by IANA.
In 99.9% of the use cases that I can think of, links dereference an HTTP operation, in other words an OAS "Path". Let us design this construct so that there is no duplication of parameter definitions, templates etc. (example illustrated ☝️) and defer to the _Path_ to fill in the blanks.
Make links dereference paths using a combination of _url, http method, media type_ rather than operationId.
When resources define links to other resources, as suggested by @erikvanzijst, allow the relationship to be described inline or to reference a global _link definition_ within the OAS document or an external document. The media type of the link description can determine whether the description is an OAS specification, an ALPS document, or a link to an IANA link relation description.

The one caveat I can think of when describing resource relationships is: how does one define the resource model with its links and relationships in a media-type-agnostic way? For example, HAL refers to links as _links, while JsonApi, Siren etc. have different ways to describe the same concept. I suspect this is the complexity that @darrelmiller is referring to.

dilipkrish on 24 May 2016

At Medallia we are using swagger to define our internal APIs with a spec-first approach (where the spec serves as a contract between FE and BE teams and both work on the implementations in parallel). We wanted to include hypermedia information in the spec for two main reasons:

1. Be able to list all the potential links a given resource would provide.
2. Be able to define the schema you would obtain when you hit a given link.

Here's an example of the "x-links" extension we're using:

  user:
    type: object
    properties:
      id:
        type: string
      username:
        type: string
    ...
    x-links:
      role:
        schema: '#/definitions/role'

It mainly adds 1. and 2. in a custom extension (x-links) to an object definition.
For now we have a preprocessor that expands this extension to a vanilla swagger spec using the HAL hypermedia format. The schema is only added as documentation in the final spec but it's also leveraged at code generation time.
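
A preprocessor of this kind might look roughly like the Python sketch below. This is not the actual tool described in this comment: the function name expand_x_links, the generated _links property layout and the description text are all assumptions, chosen only to show how an x-links extension could be expanded into a plain HAL-style schema.

```python
import copy
from typing import Any, Dict


def expand_x_links(definition: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a definition with x-links expanded into a HAL-style _links property."""
    expanded = copy.deepcopy(definition)
    x_links = expanded.pop("x-links", {})
    if not x_links:
        return expanded
    link_properties = {}
    for rel, target in x_links.items():
        link_properties[rel] = {
            "type": "object",
            # The target schema is kept as documentation only.
            "description": f"Link to a resource described by {target['schema']}",
            "properties": {"href": {"type": "string"}},
        }
    expanded.setdefault("properties", {})["_links"] = {
        "type": "object",
        "properties": link_properties,
    }
    return expanded


# The user definition from the example above, written as a Python dict.
user_definition = {
    "type": "object",
    "properties": {"id": {"type": "string"}, "username": {"type": "string"}},
    "x-links": {"role": {"schema": "#/definitions/role"}},
}

vanilla = expand_x_links(user_definition)
print(sorted(vanilla["properties"]))
```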

We considered adding a reference to a path operationId to be able to auto-generate the link but we decided it was taking the approach too far and not a requirement for now.

So in our case we tie this information to the resource definition, mainly for the same reason @dilipkrish describes.

We wanted to share our current solution for hypermedia support, similar to the one described here, to give you guys another data point of possible use cases. We really look forward to seeing native support in the next version.

erdody on 27 May 2016

👍1

@erdody Interesting approach. At Apigee we have used a somewhat analogous approach in a project called Rapier. Rather than extend OpenAPI, Rapier just extends JSON Schema, and then generates OpenAPI from that. Our current generator uses the JSON schema definition unaltered in the generated OpenAPI, so it would be the author's responsibility to lay out the JSON Schema in HAL format (which is doable, if a bit tedious). Alternatively, you could extend the generator to produce a HAL layout (or Siren or other) from a simpler schema definition.

mpnally on 27 May 2016

@fehguy you mentioned "the TDC call" -- what/when is that? I don't recall seeing any announcements/emails or subscriptions for TDC calls.

DavidBiesack on 1 Jun 2016

To me, @mpnally's approach sounds reasonable: instead of trying to solve everything in OpenAPI, separate the resource-centric view of it, and add a hypermedia layer on top of it. In order to support proper hypermedia use cases, those links would have to be discoverable at runtime (instead of being static). If only static links are supported, that seems to be an edge case that doesn't support more ambitious affordance-based hypermedia API designs.