OpenAPI-Specification: Version 3.0: additional formats

Created on 21 Mar 2016 · 25 Comments · Source: OAI/OpenAPI-Specification

As there are several issues proposing new formats, here is a list showing the possible future full picture:

| Common Name | Type | Format | Comments |
| --- | --- | --- | --- |
| octet/(unsigned) byte | integer | uint8 | new: unsigned 8 bits |
| signed byte | integer | int8 | new: signed 8 bits |
| short | integer | int16 | new: signed 16 bits |
| integer | integer | int32 | signed 32 bits |
| long | integer | int64 | signed 64 bits |
| big integer | integer | | |
| float/single | number | float | |
| double | number | double | |
| decimal | number | decimal | new: decimal floating-point number, recipient-side internal representation as a binary floating-point number may lead to rounding errors |
| big decimal | number | | |
| string | string | | |
| byte | string | byte | base64 encoded characters |
| url-safe binary | string | base64url | new: base64url encoded characters - #606 |
| binary | string | binary | any sequence of octets |
| boolean | boolean | | |
| date | string | date | As defined by full-date - RFC3339 |
| dateTime | string | date-time | As defined by date-time - RFC3339 |
| time (of day) | string | time | new: As defined by partial-time - RFC3339 - #358 |
| duration | string | duration | new: As defined by xs:dayTimeDuration - XML Schema 1.1 - #359 |
| uuid | string | uuid | new: Universally Unique Identifier (UUID) RFC4122 |
| password | string | password | Used to hint UIs the input needs to be obscured. |
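
The integer rows of the table imply concrete value ranges. The following is a minimal sketch of those ranges, assuming two's-complement bounds for the signed formats; the helper names `integer_range` and `fits` are hypothetical and not part of any specification or tool.

```python
# Bit widths and signedness for the integer formats listed in the table.
# The format names come from the table; everything else is illustrative.
INTEGER_FORMAT_BITS = {
    "uint8": (8, False),
    "int8": (8, True),
    "int16": (16, True),
    "int32": (32, True),
    "int64": (64, True),
}


def integer_range(fmt):
    """Return (minimum, maximum) for a known integer format, else None."""
    if fmt not in INTEGER_FORMAT_BITS:
        # No format (the "big integer" row) or an unknown one: no range limit.
        return None
    bits, signed = INTEGER_FORMAT_BITS[fmt]
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


def fits(value, fmt=None):
    """Check an integer value against the range implied by its format."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    bounds = integer_range(fmt)
    return bounds is None or bounds[0] <= value <= bounds[1]
```

With this sketch, `fits(255, "uint8")` holds while `fits(256, "uint8")` does not, and an integer without a format is accepted at any magnitude.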

Documentation OpenAPI.Next Proposal Schema

ralfhandl

👍14

All 25 comments

:+1:

amarzavery on 21 Mar 2016

The format modifier is optional. Please add integer with no format -- not confined to 32/64 bit (a.k.a. BigInteger) and also number without format which is also not constrained to floating point 32/64 bit (a.k.a. BigDecimal).

DavidBiesack on 21 Mar 2016

Rather than tightly couple to uuid format, I suggest just a generic id format that means _an opaque identifier string_. Whether an ID is a UUID, a hash, a database primary key, or something else seems more like an implementation detail that should be hidden from the API specification. Many APIs make extensive use of id parameters/members but do not over-specify them as UUID strings (often, because they are not UUIDs - look at bit.ly hashes for example.)

Slightly related, I would prefer not overloading format with what is really _role_ or _attribute_. While Swagger 2.0 has password, it is not a _format_ but a _role_ that is orthogonal to format. Ditto for other PII like social security number, government id number, etc. (A more generic role for these might be masked which is a UI hint.)

Thus, while uuid is a format, id (if it were to replace uuid) would be a _role_, not a format.

DavidBiesack on 21 Mar 2016

@DavidBiesack I actually intended uuid as a format, i.e. a string that has the pattern

uuid = 8HEXDIG "-" 4HEXDIG "-" 4HEXDIG "-" 4HEXDIG "-" 12HEXDIG

No objections to adding the concept of a _role_ and an id role. Still would require a format uuid.

ralfhandl on 22 Mar 2016
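
The uuid pattern quoted above translates fairly directly into a regular expression. This is only a sketch: it assumes that lowercase hex digits are acceptable (ABNF string literals are case-insensitive), and the function name `is_uuid_string` is hypothetical.

```python
import re

# uuid = 8HEXDIG "-" 4HEXDIG "-" 4HEXDIG "-" 4HEXDIG "-" 12HEXDIG
UUID_PATTERN = re.compile(
    r"[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}"
)


def is_uuid_string(value):
    """Return True if value is a string matching the quoted uuid pattern."""
    # fullmatch rejects leading or trailing characters, including a newline.
    return isinstance(value, str) and UUID_PATTERN.fullmatch(value) is not None
```

A check like this only constrains the shape of the string; it says nothing about UUID versions or variants.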

@ralfhandl Such things should be added into JSON Schema standard or extracted into a separate spec.
I can imagine a situation where JSON Schema Draft 5 adds a format with the same name but different validation rules.
Another problem arises when you want, for example, to validate your DB or user input on the client side. For that purpose you would use pure JSON Schema, and it would be strange to use OpenAPI-specific types there.
And lastly, tooling support for such formats would be limited to OpenAPI tools only.

I'm not against extending formats; I'm just saying that the spec should reference some external document rather than define them internally.
IMHO, OpenAPI should describe API-specific stuff and reuse existing data validation specs.

IvanGoncharov on 22 Mar 2016

@IvanGoncharov Looking at the list of formats supported by Swagger 2.0 we only find one format that is defined by JSON Schema: date-time. The proposed new formats are in line with the existing Swagger-specific formats, so adding them would not break new ground.

Formats are an explicit extension point of JSON Schema for semantic validation, and the OpenAPI Specification could be one of the "authoritative resources that accurately describes interoperable semantic validation".

I'm not aware of other external documents describing formats for semantic validation in JSON Schema.

ralfhandl on 22 Mar 2016

I oppose requiring id strings to be UUID. (I'm not sure if that was what you meant by _"Still would require a format uuid."_)

As noted, I also think it is _not_ a good idea to expose internal implementation details such as UUID format in an API definition. id strings (path parameters, query parameters, fields) should be no more than opaque string IDs. Over-specifying them as UUID is fragile and does not allow for non-breaking changes if the underlying implementation changes. Again, look at the bit.ly API which uses hash ID strings, not UUIDs.

DavidBiesack on 22 Mar 2016

I don't require id strings to be UUIDs, I only require uuid strings to be UUIDs. I see the string format uuid similar to the string format date-time - as a validation rule that restricts the allowed / possible values of a string parameter or property. It tells the client that some string values will be accepted, and others will be refused.

As you pointed out above the concept of an "id" is a role and not a format. So we should introduce this new concept in a new, specific way and not mix it up with format.

Could it be that your concept of an "id" is related to the concept of a "primary key"? See #587

ralfhandl on 22 Mar 2016

What do you expect to gain by formally supporting more formats? I realize the format could play a role in code generation, mock data generation, validation and potentially more, so I figured I'd ask. I also ask because while writing Swagger tooling in the past, custom formats were easy to support without OpenAPI/Swagger being involved, especially since OpenAPI/Swagger does not dictate or limit which formats you can/cannot use.

Here are a few examples of Node.js code registering custom JSON Schema formats for various reasons:

For mock data generation: https://github.com/apigee-127/sway/blob/master/lib/validation/format-generators.js
For document, request and response validation: https://github.com/apigee-127/sway/blob/master/lib/validation/format-validators.js

whitlockjc on 22 Mar 2016
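
The idea of tools registering custom formats on their own can be sketched as a small registry. This is not the API of sway or of any JSON Schema library; `FORMAT_VALIDATORS`, `register_format`, and `validate_string` are hypothetical names, and the `time` check only tests the shape of an RFC3339 partial-time without range-checking hours, minutes, or seconds.

```python
import re

# Hypothetical registry: format name -> predicate on a string value.
FORMAT_VALIDATORS = {}


def register_format(name, predicate):
    """Register a predicate that decides whether a string has the given format."""
    FORMAT_VALIDATORS[name] = predicate


def validate_string(value, fmt=None):
    """Validate a value declared as type: string with an optional format."""
    if not isinstance(value, str):
        return False
    check = FORMAT_VALIDATORS.get(fmt)
    # Unknown or absent formats are not an error: only the type is enforced.
    return True if check is None else bool(check(value))


# Shape-only check for hh:mm:ss with optional fractional seconds.
register_format(
    "time",
    lambda s: re.fullmatch(r"[0-9]{2}:[0-9]{2}:[0-9]{2}(\.[0-9]+)?", s) is not None,
)
```

In this sketch a string with an unregistered format still validates as a plain string, which is why custom formats work without any change to the specification.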

Thanks, @ralfhandl for confirming -- makes sense for uuid to be an (optional) format, and id to be a role.

To answer your second question, an id _may_ be a primary key, or there may be a mapping between the two. I want the resource and representation to remain decoupled from the implementation. I'll quote Mike Amundsen:

_"Your storage model is not your object model is not your resource model is not your representation model."_

DavidBiesack on 22 Mar 2016

"Your storage model is not your object model is not your resource model is not your representation model."

:+1:

whitlockjc on 22 Mar 2016

@whitlockjc Code generation, mock data generation, validation, easier use of tools that know these formats out-of-the-box, better interoperability due to common agreement on what is e.g. a time or duration, ...

There seems to be demand for more pre-defined formats, see #358, #359, #606, and https://github.com/json-schema/json-schema/wiki/%22format%22-suggestions.

ralfhandl on 23 Mar 2016

We are currently using type: number, format: decimal for money values (to make it explicit that these ought to not be mapped to some binary floating point number). Not sure if this needs standardizing.

ePaul on 23 Mar 2016
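
The concern about binary floating point for money values can be illustrated with Python's standard `decimal` module. This is a sketch of the rounding issue only; the `amount` field name and the sample payload are made up for illustration.

```python
import json
from decimal import Decimal

# Binary floating point cannot represent 0.10 or 0.20 exactly,
# so the sum differs from the literal 0.30.
float_total = 0.10 + 0.20
float_is_exact = float_total == 0.30

# Decimal arithmetic keeps the decimal digits as written.
decimal_total = Decimal("0.10") + Decimal("0.20")
decimal_is_exact = decimal_total == Decimal("0.30")

# A recipient can avoid the binary representation entirely by parsing
# JSON numbers with a fractional part as Decimal.
payload = json.loads('{"amount": 19.99}', parse_float=Decimal)
```

Here `float_is_exact` is false while `decimal_is_exact` is true, and `payload["amount"]` arrives as a `Decimal` rather than a binary float.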

@ePaul We came up with the same solution for numeric values with decimal mantissa when mapping primitive types to JSON Schema types and formats. If we can find a third person who did this, it's a pattern :-)

We also intended to add a precision extension keyword in our JSON Schema representation for conveying the length of the decimal mantissa, e.g. precision: 34 for a 128-bit decimal floating-point type.

Is that something that you'd also find useful?

ralfhandl on 23 Mar 2016

Some of our customers needed decimal format for specifying monetary values. Hence we ended up supporting format decimal in our project AutoRest.

amarzavery on 23 Mar 2016

I think we're in agreement that there are many people using many formats outside of the documented ones. The question is whether this belongs in the OpenAPI specification as some sort of _"formal support"_ or whether this is a tooling problem. It could very well be both.

I will tag this appropriately so we can discuss.

whitlockjc on 23 Mar 2016

For code generation we need well-defined formats. Since the Swagger spec defines the REST API, it becomes a contract that server and client need to abide by. It is always nice if your contract is explicit about everything.

Just making an analogy to make my point:
Imagine leasing a house where the contract has many loose ends left for the owner and tenant to interpret as per their choice. This wouldn't be a good scenario.

amarzavery on 23 Mar 2016

It seems that type _must_ be a constrained type. format can be interpreted by the code generation. For example:

type: string
format: uuid

_may_ fall back to String if UUID is not supported. But you cannot invent a type.

If that's not the mentality, then we must constrain all formats to a fixed set, which may be hard to support inside the OAI.

fehguy on 23 Mar 2016
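
The fallback behavior described above, where a code generator uses a richer type for a format it knows and otherwise falls back to the primitive type, might look like the sketch below. The target type names are Java-flavored assumptions, and `PRIMITIVE_TYPES`, `FORMAT_TYPES`, and `target_type` are hypothetical, not taken from any real generator.

```python
# The fixed set of primitive types; a generator cannot invent new ones.
PRIMITIVE_TYPES = {
    "string": "String",
    "integer": "BigInteger",
    "number": "BigDecimal",
    "boolean": "Boolean",
}

# Formats this particular (imaginary) generator knows how to refine.
FORMAT_TYPES = {
    ("string", "uuid"): "UUID",
    ("string", "date-time"): "OffsetDateTime",
    ("integer", "int32"): "Integer",
    ("integer", "int64"): "Long",
}


def target_type(schema_type, fmt=None):
    """Pick a target-language type, falling back to the primitive type."""
    if schema_type not in PRIMITIVE_TYPES:
        raise ValueError(f"unknown type: {schema_type!r}")
    # An unsupported or missing format degrades to the primitive type.
    return FORMAT_TYPES.get((schema_type, fmt), PRIMITIVE_TYPES[schema_type])
```

An unsupported format degrades gracefully in this sketch, whereas an unknown type is rejected outright, mirroring the distinction between type and format made in the comment.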

👍5

Code generation, like validation, is a tooling matter to me and does not necessarily need OpenAPI changes, for the reasons I mentioned above. But one thing I just thought of that could make supporting this make sense would be if OpenAPI wanted to dictate a minimum set of formats all tools must support. I could see that being useful.

whitlockjc on 23 Mar 2016

@whitlockjc yes, and with a fallback to primitive types, if not supported. I don't think we should be inventing types.

In general, if we specify a format, we should dictate exactly what that is supposed to be. If a user expects a different behavior from a defined format, well, that violates the spec.

So... to make this concrete:

type: string
format: uuid

Should have a very specific format defined in the spec, specifically what @DavidBiesack mentioned:

uuid = 8HEXDIG "-" 4HEXDIG "-" 4HEXDIG "-" 4HEXDIG "-" 12HEXDIG

type: string
format: date-time

quite specifically says RFC3339 format (https://github.com/OAI/OpenAPI-Specification/blob/master/versions/2.0.md#data-types)

If I want to invent tonys-date-time then pretty much no tools will know what the heck to do with it, and would fall back to type: string.

fehguy on 23 Mar 2016

👍8

@ralfhandl What about instances where one wants to specify not the precision, which is equivalent to the number of significant digits (correct?), but the scale, i.e. the number of digits following the decimal point? I believe the scale is more appropriate for fixed-point arithmetic and the precision is more appropriate for arbitrary-precision arithmetic. Sanity check, does any of this make sense?

mspiegel on 7 Jul 2016

@mspiegel This absolutely makes sense to me. A complete description of a decimal data type needs two facets:

precision - the maximum number of significant decimal digits in the mantissa
scale - the maximum number of decimal digits to the right of the decimal point - may be specified as variable

This covers the SQL data type DECIMAL(p,s) - precision: p, scale: s - as well as decimal floating-point types such as DECFLOAT34 - precision: 34, scale: variable.

I'd love to have both precision and scale as new keywords for specifying numeric types in addition to the existing minimum, maximum, and multipleOf, see #602.

ralfhandl on 12 Jul 2016
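
The two facets can be checked against a value with Python's `decimal` module. This is a sketch under stated assumptions: precision and scale are not existing keywords, `fits_decimal` is a hypothetical helper, a fixed scale is interpreted the way SQL DECIMAL(p,s) is (at most p - s digits before the decimal point), and `scale=None` stands for a variable scale.

```python
from decimal import Decimal


def fits_decimal(value, precision, scale=None):
    """Check a Decimal against a precision and an optional fixed scale."""
    if not value.is_finite():
        return False
    _, digits, exponent = value.as_tuple()
    if scale is None:
        # Variable scale: count significant digits, ignoring trailing zeros.
        significant = len("".join(map(str, digits)).rstrip("0")) or 1
        return significant <= precision
    # Fixed scale: digits after the point, and digits before the point.
    fraction_digits = max(0, -exponent)
    integer_part = int(abs(value))
    integer_digits = len(str(integer_part)) if integer_part else 0
    return fraction_digits <= scale and integer_digits <= precision - scale
```

For example, `Decimal("123.45")` fits precision 5 with scale 2, while `Decimal("1234.5")` does not, because it needs four digits before the decimal point.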

Tackling PR: #741