As there are several issues proposing new formats, here is a list of what the full picture could look like in the future:
| Common Name | Type | Format | Comments |
| --- | --- | --- | --- |
| octet/(unsigned) byte | integer | uint8 | new: unsigned 8 bits |
| signed byte | integer | int8 | new: signed 8 bits |
| short | integer | int16 | new: signed 16 bits |
| integer | integer | int32 | signed 32 bits |
| long | integer | int64 | signed 64 bits |
| big integer | integer | | |
| float/single | number | float | |
| double | number | double | |
| decimal | number | decimal | new: decimal floating-point number, recipient-side internal representation as a binary floating-point number may lead to rounding errors |
| big decimal | number | | |
| string | string | | |
| byte | string | byte | base64 encoded characters |
| url-safe binary | string | base64url | new: base64url encoded characters - #606 |
| binary | string | binary | any sequence of octets |
| boolean | boolean | | |
| date | string | date | As defined by full-date - RFC3339 |
| dateTime | string | date-time | As defined by date-time - RFC3339 |
| time (of day) | string | time | new: As defined by partial-time - RFC3339 - #358 |
| duration | string | duration | new: As defined by xs:dayTimeDuration - XML Schema 1.1 - #359 |
| uuid | string | uuid | new: Universally Unique Identifier (UUID) RFC4122 |
| password | string | password | Used to hint UIs the input needs to be obscured. |
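The integer rows of the table imply value ranges, which can be sketched as a small range check. This is only an illustration: the names `INTEGER_FORMAT_RANGES` and `fits_integer_format` are hypothetical, and treating a missing or unknown format as unbounded is an assumption, not something the table specifies.

```python
# Hypothetical lookup table: value ranges implied by the integer formats above.
INTEGER_FORMAT_RANGES = {
    "uint8": (0, 2**8 - 1),
    "int8": (-(2**7), 2**7 - 1),
    "int16": (-(2**15), 2**15 - 1),
    "int32": (-(2**31), 2**31 - 1),
    "int64": (-(2**63), 2**63 - 1),
}


def fits_integer_format(value, fmt=None):
    """Return True if value is an integer within the range implied by fmt."""
    # bool is a subclass of int in Python, but a JSON boolean is not an integer.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    bounds = INTEGER_FORMAT_RANGES.get(fmt)
    if bounds is None:
        # Assumption: no format (or an unknown one) means an unbounded integer.
        return True
    low, high = bounds
    return low <= value <= high
```

With this sketch, `fits_integer_format(256, "uint8")` is rejected while `fits_integer_format(2**100)` is accepted as a "big integer".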
:+1:
The format modifier is optional. Please add integer with no format -- not confined to 32/64 bit (a.k.a. BigInteger) and also number without format which is also not constrained to floating point 32/64 bit (a.k.a. BigDecimal).
Rather than tightly couple to uuid format, I suggest just a generic id format that means _an opaque identifier string_. Whether an ID is a UUID, a hash, a database primary key, or something else seems more like an implementation detail that should be hidden from the API specification. Many APIs make extensive use of id parameters/members but do not over-specify them as UUID strings (often because they are not UUIDs; look at bit.ly hashes, for example).
Slightly related, I would prefer not overloading format with what is really _role_ or _attribute_. While Swagger 2.0 has password, it is not a _format_ but a _role_ that is orthogonal to format. Ditto for other PII like social security number, government id number, etc. (A more generic role for these might be masked which is a UI hint.)
Thus, while uuid is a format, id (if it were to replace uuid) is a _role_, not a format.
@DavidBiesack I actually intended uuid as a format, i.e. a string that has the pattern
uuid = 8HEXDIG "-" 4HEXDIG "-" 4HEXDIG "-" 4HEXDIG "-" 12HEXDIG
No objections to adding the concept of a _role_ and an id role. Still would require a format uuid.
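The pattern above translates fairly directly into a regular expression. The following is a minimal sketch, not a proposed normative definition; the names `UUID_PATTERN` and `is_uuid_string` are hypothetical, and accepting lowercase hex digits is an assumption based on ABNF string literals being case-insensitive.

```python
import re

# 8HEXDIG "-" 4HEXDIG "-" 4HEXDIG "-" 4HEXDIG "-" 12HEXDIG
# Assumption: both upper- and lowercase hex digits are accepted.
UUID_PATTERN = re.compile(
    r"[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}"
)


def is_uuid_string(value):
    """Return True if value is a string matching the uuid pattern exactly."""
    # fullmatch requires the whole string to match, so trailing characters fail.
    return isinstance(value, str) and UUID_PATTERN.fullmatch(value) is not None
```

An opaque identifier such as a short hash fails this check, which is the point of keeping uuid as an optional format rather than a requirement for all id strings.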
@ralfhandl Such things should be added into JSON Schema standard or extracted into a separate spec.
I can imagine a situation where JSON Schema Draft 5 adds a format with the same name but different validation rules.
Another problem arises when, for example, you want to validate your DB or user input on the client side. For that purpose you would use pure JSON Schema, and it would be strange to use OpenAPI-specific types there.
Lastly, tooling support for such formats will be limited to OpenAPI tools only.
I'm not against extending formats; I'm just saying that the spec should reference some external document rather than define them internally.
IMHO, OpenAPI should describe API-specific stuff and reuse existing data validation specs.
@IvanGoncharov Looking at the list of formats supported by Swagger 2.0, we find only one format that is defined by JSON Schema: date-time. The proposed new formats are in line with the existing Swagger-specific formats, so adding them would not break new ground.
Formats are an explicit extension point of JSON Schema for semantic validation, and the OpenAPI Specification could be one of the "authoritative resources that accurately describes interoperable semantic validation".
I'm not aware of other external documents describing formats for semantic validation in JSON Schema.
I oppose requiring id strings to be UUID. (I'm not sure if that was what you meant by _"Still would require a format uuid."_)
As noted, I also think it is _not_ a good idea to expose internal implementation details such as UUID format in an API definition. id strings (path parameters, query parameters, fields) should be no more than opaque string IDs. Over-specifying them as UUID is fragile and does not allow for non-breaking changes if the underlying implementation changes. Again, look at the bit.ly API which uses hash ID strings, not UUIDs.
I don't require id strings to be UUIDs, I only require uuid strings to be UUIDs. I see the string format uuid similar to the string format date-time - as a validation rule that restricts the allowed / possible values of a string parameter or property. It tells the client that some string values will be accepted, and others will be refused.
As you pointed out above the concept of an "id" is a role and not a format. So we should introduce this new concept in a new, specific way and not mix it up with format.
Could it be that your concept of an "id" is related to the concept of a "primary key"? See #587
What do you expect to gain by formally supporting more formats? I realize the format could play a role in code generation, mock data generation, validation and potentially more, so I figured I'd ask. I also ask because while writing Swagger tooling in the past, custom formats were easy to support without OpenAPI/Swagger being involved, especially since OpenAPI/Swagger does not dictate or limit which formats you can/cannot use.
Here are a few examples of Node.js code registering custom JSON Schema formats for various reasons:
Thanks, @ralfhandl for confirming -- makes sense for uuid to be an (optional) format, and id to be a role.
To answer your second question, an id _may_ be a primary key, or there may be a mapping between the two. I want the resource and representation to remain decoupled from the implementation. I'll quote Mike Amundsen:
_"Your storage model is not your object model is not your resource model is not your representation model."_
:+1:
@whitlockjc Code generation, mock data generation, validation, easier use of tools that know these formats out-of-the-box, better interoperability due to common agreement on what is e.g. a time or duration, ...
There seems to be demand for more pre-defined formats, see #358, #359, #606, and https://github.com/json-schema/json-schema/wiki/%22format%22-suggestions.
We are currently using type: number, format: decimal for money values (to make it explicit that these should not be mapped to a binary floating-point number). Not sure if this needs standardizing.
@ePaul We came up with the same solution for numeric values with decimal mantissa when mapping primitive types to JSON Schema types and formats. If we can find a third person who did this, it's a pattern :-)
We also intended to add a precision extension keyword in our JSON Schema representation for conveying the length of the decimal mantissa, e.g. precision: 34 for a 128-bit decimal floating-point type.
Is that something that you'd also find useful?
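The rounding concern behind format: decimal can be illustrated with Python's standard library. This is a sketch under assumptions: the payload and its field names `amount` and `fee` are made up, and mapping precision: 34 to a `decimal.Context` with `prec=34` is just one possible reading of the proposed keyword.

```python
import json
from decimal import Context, Decimal

# Made-up payload with two monetary values.
payload = '{"amount": 0.1, "fee": 0.2}'

# Default parsing maps JSON numbers with a fraction to binary floats.
as_float = json.loads(payload)
float_total = as_float["amount"] + as_float["fee"]

# parse_float=Decimal keeps the decimal digits exactly as written on the wire.
as_decimal = json.loads(payload, parse_float=Decimal)
decimal_total = as_decimal["amount"] + as_decimal["fee"]

# One reading of "precision: 34": arithmetic with 34 significant decimal digits.
decimal34 = Context(prec=34)
one_third = decimal34.divide(Decimal(1), Decimal(3))
```

Here `float_total` is not equal to `0.3`, while `decimal_total` equals `Decimal("0.3")`, which is why a recipient that honors the format would avoid the binary representation.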
Some of our customers needed decimal format for specifying monetary values. Hence we ended up supporting format decimal in our project AutoRest.
I think we're in agreement that there are many people using many formats outside of the documented ones. The question is whether this belongs in the OpenAPI specification as some sort of _"formal support"_ or whether this is a tooling problem. It could very well be both.
I will tag this appropriately so we can discuss.
For code generation we need well defined formats. Since swagger spec defines the REST API, it becomes a contract that server and client need to abide by. It is always nice if your contract is explicit about everything.
Just making an analogy to make my point:
Imagine leasing a house where the contract has many loose ends left for the owner and tenant to interpret as per their choice. This wouldn't be a good scenario.
It seems that type _must_ be a constrained type, while format can be interpreted by code generation. For example:
type: string
format: uuid
_may_ fall back to String if UUID is not supported. But you cannot invent a type.
If that's not the mentality, then we must constrain all formats to a fixed set, which may be hard to support inside the OAI.
Code generation, like validation, is a tooling concern to me and does not necessarily need OpenAPI changes, for the reasons I mentioned above. But one thing I just thought of that could justify supporting this would be if OpenAPI wanted to dictate a minimum set of formats that all tools must support. I could see that being useful.
@whitlockjc yes, and with a fallback to primitive types, if not supported. I don't think we should be inventing types.
In general, if we specify a format, we should dictate exactly what that is supposed to be. If a user expects a different behavior from a defined format, well, that violates the spec.
So... to make this concrete:
type: string
format: uuid
Should have a very specific format defined in the spec, specifically what @DavidBiesack mentioned:
uuid = 8HEXDIG "-" 4HEXDIG "-" 4HEXDIG "-" 4HEXDIG "-" 12HEXDIG
type: string
format: date-time
quite specifically says RFC3339 format (https://github.com/OAI/OpenAPI-Specification/blob/master/versions/2.0.md#data-types)
If I want to invent tonys-date-time then pretty much no tools will know what the heck to do with it, and would fall back to type: string.
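The fallback behavior described here can be sketched as a small lookup: known formats get a richer type, anything else stays a plain string. The registry `KNOWN_STRING_FORMATS` and the function `decode_string` are hypothetical names, and the standard-library converters used are more lenient than a strict spec definition would be (for example, `uuid.UUID` also accepts strings without hyphens).

```python
import uuid
from datetime import date

# Hypothetical registry of formats this particular tool understands.
# Note: these converters are more lenient than the patterns discussed above.
KNOWN_STRING_FORMATS = {
    "uuid": uuid.UUID,
    "date": date.fromisoformat,
}


def decode_string(value, fmt=None):
    """Convert a string according to fmt, falling back to the plain string."""
    converter = KNOWN_STRING_FORMATS.get(fmt)
    if converter is None:
        # Unknown or missing format: fall back to the primitive type.
        return value
    # A known format with an invalid value raises ValueError.
    return converter(value)
```

A value declared with format: tonys-date-time comes back unchanged as a string, while format: uuid yields a `uuid.UUID` instance.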
@ralfhandl what about instances where one wants to specify not the precision, which is equivalent to the number of significant digits (correct?), but one wants to specify the scale, or the number of digits following the decimal point. I believe the scale is more appropriate for fixed-point arithmetic and the precision is more appropriate for arbitrary-precision arithmetic. Sanity check, does any of this make sense?
@mspiegel This absolutely makes sense to me, a complete description of a decimal data type needs two facets:
- precision - the maximum number of significant decimal digits in the mantissa
- scale - the maximum number of decimal digits to the right of the decimal point - may be specified as variable

This covers the SQL data type DECIMAL(p,s) - precision: p, scale: s - as well as decimal floating-point types such as DECFLOAT34 - precision: 34, scale: variable.
I'd love to have both precision and scale as new keywords for specifying numeric types in addition to the existing minimum, maximum, and multipleOf, see #602.
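To make the two facets concrete, here is a rough sketch of how a validator might check them. The function name `fits_decimal` is hypothetical, `scale=None` stands in for "variable", and ignoring trailing zeros when counting digits is an assumption rather than part of the proposal.

```python
from decimal import Decimal


def fits_decimal(value, precision, scale=None):
    """Check a Decimal against precision and an optional fixed scale.

    scale=None models a variable scale (decimal floating point).
    """
    if not value.is_finite():
        return False  # NaN and infinities have no digit count
    if not value:
        return True  # zero fits any precision and scale
    _, digits, exponent = value.as_tuple()
    digits = list(digits)
    # Assumption: trailing zeros do not count, so 1.50 is treated like 1.5.
    while len(digits) > 1 and digits[-1] == 0:
        digits.pop()
        exponent += 1
    if scale is None:
        # Variable scale: only the number of significant digits is limited.
        return len(digits) <= precision
    fractional_digits = max(0, -exponent)
    integer_digits = max(0, len(digits) + exponent)
    # Fixed scale, as in DECIMAL(p,s): s fractional and p - s integer digits.
    return fractional_digits <= scale and integer_digits <= precision - scale
```

For example, `fits_decimal(Decimal("123.45"), 5, 2)` holds, while `Decimal("1234.5")` does not fit the same DECIMAL(5,2) shape because it needs four integer digits.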
Tackling PR: #741
I think null is missing from this list
Closing this in favor of #845.