Following the discussion at the last TSC meeting about Traits, Overlays and/or Mixins, I think anyone can agree that these features are strongly oriented toward extensibility.
To keep things as simple as possible for implementers, I want to propose the idea of having two levels of the specification, which can be defined as follows:
This will allow tool implementers to focus on:
This approach makes it possible to:
This can help tool implementers embrace OpenAPI 3.0 faster when targeting (2) or (1).
It's a divide-and-conquer strategy, that's it.
What do you think?
@pjmolina , I think it's a good idea. We have already discussed the idea of making overlays a separate specification, and I think this is the prevailing direction of the TSC.
But even today, we find that there are code generators, documentation formats, test consoles, etc. that do not correctly handle some features of OpenAPI 2.0 and 3.0. Common stumbling blocks include:
- $ref properties

We have our own KaiZen OpenAPI Normalizer to smooth out these problems for reliable downstream processing. It's not a trivial operation, and functionality like this is only going to get more important as we start adding traits, overlays, alternative schemas, and other features.
Having Extended and Canonical forms more clearly defines the role that tools like Normalizer can fulfill, as translators from Extended to Canonical form. And it removes a significant barrier to adoption of new OpenAPI versions.
Most OpenAPI usage is read-only. _Consumers_ of OpenAPI only need to _read_ and _comprehend_ the API document; they won't care about how it has been composed internally. If we can separate roles, so that OpenAPI consumers don't have to be responsible for piecing together the API description from its constituent parts, I think that would be a big win.
You nailed it @tedepstein ! It looks like we have experienced the same kind of pain. ;-)
How will the canonical form represent circular schemas in a document if all $refs are to have been resolved? JSON has no mechanism to support this, and we ban the use of the related YAML features.
Fair point @MikeRalphson :
Example: _Recursive and circular references in Schema Types._
Any other use cases where circular refs could be a problem?
@pjmolina , @MikeRalphson , here are some excerpts from the Normalizer docs that explain how this works:
When the normalizer encounters any reference, there are two ways it may process the reference:
Inline
The normalizer retrieves the referenced value (e.g. the Pet schema definition object) and replaces the reference itself with that value.
Localize
The normalizer first adds the referenced object to the normalized spec that it is creating, if it is not already present, and then replaces the reference with a local reference to that object. So in the external reference example shown above, the Pet schema definition would appear directly in the OpenAPI spec produced by the normalizer, and references that were formerly external references would become local references.
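As a rough illustration of the "localize" step described above (not taken from the Normalizer's code), here is a minimal Python sketch. The function name `localize`, the in-memory `external_docs` mapping, and the file name `pets.yaml` are all assumptions made for the example. It ignores name clashes, JSON Pointer escaping, and any references nested inside the copied object.

```python
import copy

def localize(node, root, external_docs):
    """Rewrite external $refs under `node` as local refs, copying targets into `root`.

    `external_docs` is a hypothetical mapping of file name -> parsed document.
    """
    if isinstance(node, list):
        for item in node:
            localize(item, root, external_docs)
        return
    if not isinstance(node, dict):
        return
    ref = node.get("$ref")
    if isinstance(ref, str) and "#/" in ref and not ref.startswith("#"):
        file_name, pointer = ref.split("#/", 1)
        keys = pointer.split("/")
        target = external_docs[file_name]
        holder = root
        # Walk both documents in parallel, creating missing containers in root.
        for key in keys[:-1]:
            target = target[key]
            holder = holder.setdefault(key, {})
        # Add the referenced object only if it is not already present.
        if keys[-1] not in holder:
            holder[keys[-1]] = copy.deepcopy(target[keys[-1]])
        node["$ref"] = "#/" + pointer
        return
    # Snapshot the values, because root may gain keys while we walk it.
    for value in list(node.values()):
        localize(value, root, external_docs)

external_docs = {
    "pets.yaml": {"components": {"schemas": {"Pet": {"type": "object"}}}}
}
spec = {"pet": {"$ref": "pets.yaml#/components/schemas/Pet"}}
localize(spec, spec, external_docs)
```

After the call, `spec` carries its own copy of the Pet schema under `components/schemas`, and the former external reference points at that copy.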
(snip)
Recursive References
It is possible to set up recursive schema definitions in OpenAPI specs, through the use of references. For example, consider the following schema:

```yaml
matriarch:
  $ref: "#/components/schemas/Person"
...
components:
  schemas:
    Person:
      type: object
      properties:
        name:
          type: string
        children:
          $ref: "#/components/schemas/People"
    People:
      type: array
      items:
        $ref: "#/components/schemas/Person"
```

The Person schema has a children property of type People, and the People schema defines an array of Person objects. Naively attempting to inline a reference to a Person object would lead to a never-ending expansion...

To handle recursive references encountered during inlining, the normalizer stops inlining whenever a reference is encountered that is fully contained within another (inlined) instance of the referenced object. That recursive reference is localized rather than being inlined. In the above example, we would end up with something like this: _partially-inlined_

```yaml
matriarch:
  type: object
  properties:
    name:
      type: string
    children:
      type: array
      items:
        $ref: "#/components/schemas/Person"
...
components:
  schemas:
    Person:
      type: object
      properties:
        name:
          type: string
        children:
          type: array
          items:
            $ref: "#/components/schemas/Person"
    ...
```

Here we see:
- that the top-level reference to Person as the type of the matriarch property was inlined;
- that the recursive reference to Person encountered while performing this inlining has been localized;
- that the Person schema itself was subjected to inlining, with localization of its recursive reference.
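The stopping rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Normalizer's actual implementation: the names `lookup` and `inline` are hypothetical, only local references are handled, and JSON Pointer escaping is ignored. The `active` set tracks which references are currently being expanded; a reference already in that set is left as a local reference instead of being inlined again.

```python
def lookup(doc, ref):
    """Follow a local reference such as '#/components/schemas/Person'."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local references are supported: {ref}")
    node = doc
    for key in ref[2:].split("/"):
        node = node[key]
    return node

def inline(node, doc, active=frozenset()):
    """Return a copy of `node` with references inlined, stopping at recursion."""
    if isinstance(node, list):
        return [inline(item, doc, active) for item in node]
    if not isinstance(node, dict):
        return node
    ref = node.get("$ref")
    if isinstance(ref, str):
        if ref in active:
            # We are already inside an inlined copy of this object:
            # keep the reference local rather than expanding forever.
            return {"$ref": ref}
        return inline(lookup(doc, ref), doc, active | {ref})
    return {key: inline(value, doc, active) for key, value in node.items()}

PERSON = "#/components/schemas/Person"
doc = {
    "matriarch": {"$ref": PERSON},
    "components": {"schemas": {
        "Person": {"type": "object", "properties": {
            "name": {"type": "string"},
            "children": {"$ref": "#/components/schemas/People"},
        }},
        "People": {"type": "array", "items": {"$ref": PERSON}},
    }},
}

matriarch = inline(doc["matriarch"], doc)
# A schema definition is treated as already "inside" itself.
person = inline(doc["components"]["schemas"]["Person"], doc, frozenset({PERSON}))
```

With this input, both `matriarch` and `person` come out as an object whose `children` property is an array with `items` still pointing at `#/components/schemas/Person`, which matches the partially-inlined result shown above.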
There are other details of the algorithm for handling name clashes. There's also a somewhat misguided distinction between "conforming" vs. "non-conforming" references, which we're planning to eliminate in a future revision. So I would not propose the KaiZen Normalizer documentation, in its current form, as a baseline spec for Canonical Form.
But depending on our goals for Canonical Form, we may not need to specify the algorithm to this level of detail. Maybe it's sufficient to say that Canonical Form just means:
Different processors could accomplish this in different ways, and Canonical Form does not guarantee that the output will always be exactly the same, regardless of which processor you use.
OpenAPI consumers would still need to be able to resolve _local_ references, expressed as JSON pointers within the document. And they would still need to deal with the possibility of recursive references. But they wouldn't need to deal with those other levels of complexity or general fussiness in the OpenAPI spec.
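To give a sense of how small that remaining burden could be, here is a minimal Python sketch of resolving a local reference as a JSON Pointer (RFC 6901). The function name `resolve_local_ref` is an assumption for this example. A consumer that resolves references lazily, only when it actually follows one, never expands a recursive schema infinitely.

```python
from urllib.parse import unquote

def resolve_local_ref(document, ref):
    """Resolve a local reference such as '#/components/schemas/Person'."""
    if not ref.startswith("#"):
        raise ValueError(f"not a local reference: {ref}")
    pointer = unquote(ref[1:])  # the URI fragment may be percent-encoded
    if pointer == "":
        return document  # '#' refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError(f"not a JSON Pointer: {ref}")
    node = document
    for token in pointer[1:].split("/"):
        # RFC 6901 unescaping: '~1' becomes '/', then '~0' becomes '~'.
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(node, list):
            node = node[int(token)]
        else:
            node = node[token]
    return node

api = {
    "paths": {"/pets": {"get": {"operationId": "listPets"}}},
    "components": {"schemas": {"Person": {"type": "object"}}},
}
person_schema = resolve_local_ref(api, "#/components/schemas/Person")
list_pets = resolve_local_ref(api, "#/paths/~1pets/get")
```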
The more I think about this, the more I'm convinced that it's critical to the success of the OpenAPI ecosystem. I would go so far as to say that we _should not_ introduce traits, a.k.a. mixins (#1843), into the OpenAPI spec unless we also define a canonical or simplified form.
Anecdotal evidence: OpenAPI 3.0 adoption took much longer than we hoped. Developers were waiting for tools and platform support; tool and platform providers were waiting for demand to reach critical mass; and there was no "killer app" to drive the ecosystem to OAS v3.
You could argue that OpenAPI 3.0 was different, because 3.0-to-2.0 conversions, which might have facilitated adoption by OAS consumers, were inherently lossy and therefore not a practical solution. By contrast, traits can be resolved by a preprocessor with no information loss, and we could just let the open source community build those preprocessors.
You could also argue that, whatever complexities might exist in OpenAPI, we can leave it to the open source community to build preprocessors like KaiZen OpenAPI Normalizer and others. We don't need to formalize it in the spec.
But I think these arguments fail to address the economics of the situation.
OpenAPI _consumers_ are a broad category that includes documentation formats, test consoles, code generators, API gateways and API management platforms, among others. OpenAPI _producers_ are a much smaller category that includes editors, code-first frameworks, design tools, and maybe a few others.
If I'm an OpenAPI consumer looking at a new release of the OpenAPI spec, my goal is to support that new release and advertise that support, with minimum effort. If it's difficult for me to support a new feature like traits (and it will be difficult), I have a few options:
The first two options are obviously not very attractive. The third option might seem fine. But consider what this means:
That's a big enough barrier to almost guarantee slow adoption of OpenAPI 3.1.
Now, if OpenAPI 3.1 officially defines a Canonical Form, even in very simple terms, it changes the economics pretty dramatically for me as an OpenAPI consumer:
Not that I've heard anyone raise a strong objection to this yet. But I think this is a simple and powerful way to reduce friction in the OpenAPI ecosystem.
> Different processors could accomplish this in different ways, and Canonical Form does not guarantee that the output will always be exactly the same, regardless of which processor you use.
I believe we would be creating problems for ourselves and tooling authors if we did not specify (with examples) exactly how overlays, traits/mixins and $refs should be resolved, to a truly canonical form whereby each conforming tool produces exactly the same output when canonicalizing the same input. See for example https://en.wikipedia.org/wiki/Canonical_XML
True, the semantics of the Extended Form should generate a unique Canonical Form.
Moreover, we can provide a Test Suite to illustrate the expected input + expected output.
If the consensus is that we should go for this level of specificity, I don't object.
My position is that OpenAPI is already in need of a simplified or normalized form, whether or not it's strict enough to be called a Canonical Form. And that we should not introduce traits unless we also provide this.
If a simplified form is on the critical path to traits, as I believe it should be, I just want to make sure we have enough time to do it. I would rather have a "simplified form" done in time for a 3.1 release than a "canonical form" still in progress.