Website: Document the unpublished APIs

Created on 15 Sep 2020 · 31 Comments · Source: kubernetes/website

This is a Feature Request

What would you like to be added

There are a few APIs that are not published as RESTful resources. Here are some examples:

The kubelet configuration: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/apis/config/types.go#L75
The kube-scheduler configuration: https://github.com/kubernetes/kubernetes/blob/master/pkg/scheduler/apis/config/types.go#L55
The kube-controller-manager configuration: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/apis/config/types.go#L49
The kube-proxy configuration: https://github.com/kubernetes/kubernetes/blob/master/pkg/proxy/apis/config/types.go#L108
The ABAC policy: https://github.com/kubernetes/kubernetes/blob/master/pkg/apis/abac/types.go#L26
The ImageReview policy: https://github.com/kubernetes/kubernetes/blob/master/pkg/apis/imagepolicy/types.go#L26

These definitions may or may not have API paths associated with them. Simply generating documentation for the struct definitions would be very helpful.

Why is this needed

These types are not designed to be operated through the API server using CRUD operations. However, an administrator needs to understand their definitions in order to better configure and manage a Kubernetes cluster.

Comments

This needs some tooling effort to parse the Go source file. The tool can go to kubernetes-sigs/reference-docs repo.

kind/feature triage/accepted

Source

tengqm


All 31 comments

xref: #23885, #23837, #21109, #21558, https://github.com/kubernetes/kubernetes/issues/84081

tengqm on 15 Sep 2020

/kind feature

@tengqm does it make sense to treat this as an umbrella issue?

Also potentially relevant (as an aside only) to #23809

sftim on 15 Sep 2020

@sftim I think this one is an umbrella issue. #23809 is not very relevant because the "unpublished" APIs don't have swagger JSON files.

tengqm on 15 Sep 2020

👍2

/triage accepted

sftim on 8 Oct 2020

For this umbrella issue, I would include the gRPC socket that serves the Pod Resources API / also see https://github.com/kubernetes/kubernetes/pull/92165

What do you think @tengqm ? Is this the right home for tracking that?

sftim on 12 Oct 2020

@sftim I don't have a strong opinion on where the API docs are served. Maybe just some simple markdown tables (or HTML tables) capturing the comments in the source code would be a good starting point.

tengqm on 12 Oct 2020

Hi @tengqm . I need to read through the changes.

Can you elaborate on this comment:

These types are not designed to be operated through the API server using CRUD operations. However, an administrator needs to understand their definitions in order to better configure and manage a Kubernetes cluster.

How does an administrator use the content (component flags, client code)?

kbhawkey on 12 Nov 2020

@kbhawkey See this section: https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/#create-the-config-file

When we are trying to tell admins how to create/customize a kubelet config file, we are forcing them to read Go code. The configuration file is nothing more than JSON data. See the need for this kind of reference?
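To make the point concrete, here is a minimal sketch of how such a config file maps onto a Go struct through its json tags. `KubeletConfigSubset` and `DecodeStrict` are hypothetical names, and the struct is only a trimmed-down stand-in modeled on the linked page, not the real kubelet type. Without a reference, the only way to learn the accepted keys is to read the tags in the Go source.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// KubeletConfigSubset is a hypothetical, trimmed-down stand-in for a
// component configuration type. The field set is an assumption made for
// illustration; it is not the real kubelet definition.
type KubeletConfigSubset struct {
	APIVersion   string            `json:"apiVersion"`
	Kind         string            `json:"kind"`
	EvictionHard map[string]string `json:"evictionHard,omitempty"`
}

// DecodeStrict decodes JSON config data and rejects keys that the struct
// does not declare, so a wrong key surfaces as an error.
func DecodeStrict(data []byte) (KubeletConfigSubset, error) {
	var cfg KubeletConfigSubset
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	err := dec.Decode(&cfg)
	return cfg, err
}

func main() {
	good := []byte(`{"apiVersion":"kubelet.config.k8s.io/v1beta1","kind":"KubeletConfiguration","evictionHard":{"memory.available":"200Mi"}}`)
	cfg, err := DecodeStrict(good)
	fmt.Println(cfg.EvictionHard["memory.available"], err)

	// "evictionLimits" is not declared on the struct, so decoding fails.
	bad := []byte(`{"kind":"KubeletConfiguration","evictionLimits":{}}`)
	_, err = DecodeStrict(bad)
	fmt.Println(err)
}
```

The key names an admin has to type are exactly the json tag names, which is the information a generated reference page would surface.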

tengqm on 13 Nov 2020

This still feels like a really useful improvement.

sftim on 13 Nov 2020

Hi, @alculquicondor
What do you think about the generated scheduler reference documentation,
https://tengqm.github.io/doc/kube-scheduler-config.v1.html
https://tengqm.github.io/doc/kube-scheduler-config.v1beta1.html

Are these issues related:
https://github.com/kubernetes/kubernetes/issues/88919
https://github.com/kubernetes/kubernetes/issues/95446

kbhawkey on 13 Nov 2020

Hi @tengqm . I think this information is useful ❇️ . Some questions:

Does the information represent APIs or configuration interfaces? If the information is not an API, I would not call it an API.
I'd like to see the pages generate as docs pages (front matter, common left nav, header, footer, ...) or generate as headless pages to be included in another page. Since the docs are generated from code, it makes sense to accept text changes only from the code unless there is an issue with the generator.
How often do the types change (versioning)?
The generated page includes the package name, an inline TOC (only for the resources type(s)?), and tables for the type fields and descriptions and included definitions.
Do you want to link to this type of page from docs pages or include snippets of information in the docs?
Does it make sense to create a new section for configuration?
How do these pages fit in with the component tool pages (flags)?

kbhawkey on 13 Nov 2020

hi, i do agree we should publish the missing APIs (which we also call component config API, btw).

on here i'm asking a few questions to @tengqm.
https://github.com/kubernetes/kubernetes/pull/96215#issuecomment-726809388

my primary questions are around process, since i did not understand why are we using godoc to generate these ~swagger~ docs and the technical limitations around that. godocs are targeting developer users and not admin users, but since we didn't have anything else for kubeadm, we added some authored content in there.
https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2

ideally we should use the same process for generating the swagger docs as for the core kubernetes APIs, hosted at:
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/

neolit123 on 13 Nov 2020

kubernetes/kubernetes#88919 would indeed be kube-scheduler part of this. This API is for configuring kube-scheduler at startup. It is read from a file. We also discussed some more here: kubernetes-sigs/reference-docs#138

What do you think about the generated scheduler reference documentation,

That seems directly obtained from the code, which is what we want. We just never got a contributor to work on the integration of generation scripts to the website.

alculquicondor on 13 Nov 2020

@tengqm @alculquicondor during the weekend i played with generating swagger.json from our API types and producing a page similar to:
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/

here https://github.com/kubernetes-sigs/reference-docs/issues/138 i see you've discussed the tool https://github.com/ahmetb/gen-crd-api-reference-docs. i have not tried it but everything else did not work for me, or was simply poorly documented - including kube-openapi.
EDIT: ok, so gen-crd-api-reference-docs works, but it generates a bare-bone HTML. gen-apidocs does the same but needs swagger.json.

so i wrote yet another parser for go types -> swagger.json
https://gist.github.com/neolit123/6ffe06798e3583c9535afab6f909e75f

once i had a swagger.json i tried feeding it in https://github.com/kubernetes-sigs/reference-docs/tree/master/gen-apidocs, but soon realized that gen-apidocs is making a number of hardcoded assumptions and only works for core APIs. minimal patches were required...

my vote goes for producing swagger json and adapting gen-apidocs to support any component's API and publishing it potentially at a separate page or unifying all component-config API under the same page.
if swagger.json and gen-apidocs is not something we'd like to use, i'd like to understand why?

other things we need to consider:

embedding examples in documentation is often not sufficient. while there are examples at https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/ we have a number of pages at k8s.io on how to actually use a API. we should do the same for component APIs.
the current model of kubeadm embedding authored content in its API doc.go is not something that works very well for users:
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/apis/kubeadm/v1beta1/doc.go#L22

neolit123 on 16 Nov 2020

I don't have an opinion on what tools are best. I'll leave the decision to sig-docs.

alculquicondor on 16 Nov 2020

@neolit123 Actually you don't need a swagger/OpenAPI spec for this kind of type. What we are trying to document are some configuration structures rather than fully fledged RESTful APIs. In a swagger/OpenAPI spec, people define not only the structural representations that flow between the server and the client; they also define other things such as the API path, the HTTP verbs to use, the content type to negotiate, the query strings to accept, the status code for each API, etc.

The swagger JSON file is generated by the kube-apiserver today. Behind the scenes, some preprocessing logic parses the Go source code and generates intermediate files containing a half-baked API specification. When you query an API server for the swagger spec, the API server goes through all those struct definitions and yields a JSON result for you. There is no magic. At the end of the day, it is all about gengo.

The genref tool I'm proposing skips all the unnecessary conversions and generates the HTML output directly. If needed, we can easily tune it to generate Markdown.

As for tuning kubeadm comments to generate better docs, that is a different topic. The "kubeadm embedded authored content" doesn't look very ugly if we take care of the format with some tweaks:

https://tengqm.github.io/doc/kubeadm-config.v1beta1.html

The output was generated from exactly the code in the PR I proposed to kubeadm source code.

tengqm on 17 Nov 2020

Actually you don't need a swagger/OpenAPI spec for this kind of type. What we are trying to document are some configuration structures rather than fully fledged RESTful APIs. In a swagger/OpenAPI spec, people define not only the structural representations that flow between the server and the client; they also define other things such as the API path, the HTTP verbs to use, the content type to negotiate, the query strings to accept, the status code for each API, etc.

while kubeadm currently doesn't serve anything, the components like kubelet, kube-scheduler, kube-controller-manager and kube-proxy do expose endpoints which are currently mostly undocumented. the openapi format can help with that. so i slightly disagree if we are planning to discard openapi.

The swagger JSON file is generated by the kube-apiserver today. Behind the scenes, some preprocessing logic parses the Go source code and generates intermediate files containing a half-baked API specification. When you query an API server for the swagger spec, the API server goes through all those struct definitions and yields a JSON result for you. There is no magic. At the end of the day, it is all about gengo.

yes, i figured that out. except i still cannot make the kube-openapi generate the structs.

The genref tool I'm proposing skips all the unnecessary conversions and generates the HTML output directly. If needed, we can easily tune it to generate Markdown.
As for tuning kubeadm comments to generate better docs, that is a different topic. The "kubeadm embedded authored content" doesn't look very ugly if we take care of the format with some tweaks:

the output looks nice, but i think we should close the kubeadm PR and move these changes on the side of the generation tool as mutations. also, all the authored content in the kubeadm doc.go feels like it should be moved to a MD file (this was discussed a few times already with the kubeadm maintainers)

neolit123 on 17 Nov 2020

@neolit123 Alright. If kubeadm decides to generate fully fledged OpenAPI/swagger specs, that is great. We can wait for that to happen. We have to clone the kubeadm code in order to generate the struct definitions today; that is another tech debt.

Moving content out of the doc.go into a markdown file is a good idea. I'm all for it.

tengqm on 17 Nov 2020

@kbhawkey Sorry for the late response.

Does the information represent APIs or configuration interfaces? If the information is not an API, I would not call it an API.

We cannot decide what the development teams (SIGs) want to call them. Some are actually just configuration data structures; others are exposed by the API endpoint of the component in question. For example, in kubelet, it is called a config, and the config is separated from the kubelet APIs, where most APIs are only defined for protobuf rather than RESTful clients.

I'd like to see the pages generate as docs pages (front matter, common left nav, header, footer, ...) or generate as headless pages to be included in another page. Since the docs are generated from code, it makes sense to accept text changes only from the code unless there is an issue with the generator.

That is doable and can be done easily. We have some Go templates for rendering the HTML output, just for showcase purposes. We can have templates for Markdown.

How often do the types change (versioning)?

It really depends. However, I do believe each SIG is tracking changes to these types as official APIs, I mean, they are following the versioning practices. Since these types are bound to the binaries (e.g. kubeadm, kubelet) that consume them or expose them, we can safely update the generated docs once per release cycle, just as we do for API reference.

The generated page includes the package name, an inline TOC (only for the resources type(s)?), and tables for the type fields and descriptions and included definitions.

Yes. Only (exposed) resource types are included. If another type definition Bar is referenced from Foo in the current package but Bar is not defined in the current package, we can extract it as well.

Do you want to link to this type of page from docs pages or include snippets of information in the docs?

My original thought is just links. Point readers to the rendered HTML pages rather than source code or Go docs.

Does it make sense to create a new section for configuration?

Yes, I think so. Maybe a subsection under reference.

How do these pages fit in with the component tool pages (flags)?

The overall trend is to deprecate all command line flags in favor of config files. Most of the so called "unpublished" APIs are actually about the config file format. So ... maybe we need both today so people know ... if I'm specifying this as command line flag, I should use --api-server, ... oh wait, that flag is marked as deprecated... oh I see, I should use apiServer in the configuration file which is an equivalent.

tengqm on 17 Nov 2020

@tengqm
for your comment here https://github.com/kubernetes/kubernetes/pull/96215#issuecomment-728694950

I'm fine with closing this PR. We have an issue to fix here and somewhere else. We are generating user-facing docs from source code comments, where field names must start with an upper-case character so that they are accessible outside the package. At the same time, users see a different spelling and get confused.

the tooling should just parse the json tag of the field as the "field name".
this is possible with go's AST parser. https://golang.org/pkg/go/ast/#Field
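a minimal sketch of that idea, assuming a made-up `BootstrapToken` type as input. `FieldDoc`, `jsonName` and `ExtractFields` are hypothetical helpers written for this comment; only the go/ast, go/parser, go/token, reflect and strconv calls are real standard library APIs.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"reflect"
	"strconv"
	"strings"
)

// src is a hypothetical type definition used only for illustration.
const src = `package demo

type BootstrapToken struct {
	// Expires specifies the timestamp when this token expires.
	Expires string ` + "`json:\"expires,omitempty\"`" + `
	// TTL defines the time to live for this token.
	TTL string ` + "`json:\"ttl,omitempty\"`" + `
}
`

// FieldDoc pairs a field's Go name and serialized name with its doc comment.
type FieldDoc struct {
	GoName   string
	JSONName string
	Doc      string
}

// jsonName returns the name from the field's json tag. Fields without a
// usable json name keep their Go name, which is a simplification.
func jsonName(f *ast.Field, goName string) string {
	if f.Tag == nil {
		return goName
	}
	raw, err := strconv.Unquote(f.Tag.Value)
	if err != nil {
		return goName
	}
	name := strings.Split(reflect.StructTag(raw).Get("json"), ",")[0]
	if name == "" || name == "-" {
		return goName
	}
	return name
}

// ExtractFields parses Go source and collects the named fields of every
// struct type it finds, together with their doc comments.
func ExtractFields(source string) ([]FieldDoc, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", source, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	var out []FieldDoc
	ast.Inspect(file, func(n ast.Node) bool {
		st, ok := n.(*ast.StructType)
		if !ok {
			return true
		}
		for _, f := range st.Fields.List {
			for _, id := range f.Names {
				out = append(out, FieldDoc{
					GoName:   id.Name,
					JSONName: jsonName(f, id.Name),
					Doc:      strings.TrimSpace(f.Doc.Text()),
				})
			}
		}
		return true
	})
	return out, nil
}

func main() {
	fields, err := ExtractFields(src)
	if err != nil {
		panic(err)
	}
	for _, f := range fields {
		fmt.Printf("%s: %s\n", f.JSONName, f.Doc)
	}
}
```

note that this only fixes the field name column. the comment text still says `Expires` and `TTL`, which is the spelling mismatch discussed above.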

Alright. If kubeadm decides to generate fully fledged OpenAPI/swagger specs, that is great. We can wait for that to happen. We have to clone the kubeadm code in order to generate the struct definitions today; that is another tech debt.

as mentioned above, kubeadm has the least need for OpenAPI/swagger specs. it's the other components that may want to document all of their endpoints / requests / responses.

Moving content out of the doc.go into a markdown file is a good idea. I'm all for it.

what is SIG Docs' preference?

instead of having content in doc.go file should we start adding readme.md's in API packages:
https://github.com/kubernetes/kubernetes/tree/master/cmd/kubeadm/app/apis/kubeadm/v1beta2
that include similar authored content or should we move this content to k8s.io pages?

i'm leaning towards moving this to a k8s.io page.

some history on this topic is that in the past SIG Docs objected to the kubeadm maintainers adding documentation in a doc.go file, but the API was moving too fast and it was a bit hard to maintain it at k8s.io.

cc @jimangel @fabriziopandini

neolit123 on 17 Nov 2020

@neolit123

the tooling should just parse the json tag of the field as the "field name".

The tool is doing exactly that. The problem is about the comment lines, from which the docs (including swagger spec) are generated.

Moving content out of the doc.go into a markdown file is a good idea. I'm all for it.
what is SIG Docs' preference?

Is it possible to paste that doc.go content into a markdown file and find a place for it on the website? Referencing docs on the website from source code is an acceptable practice, maybe?

tengqm on 18 Nov 2020

The tool is doing exactly that. The problem is about the comment lines, from which the docs (including swagger spec) are generated.

this is a problem here too:
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/

so instead of adapting all APIs to follow a new style, we should make the tooling post-process the comments.
and i wouldn't say it's much of a problem too.

"io.k8s.api.admissionregistration.v1.MutatingWebhook": {
      "description": "MutatingWebhook describes an admission webhook and the resources and operations it applies to.",
      "properties": {
        "admissionReviewVersions": {
          "description": "AdmissionReviewVersions is an ordered list of preferred `AdmissionReview` versions the Webhook expects. API server will try to use first version in the list which it supports. If none of the versions specified in this list supported by API server, validation will fail for this object. If a persisted webhook configuration specifies allowed versions and does not include any versions known to the API Server, calls to the webhook will fail and be subject to the failure policy.",

->
pseudo markdown:

`mutatingWebhook`: `mutatingWebhook` describes....
  `admissionReviewVersions`: `admissionReviewVersions` is an ...

Is it possible to paste that doc.go content into a markdown file and find a place for it on the website? Referencing docs on the website from source code is an acceptable practice, maybe?

we can certainly do that. wanted to see if someone has objections, this also means we have to link to the new page from kubeadm docs (also from k/kubeadm), so potentially we should make this after 1.20 releases.

neolit123 on 18 Nov 2020

@neolit123 I did consider that possibility -- having the post-processing tool make the comments more user friendly. However, it turned out to be impractical. Let me give you a few examples where we would need an NLP (natural language processing) engine to correctly figure out what the text means:

Example 1

    // Expires specifies the timestamp when this token expires. Defaults to being set
    // dynamically at runtime based on the TTL. Expires and TTL are mutually exclusive.
    Expires *metav1.Time `json:"expires,omitempty"`

Example 2

    // TLSBootstrapToken is a token used for TLS bootstrapping.
    // If .BootstrapToken is set, this field is defaulted to .BootstrapToken.Token, but can be overridden.
    // If .File is set, this field **must be set** in case the KubeConfigFile does not contain any other authentication information
    TLSBootstrapToken string `json:"tlsBootstrapToken,omitempty"`

tengqm on 19 Nov 2020

Unless Go adopts a (strong) convention for Markdown, or other formatting, in comments, I recommend _against_ automatically inferring formatting from comments.

sftim on 19 Nov 2020

@sftim

Unless Go adopts a (strong) convention for Markdown, or other formatting, in comments, I recommend against automatically inferring formatting from comments.

it can be a problem in general. but it feels like in k8s we should adopt a style.
yet, as i pointed out above, the API docs at https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/ have had the same problem where comments just talk about PascalCase fields but i don't think i've heard of complaints from users.

@tengqm i think one way to solve this is if field references in comments are always prefixed with ., unless it's the first word in the comment:

// Expires specifies the timestamp when this token expires. Defaults to being set
// dynamically at runtime based on .TTL. .Expires and .TTL are mutually exclusive.
Expires *metav1.Time `json:"expires,omitempty"`

| field | description |
| ---------|------------ |
| expires | expires specifies the timestamp when this token expires. Defaults to being set dynamically at runtime based on ttl. expires and ttl are mutually exclusive. |
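a rough sketch of the post-processing step this convention would allow. `RewriteComment` and the name map are hypothetical; the sketch assumes the tool already knows each field's json name and that comments follow the dot-prefix style strictly.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// dotRef matches a dot-prefixed Go identifier such as ".TTL" or ".Expires".
// The leading group requires the dot to start a token, so a period that
// ends a sentence is not treated as a field reference.
var dotRef = regexp.MustCompile(`(^|\s)\.([A-Z][A-Za-z0-9]*)`)

// RewriteComment replaces the leading Go field name and every dot-prefixed
// field reference with the matching json name. Unknown names are left as-is.
func RewriteComment(goName, comment string, jsonNames map[string]string) string {
	if name, ok := jsonNames[goName]; ok && strings.HasPrefix(comment, goName+" ") {
		comment = name + comment[len(goName):]
	}
	return dotRef.ReplaceAllStringFunc(comment, func(m string) string {
		parts := dotRef.FindStringSubmatch(m)
		if name, ok := jsonNames[parts[2]]; ok {
			return parts[1] + name
		}
		return m
	})
}

func main() {
	names := map[string]string{"Expires": "expires", "TTL": "ttl"}
	comment := "Expires specifies the timestamp when this token expires. " +
		"Defaults to being set dynamically at runtime based on .TTL. " +
		".Expires and .TTL are mutually exclusive."
	fmt.Println(RewriteComment("Expires", comment, names))
}
```

chained references such as .BootstrapToken.Token are only rewritten at the first segment here, so a real tool would need a rule for nested fields.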

neolit123 on 19 Nov 2020

adopt a style

Maybe a KEP? It does need buy-in.

sftim on 19 Nov 2020

it does. until then we can leave the field descriptions unprocessed, but continue to think how to publish this documentation.
by that i mean, we can publish the component config documentation with unprocessed descriptions.

neolit123 on 19 Nov 2020

👍1

Doesn't block either PR but registrycredentials.k8s.io/v1alpha1 from https://github.com/kubernetes/website/pull/24929 might be relevant here too.

sftim on 19 Nov 2020

ABAC policy is also mentioned in https://github.com/kubernetes/website/issues/23885

sftim on 19 Nov 2020