Sphinx: Regular expressions removed from sphinx/domains/c.py (e.g. c_funcptr_sig_re) break downstreams

Created on 6 Apr 2020 · 17 Comments · Source: sphinx-doc/sphinx

Describe the bug

c_funcptr_sig_re, among other regular expressions, has been removed from sphinx/domains/c.py in commit 0f49e30c51. These are used in the Linux kernel HTML documentation scripts.

When trying to compile the HTML documentation for Linux using sphinx 3.0.0 this leads to:

  SPHINX  htmldocs --> file:///build/linux-rt/src/linux-5.6.2/Documentation/output
  PARSE   include/uapi/linux/dvb/audio.h
  PARSE   include/uapi/linux/dvb/ca.h
  PARSE   include/uapi/linux/dvb/dmx.h
  PARSE   include/uapi/linux/dvb/frontend.h
  PARSE   include/uapi/linux/dvb/net.h
  PARSE   include/uapi/linux/dvb/video.h
  PARSE   include/uapi/linux/videodev2.h
  PARSE   include/uapi/linux/media.h
  PARSE   include/uapi/linux/cec.h
  PARSE   include/uapi/linux/lirc.h
Running Sphinx v3.0.0

Extension error:
Could not import extension cdomain (exception: cannot import name 'c_funcptr_sig_re' from 'sphinx.domains.c' (/usr/lib/python3.8/site-packages/sphinx/domains/c.py))
make[1]: *** [Documentation/Makefile:81: htmldocs] Error 2

This change was introduced without any notification in the changelog of this release and breaks downstream builds of the Linux Kernel in distributions (when building the HTML documentation).
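
For downstreams hit by this, one option is to probe for the private symbol instead of importing it unconditionally, so the extension can degrade gracefully rather than abort the build. A minimal sketch, assuming the extension can cope with the symbol being absent; `load_symbol` and `HAVE_LEGACY_C_DOMAIN` are hypothetical names, not part of Sphinx or the kernel tree:

```python
import importlib


def load_symbol(module_name, attr):
    """Return module.attr, or None if the module or the attribute is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, attr, None)


# Sphinx 3.0.0 no longer provides this regex, so the lookup yields None there.
# It also yields None when Sphinx is not installed at all.
c_funcptr_sig_re = load_symbol("sphinx.domains.c", "c_funcptr_sig_re")

# Hypothetical flag an extension could check before touching the old internals.
HAVE_LEGACY_C_DOMAIN = c_funcptr_sig_re is not None
```

This only avoids the hard import failure; any code that relied on the removed regex still needs a replacement or has to be skipped.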

To Reproduce

Steps to reproduce the behavior:

wget https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.6.2.tar.xz
tar xvJf linux-5.6.2.tar.xz
cd linux-5.6.2/
make htmldocs

Expected behavior
The Linux HTML documentation is generated.

Your project
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/sphinx/cdomain.py#n40

Environment info

  • OS: Arch Linux
  • Python version: 3.8.2
  • Sphinx version: 3.0.0
  • Sphinx extensions: sphinx_rtd_theme
  • Extra tools:

c enhancement

All 17 comments

I may have missed it, but I do not believe that functionality was documented before, and it was thus an implementation detail.
The good news is that I believe you can get by with simply removing all of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/sphinx/cdomain.py. The Sphinx implementation is much improved now and will handle function-like macros and much more.

Hm, it seems I have to remove 'cdomain' from conf.py (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/conf.py#n39).

This seems to work for the time being. Not entirely sure whether the output will be the same now though.
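
Rather than deleting the entry outright, conf.py could load 'cdomain' only on Sphinx versions that still ship the old internals. A rough sketch, assuming a (3, 0) cutoff as reported in this issue; `select_extensions` is a hypothetical helper and the base extension list is only an illustration:

```python
def select_extensions(sphinx_version, base_extensions):
    """Return the extension list, adding 'cdomain' only for Sphinx < 3.0.

    sphinx_version is a (major, minor) tuple; in a real conf.py it could be
    taken from sphinx.version_info[:2].
    """
    extensions = list(base_extensions)
    if tuple(sphinx_version) < (3, 0):
        # The old regex-based C domain internals are still present here.
        extensions.append("cdomain")
    return extensions


# Illustrative use with made-up inputs:
extensions = select_extensions((3, 0), ["kerneldoc", "rstFlatTable"])
```

With this, older Sphinx installations keep the previous behaviour while 3.x builds skip the extension; the output may still differ between the two, as noted above.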

Right, I may have been a bit too fast in my scanning of the file. Note again that all of this is playing with internals of Sphinx, so not too many guarantees can be made.

The duplicate description handling may not be needed any more, or at least it may hide some underlying problems in the documentation elsewhere. I suggest starting out without this.

The handling of function-like macros is redundant and can simply be removed.

The :name: option is quite a tricky construct, which is not really possible any more. As far as I could find there are 256 instances of it, mostly (all?) related to different ways of calling ioctl.
Taking media/uapi/cec/cec-ioc-adap-g-caps.rst as an example, it currently has the rst:

Name
====

CEC_ADAP_G_CAPS - Query device capabilities

Synopsis
========

.. c:function:: int ioctl( int fd, CEC_ADAP_G_CAPS, struct cec_caps *argp )
    :name: CEC_ADAP_G_CAPS

Arguments
=========

``fd``
    File descriptor returned by :c:func:`open() <cec-open>`.

``argp``

Changing it to the following may be a way forward:

Name
====

.. c:macro:: CEC_ADAP_G_CAPS

   Query device capabilities

Synopsis
========

:c:expr:`ioctl(fd, CEC_ADAP_G_CAPS, argp)`

Arguments
=========

:c:expr:`int` ``fd``
    File descriptor returned by :c:func:`open() <cec-open>`.

:c:expr:`struct cec_caps *` ``argp``

(Except: cec-open is not a function name (not sure what was meant with the xref))

Note that c:expr is a new role for typesetting C types or C expressions in text (not declaring things).
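
To see which files would need this kind of rewrite, the c:function directives that carry a :name: option can be listed with a small script. A sketch only, assuming the option sits directly under the directive line as in the example above; `find_named_c_functions` is a hypothetical helper, not an existing tool:

```python
def find_named_c_functions(rst_text):
    """Return (name, signature) pairs for c:function directives using :name:."""
    results = []
    current = None  # signature of the directive whose options are being read
    for line in rst_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(".. c:function::"):
            current = stripped[len(".. c:function::"):].strip()
        elif current is not None and stripped.startswith(":name:"):
            results.append((stripped[len(":name:"):].strip(), current))
            current = None
        elif not stripped.startswith(":"):
            # A blank or content line ends the directive's option block.
            current = None
    return results


sample = (
    ".. c:function:: int ioctl( int fd, CEC_ADAP_G_CAPS, struct cec_caps *argp )\n"
    "    :name: CEC_ADAP_G_CAPS\n"
    "\n"
    "Arguments\n"
)
print(find_named_c_functions(sample))
```

Each hit gives the macro name to declare with c:macro and the signature to turn into a c:expr synopsis.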

I will probably add a "namespace" directive to the C domain so fd and argp can be declared as local convenience symbols and cross-referenced in the synopsis. (The directives will be available in Sphinx 3.1.) E.g., it could become:

Name
====

.. c:macro:: CEC_ADAP_G_CAPS

   Query device capabilities

Synopsis
========

.. c:namespace:: @desc_CEC_ADAP_G_CAPS

:c:expr:`ioctl(fd, CEC_ADAP_G_CAPS, argp)`

Arguments
=========

.. c:var:: int fd

    File descriptor returned by :c:func:`open() <cec-open>`.

.. c:var:: struct cec_caps *argp

See the Sphinx documentation for the C++ domain's namespace directive and anonymous entities.

The namespacing directives are in the 3.x branch (future v3.1) so I'll close this for now. Feel free to reopen, or open other issues, if you encounter problems.

Hey, any updates on this?

I get an error when running 'make htmldocs' for linux 5.7

$ make htmldocs
SPHINX htmldocs --> file:///home/dwls/devel/media_tree/Documentation/output
make[2]: Nothing to be done for 'html'.
Running Sphinx v3.0.4
enabling CJK for LaTeX builder

Extension error:
Could not import extension cdomain (exception: cannot import name 'c_funcptr_sig_re' from 'sphinx.domains.c' (/usr/lib/python3.8/site-packages/sphinx/domains/c.py))
make[1]: *** [Documentation/Makefile:82: htmldocs] Error 2
make: *** [Makefile:1595: htmldocs] Error 2

> Hey, any updates on this?

Well, not other than 3.1 has been released, so the namespacing trick shown above can be used. Otherwise I believe there is nothing left to do on the Sphinx side. If new features are desired, feel free to open issues.

@dwlsalmeida did you follow the kernel-docs and run make in a virtualenv?

If you have problems with the kernel build, please first read the kernel-docs, then search or contact the linux-doc mailing list.

update 2020-08-12: linux-doc ML "Documentation: build failure with sphinx >= 3.0.0"

@jakobandersen @return42 Hey thanks for chiming in!

It works after running make in a virtualenv.

Thanks.

@jakobandersen I'm the maintainer of the media subsystem of the Linux Kernel.

Just to give some context: when dealing with the Linux Kernel documentation, we need to document system calls. The most important system calls are implemented via libc, usually with a userspace function that has the same name (or a similar name) as the actual system call.

As the goal of such documentation is to document how a userspace program will do a system call, we ended up referring to the libc name for it. So, a userspace program opening a device node under /dev/<foo> would simply use open (devnode_name, flags). However, internally, the Kernel has per-subsystem handlers for such system calls that actually depend on the type of the device node.

So, while it could make sense to have global Kernel documentation for some system calls like open() and close(), some subsystems need to have special documentation for them, as opening a device does different things than opening a file: it may even load firmware at open() time, return different errors, and impose restrictions if more than one thread (or different programs) tries to do concurrent open operations.

Such subsystem-specific system call documentation may refer to a global system call definition in the main namespace (although there are currently no such cross-references).

In the specific case of ioctl(), its meaning is completely dependent on the ioctl number. Using .. c:macro:: solved this specific one. Thanks for that!

However, taking the HDMI Consumer Electronics Control (CEC) subsystem as an example, all files under the toctree:: for Part V - Consumer Electronics Control API should point to the specific open(), close() and poll() documentation from the files included in the cec-api.rst file.
As an exercise, I tried to use the .. c:namespace:: there, in order to split the system calls from CEC from the system calls declared by other subsystems:

It didn't work. I'm still getting these warnings, as the documentation of open() and close() conflicts with the ones already defined by other subsystems in the top namespace.

Documentation/userspace-api/media/cec/cec-func-close.rst:23: WARNING: Duplicate C declaration, also defined in 'media/dvb/video-fclose'.
Declaration is 'int close( int fd )'.
Documentation/userspace-api/media/cec/cec-func-close.rst: WARNING: Duplicate C declaration, also defined in 'media/dvb/video-fclose'.
Declaration is 'int close(int fd)'.
Documentation/userspace-api/media/cec/cec-func-close.rst: WARNING: Duplicate C declaration, also defined in 'media/dvb/video-fclose'.
Declaration is 'int fd'.
Documentation/userspace-api/media/cec/cec-func-open.rst:22: WARNING: Duplicate C declaration, also defined in 'media/dvb/video-fopen'.
Declaration is 'int open( const char *device_name, int flags )'.
Documentation/userspace-api/media/cec/cec-func-open.rst: WARNING: Duplicate C declaration, also defined in 'media/dvb/video-fopen'.
Declaration is 'int open(const char *deviceName, int flags)'.
Documentation/userspace-api/media/cec/cec-func-open.rst: WARNING: Duplicate C declaration, also defined in 'media/dvb/video-fopen'.
Declaration is 'int flags'.

By the way, I'm building it with Sphinx 3.2.1.
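
When triaging warnings like these, it can help to group them by the pair of documents involved. A minimal sketch, assuming the two-line warning layout shown above; `DUPLICATE_RE` and `group_duplicates` are hypothetical names, not part of Sphinx:

```python
import re
from collections import defaultdict

# Matches a "Duplicate C declaration" warning and the "Declaration is" line
# that follows it; the ":<line>" part after the file name is optional.
DUPLICATE_RE = re.compile(
    r"^(?P<file>[^:\s]+)(?::(?P<line>\d+))?: WARNING: Duplicate C declaration, "
    r"also defined in '(?P<other>[^']+)'\.\s*\n"
    r"Declaration is '(?P<decl>.*)'\.$",
    re.MULTILINE,
)


def group_duplicates(log_text):
    """Map (file, other_document) to the set of conflicting declarations."""
    groups = defaultdict(set)
    for match in DUPLICATE_RE.finditer(log_text):
        key = (match.group("file"), match.group("other"))
        groups[key].add(match.group("decl"))
    return dict(groups)


log = (
    "Documentation/userspace-api/media/cec/cec-func-close.rst:23: WARNING: "
    "Duplicate C declaration, also defined in 'media/dvb/video-fclose'.\n"
    "Declaration is 'int close( int fd )'.\n"
    "Documentation/userspace-api/media/cec/cec-func-close.rst: WARNING: "
    "Duplicate C declaration, also defined in 'media/dvb/video-fclose'.\n"
    "Declaration is 'int fd'.\n"
)
for (rst_file, other), decls in group_duplicates(log).items():
    print(rst_file, "<->", other, sorted(decls))
```

Each group then points at one pair of files that would need separate namespaces.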

Thanks for the extra context. Feel free to bring more examples so we can find a good solution. (Do you have a link to where I can get the code with the example?)
The namespacing should work. Judging from the first patch at the link you sent, it looks like Documentation/userspace-api/media/cec/cec-api.rst got namespacing, but Documentation/userspace-api/media/cec/cec-func-close.rst didn't.

Just saw the question before the patch: yes, the namespacing is per rst file.

> Thanks for the extra context. Feel free to bring more examples so we can find a good solution. (Do you have a link to where I can get the code with the example?)
>
> The namespacing should work. Judging from the first patch at the link you sent, it looks like Documentation/userspace-api/media/cec/cec-api.rst got namespacing, but Documentation/userspace-api/media/cec/cec-func-close.rst didn't.

I placed the code here:

https://git.linuxtv.org/mchehab/experimental.git/log/?h=test_cec_sphinx3

You can build the docs with:

  $ make htmldocs

But this will take a long time to build, and will generate about 3K warnings due to the lack of the cdomain extension. (Btw, it seems that building the docs with Sphinx 3.x takes twice as long as it used to take with Sphinx 2.x.)

In order to get a cleaner build, I'm doing, instead:

make SPHINXDIRS=userspace-api htmldocs

Thanks.

There was a bug in 3.x, fixed for 3.2.1, that had a massive impact on the run time. Feel free to open an issue if there are still performance problems.

> Just saw the question before the patch: yes, the namespacing is per rst file.

It would be a lot better if the namespace can be defined together with the file containing the TOC.
Just the V4L doc has 229 files:

ls Documentation/userspace-api/media/v4l/*.rst|wc -l
229

Every time a syscall is referred there, it is for the V4L subsystem.

The same pattern applies to other subsystems. Btw, we have files using an extension that converts plain C files into markups (called kernel-doc). If the file that includes them would belong to a c namespace, all such markups should also be considered to belong to the same domain by default.

Another thing: how can the namespace be overridden on a reference? I mean, if we have two references (like the c:function::open() that could be declared in the virtual filesystem namespace and also in the V4L2 namespace), how can a single rst file reference both?

> It would be a lot better if the namespace can be defined together with the file containing the TOC.

This is unfortunately not something that can be addressed in the C domain alone. Each rst file is parsed into doctrees individually, and only in a later build phase are cross-references resolved, along with a lot of other transformations.

> The same pattern applies to other subsystems. Btw, we have files using an extension that converts plain C files into markups (called kernel-doc). If the file that includes them would belong to a c namespace, all such markups should also be considered to belong to the same domain by default.

Note that the namespacing affects all declarations, also for structs. Assuming that is fine, my best recommendation is somehow to get an appropriate namespace directive inserted in the top of the relevant files. There are configuration variables in Sphinx for inserting extra rst (https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-rst_prolog), but they are applied to all documents.

> Another thing: how can the namespace be overridden on a reference? I mean, if we have two references (like the c:function::open() that could be declared in the virtual filesystem namespace and also in the V4L2 namespace), how can a single rst file reference both?

You would need to qualify the name with a leading dot to make the lookup start at global scope. E.g.,

.. c:function:: int open(const char *path)


.. c:namespace:: @cec

.. c:function:: int open(const char *path)


.. c:namespace:: @cec.@foo

.. c:function:: int open(const char *path)

- The root: :c:func:`.open`
- In cec: :c:func:`.@cec.open` or :c:func:`@cec.open`
- In cec.foo: :c:func:`.@cec.@foo.open` or :c:func:`open`

Or with prettier names:

- The root: :c:func:`open <.open>`
- In cec: :c:func:`open <@cec.open>`
- In cec.foo: :c:func:`open <@foo.open>`
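
The lookups above can be pictured with a toy resolver. This is a simplified model written for illustration, not Sphinx's actual implementation: it assumes an unqualified name is searched from the current scope outwards and that a leading dot restricts the search to the global scope. `resolve` and `SYMBOLS` are hypothetical names:

```python
# Fully qualified declarations from the example above, as tuples of components.
SYMBOLS = {
    ("open",),
    ("@cec", "open"),
    ("@cec", "@foo", "open"),
}


def resolve(symbols, current_scope, target):
    """Resolve a cross-reference target to a fully qualified name, or None."""
    parts = target.split(".")
    if parts[0] == "":
        # Leading dot: look the name up from the global scope only.
        candidate = tuple(parts[1:])
        return candidate if candidate in symbols else None
    scope = tuple(current_scope)
    while True:
        candidate = scope + tuple(parts)
        if candidate in symbols:
            return candidate
        if not scope:
            return None
        scope = scope[:-1]  # not found here, retry one scope further out


scope = ("@cec", "@foo")  # i.e. after ".. c:namespace:: @cec.@foo"
for target in (".open", "@cec.open", "open", "@foo.open"):
    print(target, "->", resolve(SYMBOLS, scope, target))
```

In this model the first three targets resolve to the root, cec and cec.foo declarations respectively, and `@foo.open` also lands on the cec.foo one because `@foo` is found while walking outwards from the current scope.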

> This is unfortunately not something that can be addressed in the C domain alone. Each rst file is parsed into doctrees individually, and only in a later build phase are cross-references resolved, along with a lot of other transformations.

Ok. This will make it harder to maintain the Kernel documentation, but let's try to live with that.

Btw, I just opened an issue related to the Kernel documentation build:

https://github.com/sphinx-doc/sphinx/issues/8241

Building the docs with Sphinx 3.x generates 1725 warnings. While some of them are due to namespace conflicts, others seem to be caused by issues in the new Sphinx C domain code.
