Sphinx: Application object should allow custom builders to be passed in

Created on 25 Apr 2016 · 14Comments · Source: sphinx-doc/sphinx

At the moment, the Sphinx object calls _init_builder which checks self.builderclasses.

In my situation, I'm calling Sphinx() programmatically via python. So far I've only seen custom builders (such as https://github.com/shirou/sphinxcontrib-indesignbuilder/blob/master/sphinxcontrib/indesignbuilder.py) are ones added via extension. I think this makes it odd to test since its more inclined toward the flow of using it via CLI, where as I'm using the sphinx object directly.

My temporary solution is to enter a fake buildername to Sphinx ('html'), then do app.add_builder(MyBuilderClass) and app._init_builder('mybuilderclass').

I'd prefer to add an optional keyword argument where a custom builder class could be passed. Any disagreements to that?

Source

tony

Most helpful comment

Even if you were to use confoverrides, you'd still be entering a fake buildername.

No, you don't need to give a fake buildername.
Following code might work well.

app = Sphinx(..., buildername='custombuilder', confoverrides={'extensions': 'your.custom.builder'})

tk0miya on 27 Apr 2016

👍2

All 14 comments

Could you give some more detail about your use case? I'm not yet convinced you cannot call add_builder from either your conf.py or an extension (which could be added via Sphinx(confoverrides=).)

lehmannro on 26 Apr 2016

@lehmannro: Hey! Hope you're doing well.

I'm building a sphinx-based django site. I need to invoke it pythonically, rather than via Makefile. I have multiple sphinx projects which are going to be generated. So I'm using sphinx to programmatically build a lot of objects with a custom builder, so what I find is I'm entering a false buildername just to add it and switch it every time.

Also, since the project build is being triggered via the application object (and not sphinx-build/makefile), there is even more impetus on declaring the builder from the outset.

Builders, in the big picture represent whats being targeted (kinda of like cmake -G Ninja, cmake -G "Unix Makefiles"). They're of a higher order than a normal project settings or extensions you'd set in configuration.

I'm not yet convinced you cannot call add_builder from either your conf.py or an extension (which could be added via Sphinx(confoverrides=).)

You can. This is meant for cases where sphinx is being invoked via Make. And I already use it that way in normal cases on normal projects. In this case, I'm using a lot of the internal sphinx machinery programatically.

In my example, I show how I add the custom builder programmatically. Even if you were to use confoverrides, you'd still be entering a fake buildername.

Most important thing to know is this doesn't change behavior in anyway! The application object by design already requires a buildername. Accepting an optional builder class in init is more consistent with how sphinx views the builder. This doesn't introduce any incompatibilities, but makes things much more straight forward when invoking the app via Python. :smile:

tony on 26 Apr 2016

Even if you were to use confoverrides, you'd still be entering a fake buildername.

No, you don't need to give a fake buildername.
Following code might work well.

app = Sphinx(..., buildername='custombuilder', confoverrides={'extensions': 'your.custom.builder'})

tk0miya on 27 Apr 2016

👍2

Most important thing to know is this doesn't change behavior in anyway! The application object by design already requires a buildername. Accepting an optional builder class in init is more consistent with how sphinx views the builder.

While it's true that builders are _important_ in Sphinx, they're not magic in any way. You can add them to your Sphinx object just like you add directives, autodocumenters, or anything else. My main concern with this PR is it's going down a slippery slope (and Sphinx() already _has_ a very loaded signature): If we're adding _another_ way to call app.add_builder, why not one for app.add_config_value, or add.add_source_parser, or ...?

_There should be one-- and preferably only one --obvious way to do it._

This being said, maybe we can instead improve the documentation to cover your case?

lehmannro on 27 Apr 2016

👍1

@tk0miya, @lehmannro Hey guys, I hope all is well!

The scope of this issue is only to add an optional argument for a builder class (#2489). I am open to other solutions. I am also expanding the topic a bit, because I believe I have a different perspective on how I view proposed changes to the internal API. I am working on a full time project using sphinx internals.

To point out some context: I have a couple of customizations I have on top of sphinx that may be PR worthy, but this one in particular really is like nails on the chalkboard when invoking sphinx programatically.

While it's true that builders are important in Sphinx, they're not magic in any way. You can add them to your Sphinx object just like you add directives, autodocumenters, or anything else.

There should be one-- and preferably only one --obvious way to do it.

Agreed. For the most part - sometimes there may be a _programatic_ and a _declarative_ to do the same thing. When we're talking about using confoverrides, we're thinking in the mindset of someone invoking sphinx-build. Configurations are for people using that front end.

If I'm using Sphinx programatically, I'm now going out of my way to use config declarations to switch on features already in the app object. In this context, its a race condition. My builder isn't registered, but the buildername is required.

I think its good I'm making the issue because already being a developer on a project, familiarity can disguise itself as obvious.

So here are some talking points I have:

The builder contrib isn't (relatively speaking) an active area compared to custom roles and directives, so the area of development is nascent, no hard opinions exist on whether creating an extension to register and set a builder is the one way.
A bulk of the sphinx extensions that exist tend to be in the form of role and directive functionality. Builders stand out by themselves compared to normal extensions in how they react in to ecosystem.

I think abstracting the concept visibly in the API makes it easier for people who need to understand code. I think the idea of conflating extensions (which are commonly roles, directives, autodoc stuff) with builders is confusing for new hires trying to understand and already large codebase. They can't immediately understand the separation and high level role the builder is.

An analogy I'd put to it is in Django, ORM configs ENGINE point directly to a db backend. You wouldn't be adding it on as a django app. The builder in sphinx is more like an engine in django, and less like an app.

The suggested way of setting builder via confoverrides is still unwieldy. In this case, I'm sending in a dict via a kwarg, that has a string to a module, to get a side-effect of it running setup(), to register a builder for a required argument (buildername) in that same signature.

My main concern with this PR is it's going down a slippery slope (and Sphinx() already has a very loaded signature)

I agree with that sentiment that we don't want to make the signature bigger. I do care about that and want to strike a balance with that.

However, I do find it a good thing to get feedback and add convenience for internal API users.

This being said, maybe we can instead improve the documentation to cover your case?

I'm using the internal machinery of sphinx, it won't be applicable to everyday users invoking sphinx-build.

(As an aside to "There should be one-- and preferably only one --obvious way to do it." I agree.

But I also find that while python projects evolved, it doesn't preclude composability. SQLAlchemy's declarative and classical mappings. These are two ways to do essentially the same thing. It also doesn't preclude configurability, there are tons of articulation points where you can throw in an engine/connection object SQLAlchemy.

There's a high level obvious way to do something, but that doesn't necessarily mean the front-end runs the show and what internals look like. Internal API's are not often not set in stone, in some cases, after trial and error and lots of battle hardening, they can become stable (django's Model._meta api.))

tony on 27 Apr 2016

Hey @tony, I appreciate your input. I remain unconvinced the current API is lacking and the proposed solution is an improvement. I'd ask you to work with us and find another solution to help your use case.

When we're talking about using confoverrides, we're thinking in the mindset of someone invoking sphinx-build. Configurations are for people using that front end.

What makes you say that? If advice from two Sphinx core developers doesn't convince you this is indeed the canonical solution, I'm not sure what would.

The builder contrib isn't (relatively speaking) an active area compared to custom roles and directives, so the area of development is nascent, no hard opinions exist on whether creating an extension to register and set a builder is the one way.

If you look at the existing third-party builders, they all use app.add_builder. What would constitute a "hard opinion" for you?

An analogy I'd put to it is in Django, ORM configs ENGINE point directly to a db backend.

The analogy is ill-suited, because Django's ENGINE _also_ points to a file which then sets up the actual connector (see Third-party Engines) — very much like the extension that sets up a builder.

I think abstracting the concept visibly in the API makes it easier for people who need to understand code. I think the idea of conflating extensions (which are commonly roles, directives, autodoc stuff) with builders is confusing for new hires trying to understand and already large codebase. They can't immediately understand the separation and high level role the builder is.

I don't think this categorization is particulary well-founded: Directives (and others) are completely orthogonal to builders, in that they don't even handle the same stages (parsing vs. roughly writing.) There are other things, such as source parsers, which are as important (one could argue: more important) and don't get any special standing in the internal APIs.

The suggested way of setting builder via confoverrides is still unwieldy. In this case, I'm sending in a dict via a kwarg, that has a string to a module, to get a side-effect of it running setup(), to register a builder for a required argument (buildername) in that same signature.

As long as there's nothing dynamic about your builder — and your previous, elaborate replies haven't indicated such — I don't see anything bad in having all _extensions to Sphinx_ live in a _Sphinx extension_ and then loading it from where you _instrument Sphinx._

How's Sphinx(confoverrides={'extensions': ['my_cool_builder']}) much different from from my_cool_builder import Builder; Sphinx(builderclass=Builder)?

I'm using the internal machinery of sphinx, [the documentation] won't be applicable to everyday users invoking sphinx-build.

I'd be sad if our documentation was so one-sided. There are guides for Extension developers which cover many of the internal APIs — among them also sphinx.application.Sphinx and sphinx.builders.Builder.

Funny enough, that very same document already speaks about your use case: _“This is what you can do in an extension: First, you can add new builders to support new output formats or actions on the parsed documents.”_

SQLAlchemy's declarative and classical mappings. These are two ways to do essentially the same thing.

This example is again ill-suited, as SQLAlchemy's declarative mappings are _on top of_ the classical mappings. I'm all for adding something (say, a stub extension that makes it easier to define setup() than shelling out to another file) _in addition;_ I'm just strongly against mixing and mashing the current API with more convenient layers.

lehmannro on 28 Apr 2016

FYI: I made an example program for registering a builder through confoverrides.
https://github.com/tk0miya/example_for_register_builder

I wish this helps you! (Sorry, I don't have enough time to discuss with you in this morning)

tk0miya on 28 Apr 2016

HEY! :smile:

This thread went off the hooook!

@tk0miya, Thank you for that example! No worries. I will follow up sensei.

@lehmannro I mailed you off list to clarify a few points.

tony on 28 Apr 2016

Now I close this issue.
Please reopen again if you want to discuss more.

Thanks,

tk0miya on 6 May 2016

I don't think the discussion here was finished, albeit the original goal of allowing "custom builders to be passed in" has now changed.

lehmannro on 6 May 2016

_(My reply from our off-list discussion.)_

My one and only concern with this PR is API design. The sphinx.application.Sphinx() object is part of our external contract with developers such as you, and isn't something hidden in a dark shed where we don't care about design questions. There are three reasons I consider this change bad, or at least trending towards a bad direction:

Clarity: The application API (this one constructor method we're talking about) is already quite convoluted. I would prefer if we found other ways to modify the object. An example could be deferring the builder initialization into Sphinx.build(), so that Sphinx() no longer needs either — a buildername or a builderclass.
Consistency: While I agree the builder is important, I like the way it is just another extension point currently. From an extension's view, it's no different to anything else — be it directives, parsers, or autodocumenters. I think if we further the specialness of builders, the same argument could be made for parsers tomorrow.
Convenience: I agree that your use case is legit; I don't think we should just follow the easiest path and tack on a bolt here and there to fix it quickly. _«Simple things should be simple, complex things should be possible.»_

Maybe if we made a wrapper object that allowed something like: Sphinx(confoverrides={'extensions': [InMemoryExtension(lambda app: app.add_builder(MyCoolBuilder)]})

Or, if we just allowed confoverrides= to contain a 'setup' key (like any other config, that's what confoverrides is for), like: Sphinx(confoverrides={'setup': lambda app: app.add_builder(MyCoolBuilder)})

lehmannro on 6 May 2016

An example could be deferring the builder initialization into Sphinx.build(), so that Sphinx()

I'd like to expand on that.

I've tried this last week and found it breaks a lot of tests. It seems mostly due to test infrastructure around the class assuming builder will be ready. There are so many cases where tests infer builder is already set up. It makes it difficult to separate if its merely breaking test behavior or could break other things.

What do you think about this approach? It looks to me like it may entail adjustments to quite a few tests.

I will look deeper into it next week and post more details.

tony on 7 May 2016

An example could be deferring the builder initialization into Sphinx.build(), so that Sphinx() no longer needs either — a buildername or a builderclass.

After peeking more deeply, I've created a proof of concept for moving self._init_builder to self.build. https://github.com/tony/sphinx/tree/defer-init-build

diff --git a/sphinx/application.py b/sphinx/application.py
index 5ec129d..c9ba884 100644
--- a/sphinx/application.py
+++ b/sphinx/application.py
@@ -195,8 +195,6 @@ class Sphinx(object):
         self._init_source_parsers()
         # set up the build environment
         self._init_env(freshenv)
-        # set up the builder
-        self._init_builder(self.buildername)
         # set up the enumerable nodes
         self._init_enumerable_nodes()

@@ -277,6 +275,8 @@ class Sphinx(object):
     # ---- main "build" method -------------------------------------------------

     def build(self, force_all=False, filenames=None):
+        # set up the builder
+        self._init_builder(self.buildername)
         try:
             if force_all:
                 self.builder.compile_all_catalogs()
diff --git a/tests/test_build_texinfo.py b/tests/test_build_texinfo.py

With two examples I've tried, one, my "programatic" access of Sphinx with the custom builder, as well as doing as make html with another sphinx project (tmux), both were able to build fine.

We've seem to have build our test framework around the assumption that builder will be ready in the app object. I wanted to create a temporary solution for fixing the tests by invoking app._init_builder(app.buildername) within the decorators that make our test app objects.

I'm confused why this fails:

diff --git a/tests/util.py b/tests/util.py
index 454cc0d..d224b49 100644
--- a/tests/util.py
+++ b/tests/util.py
@@ -254,6 +254,7 @@ def with_app(*args, **kwargs):
             kwargs['status'] = status
             kwargs['warning'] = warning
             app = TestApp(*args, **kwargs)
+            app._init_builder(app.buildername)
             try:
                 func(app, status, warning, *args2, **kwargs2)
             finally:
@@ -274,6 +275,7 @@ def gen_with_app(*args, **kwargs):
             kwargs['status'] = status
             kwargs['warning'] = warning
             app = TestApp(*args, **kwargs)
+            app._init_builder(app.buildername)
             try:
                 for item in func(app, status, warning, *args2, **kwargs2):
                     yield item

Attached is the output of python tests/run.py -m '^[tT]est' -I py35. test-output.txt.

But this is ok:

diff --git a/tests/util.py b/tests/util.py
index 454cc0d..9eb3417 100644
--- a/tests/util.py
+++ b/tests/util.py
@@ -215,10 +215,12 @@ class TestApp(application.Sphinx):
         self._saved_nodeclasses = set(v for v in dir(nodes.GenericNodeVisitor)
                                       if v.startswith('visit_'))

+
         try:
             application.Sphinx.__init__(self, srcdir, confdir, outdir, doctreedir,
                                         buildername, confoverrides, status, warning,
                                         freshenv, warningiserror, tags)
+            self._init_builder(self.buildername)
         except:
             self.cleanup()
             raise

My only irk wit the above is we're starting to accumulate special effects in the test objects which don't represent real conditions. The test code no longer represents how the real app object behaves.

All tests pass, except one which invoked app.builder.build_all() instead of app.build() a5016da:

======================================================================
ERROR: test_quickstart.test_quickstart_and_build
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/me/work/reStructuredText/sphinx/env/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/Users/me/work/reStructuredText/sphinx/env/lib/python2.7/site-packages/nose/util.py", line 620, in newfunc
    return func(*arg, **kw)
  File "/Users/me/work/reStructuredText/sphinx/tests/util.py", line 291, in new_func
    new_func.__name__ = func.__name__
  File "/Users/me/work/reStructuredText/sphinx/tests/test_quickstart.py", line 276, in test_quickstart_and_build
    app.builder.build_all()
AttributeError: 'NoneType' object has no attribute 'build_all'

tony on 9 May 2016

I'm swinging back to this and have to say, it feels really awkward and adhoc having to declare and extension to register a builder. I just want to be able to put one in.

This basically means I need to create a side effect to get something to work. For a Builder.

This creates more opaqueness and mystery behind a system that's already complex, I'm trying to make sense of so I can use it as utility and customize it.

If you're accessing sphinx programmatically, such as my case, I don't want to register things, I want to be able to put in a class for a builder because it's a duck typed language.

Extensions make a lot of sense when you customize your own configuration and build that way. They don't make much sense when you're overriding internal behavior and just want to get work done. Why go through all the effort to subclass builders and only make them accessible by what feels like a hack.

I'm really considering a fork or just dropping it from my new site. Not to be a pain, but I've been trying about 20 hours to just get translators print()ing. There's way too much separation from the person using the python API and actually getting work done overriding stuff.

tony on 13 Aug 2017

Was this page helpful?

0 / 5 - 0 ratings