Or-tools: Improve python detailed documentation

Created on 5 Sep 2018 · 15 Comments · Source: google/or-tools

Congratulations on this Python wrapper optimization package. I've heard about it several times in my data science community, always with positive reviews, so I thought about using it for my next project.

However, it is very difficult for me to understand all the methods, because there is no full reference documentation (my data science colleagues share this feeling; they all complain about it). Only those with previous experience doing optimization in code (such as C) have had a smooth transition to or-tools.

The examples are great, but I can't keep up with the parameters passed to each function, as there is no explanation of them. If I inspect the modules, I can't see any reference to parameters, as they're all classes pointing to something else (maybe this is not completely true; I'm not a software engineer, so I can't follow it). Another example: Google recommends the CP-SAT solver over the normal CP module, but there is no example for CP-SAT. How can I learn it without docs?

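I've done some research and found this (https://github.com/google/or-tools/issues/172), but it still doesn't accomplish what documentation normally does. In the full reference here (https://developers.google.com/optimization/reference/constraint_solver/constraint_solver/) I can only see a reference for C++ that has been generated automatically and is hard to follow.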

For example, Python Constraint, which is a smaller package than or-tools, has concise but great documentation where a newcomer can easily get an overview of all constraints and examples: Python Constraint docs (http://labix.org/doc/constraint/).
Is there something like it in or-tools?

I feel like both Python and optimization are growing in demand, and the Python community doesn't yet have a "go-to" optimization library (as it does for machine learning), so or-tools could be it. However, the docs could stop some newcomers from learning this great package.

Feature Request Python

Most helpful comment

Above all, I think the package is very good (and that's why it has been recommended to me several times). The only issue I see is that it is difficult to learn due to the lack of docs. For example, today I needed to look for external help in order to learn how to use the CP solver, and found this article (http://daft.engineer/hacks-and-kludges/stumbling-through-or-tools/). The article was very helpful, but its author experienced the same as me when he started:

The other day I pip installed google's or-tools and started madly optimizing every facet of my life, using constraint programming. Er, well, actually I've been futzing about solving toy problems and trying to figure out how to use or-tools?. You see or-tools is written in C++ and the python interface is a pretty basic wrapper. It is not particularly self-documenting (pythonically), and the official docs and examples are only very basic models in python

There are lots of more complex features that are not clearly explained in the python examples, and it isn't always obvious how to use them from the docstrings, so I'm going to take a moment and run through some of my hiccups.

Orwant, some specific issues are:

- There are no Python docs I can read to get started with or-tools. So, in order to learn, I need to follow along with your examples (which are great, by the way). While coding in the editor and following those examples, I don't know what parameters each function has. Inspection, autocompletion or IPython docs (which are used, for example, in Jupyter and Spyder) won't show anything. For example, in Sklearn:
[image: https://user-images.githubusercontent.com/33136768/45121676-1ca09f00-b162-11e8-9470-67cdbc83b1e6.png]

While in or-tools, nothing will show up, even with inspection:

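[image: https://user-images.githubusercontent.com/33136768/45120805-84091f80-b15f-11e8-9e82-d1e86291e1f3.png]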

Then I need to go to the file provided by Laurent, and then I understand what parameters each function has. Great!

def NewIntVar(self, lb, ub, name):
    """Creates an integer variable with domain [lb, ub]."""
    return IntVar(self.__model, [lb, ub], name)

The problem is that now I don't know what types should be passed to each parameter. How do I know whether the arguments passed to lb or ub are strings? How do I know whether they're another Solver method or even another or-tools class?
Then I need to go to the examples, and only then do I realize that I should be passing solver.Infinity() for the upper bound (how could I have known this from the docs?).

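[image: https://user-images.githubusercontent.com/33136768/45123141-33e18b80-b166-11e8-90df-866898776f3e.png]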
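To make the gap concrete, here is a minimal sketch of what signature inspection can recover from a method written like the one quoted above. The function is a stand-in copied from that snippet (its body is an assumption, not the installed or-tools code), so it runs without the package. The signature yields parameter names, but no types and no defaults.

```python
import inspect


def NewIntVar(self, lb, ub, name):
    """Creates an integer variable with domain [lb, ub]."""
    # Stand-in body (assumption): the real method builds an IntVar object.
    return (lb, ub, name)


sig = inspect.signature(NewIntVar)
for param in sig.parameters.values():
    # Parameter.empty marks a missing annotation or a missing default.
    annotated = param.annotation is not inspect.Parameter.empty
    has_default = param.default is not inspect.Parameter.empty
    print(f"{param.name}: annotated={annotated}, default={has_default}")
```

Every parameter reports no annotation and no default, which is why an editor tooltip has nothing to say about whether lb and ub are integers, floats or strings.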

TensorFlow docs, for example, state very clearly the parameters and arguments (element types, defaults and string options) that should be passed to tf.Variable:

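[image: https://user-images.githubusercontent.com/33136768/45123434-2bd61b80-b167-11e8-85f1-7bfcb45cefb5.png]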

Another problem is easily finding the methods/attributes of or-tools classes. For example, with LightGBM (a popular Microsoft gradient boosting package) it is easy to learn the package even from inside the editor (it is the same for almost all popular data science packages in Python):

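[image: https://user-images.githubusercontent.com/33136768/45123660-e5cd8780-b167-11e8-8d38-cd48bd2e769f.png]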

In my opinion, the docs should contain the following points for each function or method of the solver:
- list of arguments with defaults
- types/possibilities of arguments (float, solver.Infinity method, boolean, string, etc.)
- toy examples
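As a rough sketch of what those three points could look like in a single docstring, consider the function below. `new_int_var` is a hypothetical helper written for this illustration only; it is not part of or-tools, and returning a `range` is just a convenient way to make the domain visible.

```python
def new_int_var(lb: int, ub: int, name: str = "") -> range:
    """Create an integer variable with inclusive domain [lb, ub].

    Hypothetical helper for illustration only; not part of or-tools.

    Args:
        lb: Lower bound (int), included in the domain.
        ub: Upper bound (int), included in the domain; must be >= lb.
        name: Optional label (str). Defaults to "". Accepted only to
            mirror the three-argument signature discussed in this thread.

    Returns:
        A range holding every value the variable may take.

    Example:
        >>> list(new_int_var(0, 3, "x"))
        [0, 1, 2, 3]
    """
    if ub < lb:
        raise ValueError(f"ub ({ub}) must be >= lb ({lb})")
    # range() excludes its stop value, so add 1 to keep ub in the domain.
    return range(lb, ub + 1)


print(list(new_int_var(0, 3, "x")))
```

With annotations and an Args section in place, tools such as `help()` or an editor tooltip can show types, defaults and a toy example without a trip to the examples directory.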

In addition, the docs should be organized from a functional perspective. By this I mean having a docs page that lists only the available global constraints (as in, for example, the Python Constraint package I mentioned before). In or-tools I can only find all the methods of the solver class mixed together, without explanations and without any structure.
It should also contain things like a page with examples of how to combine constraints (after long research in external resources, I realized that the OR (|) boolean operation needs to be done with the solver.Max() method).
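The remark above about expressing a boolean OR with a max rests on a simple identity: for values restricted to 0 and 1, the maximum of two values equals their logical OR. The sketch below checks that identity with plain Python built-ins; it makes no solver calls, and how a given solver exposes its max expression is not shown here.

```python
from itertools import product

# Each row is (a, b, max(a, b), a OR b) for 0/1 values of a and b.
rows = [
    (a, b, max(a, b), int(bool(a) or bool(b)))
    for a, b in product((0, 1), repeat=2)
]

for a, b, via_max, via_or in rows:
    print(a, b, via_max, via_or)
```

The last two columns agree on all four rows, which is the kind of short worked example a "combining constraints" page could include.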

I know this is not an easy request and that it requires a lot of resources. However, there are several recent cases of Python data science packages becoming very popular because of their good docs, and right now it seems like the Python optimization space is looking for a go-to library.

All 15 comments

For CP-SAT,

Please have a look at:
https://github.com/google/or-tools/blob/master/ortools/sat/doc/index.md

Furthermore, all python API for this solver is contained here:
https://github.com/google/or-tools/blob/master/ortools/sat/python/cp_model.py
This is a good reference manual. Just look at the CpModel class.

You can run pydoc on it. I have not yet found how to transform that into nice documentation I can embed.
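For readers who have not used pydoc, the sketch below shows the kind of text it produces from a docstring. `ModelSketch` is a hypothetical stand-in class (an assumption for illustration, not the or-tools CpModel), so the snippet runs without or-tools installed.

```python
import pydoc


class ModelSketch:
    """Hypothetical stand-in for a model class; not the or-tools CpModel."""

    def NewIntVar(self, lb, ub, name):
        """Creates an integer variable with domain [lb, ub]."""
        return (lb, ub, name)


# render_doc returns the documentation text as a string;
# the plaintext renderer avoids terminal bold markup.
text = pydoc.render_doc(ModelSketch.NewIntVar, renderer=pydoc.plaintext)
print(text)
```

From a shell, `python -m pydoc ortools.sat.python.cp_model` prints the same kind of text for the real module, assuming the package is installed, and `python -m pydoc -w <module>` writes an HTML page instead.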

At least, it should get you started.
Laurent Perron | Operations Research | [email protected] | (33) 1 42 68 53 00


Laurent, thanks for your answer. The cp_model.py file helps (it is feasible), but I think it could improve (and reach the optimal value :-) ).

Let me show my mental process (possibly the same for other newcomers) when reading the docs:
For example, if I want to initialize a new IntVar, I would go to the NewIntVar method:

  def NewIntVar(self, lb, ub, name):
    """Creates an integer variable with domain [lb, ub]."""
    return IntVar(self.__model, [lb, ub], name)

Then I would see that it needs the "lb", "ub" and "name" parameters. As there is no explanation, at first I don't know what those are. Then I go to the examples, see NewIntVar in action, and deduce that they refer to the lower bound and upper bound. Next, I don't know how to state the bounds or what values each argument can take, so I need to go to other examples and see this method in action.
Just to declare one IntVar (which should be the easiest part), I need extensive research to understand how to use it.

I know this might be seen as a "picky" request, and it might not be needed at all. I just want to share my own experience, and that of some colleagues, when starting to use or-tools, and to highlight a barrier that might make people stop using this package when they're starting out.

Sorry, I do not get your specific point. I know the doc is meant for people familiar with OR (CP, MIP) modeling. But in this particular case, the docstring says:
Creates an integer variable with domain [lb, ub]
Why is this not a clear definition of lb, ub?

David, is your issue a) that you had to visit the examples to understand how to invoke NewIntVar(), or is it b) that even after visiting the example, you don't know the full semantics (e.g., how large the upper bound can be, or what restrictions there are when naming the variable)?

I do think that we should rename "lb" to "lower_bound" and "ub" to "upper_bound", but beyond that I'm unclear what you think should be done.

(Note that the use of square brackets in [lb, ub] indicates inclusive bounds, e.g., >= and <=; this is a mathematical convention.)

And I just pushed a markdown reference file for the Python CP-SAT interface:

https://github.com/google/or-tools/blob/master/ortools/sat/doc/reference.md

Not perfect, but this is a first try. I used pydoc-markdown for that.


Thank you, David. Your explanation is very helpful, and we'll put it to good use as we expand the documentation.

Yes, thank you.

So I checked IntelliSense.

In Jupyter, it works, but it is quite brittle; I had to restart the kernel multiple times.
In Visual Studio Code, it is very reliable, but the display loses all formatting.


There are two kinds of Python files:

  1. Hand-crafted Python files:
    i.e. all files in ortools/*/python/*.py
    e.g. https://github.com/google/or-tools/blob/master/ortools/sat/python/cp_model.py
    For SAT, @lperron is on his way to having clean docstrings that any clever IDE should be able to parse,
    and you can run doxygen (or pydoc?) on cp_model.py to generate an HTML API reference...

  2. SWIG Python wrapper API:
    i.e. all files generated during the build (make python) in ortools/gen/ortools/*/pywrap*.py
    e.g. ortools/gen/ortools/linear_solver/pywraplp.py
    Currently they don't contain any documentation, since SWIG only generates the raw wrapper, but we should be able to generate a minimal NumPy-doc-style docstring to get the type of each parameter...
    ref: http://www.swig.org/Doc3.0/SWIGDocumentation.html#Python_nn65

Thank you all for this. This will be a huge step forward for newcomers to optimization like me who want to stick to a reliable tool from the start.

One problem I had while reading the source is that it is rather hard for someone who has never used SWIG to understand which methods are actually called by solver.solve().
If you don't want to use or-tools as a black box, some schematics (flow charts) would be helpful. They don't have to be detailed, just showing the main function calls. Some flow charts explaining the implementation of the solvers (such as search heuristics, or what happens if you have multiple strategies) would also be nice to have. It would stop people like me from asking a lot of questions :)

Quick milestone,

There are plenty of examples and documentation for CP-SAT.

we are

I consider this closed.
