Or-tools: CP-SAT Solver Algorithm Description?

Created on 16 Jun 2020 · 3Comments · Source: google/or-tools

What language and solver does this apply to?
All

Describe the problem you are trying to solve.
Hello! I would have posted this to the Google group, as mentioned in the Issues types, but it doesn't look like general users have permission.

I have a Binary Integer Linear program I have solved using CP-SAT. I am at the point where I'm experimenting with the scalability of my problem. Not surprisingly, the problem has constraints that quickly cause the problem to increase in the number of variables as the problem size increases. I've experimented with returning the best solution found with the time limit option. I've found that, if the problem size is big enough it returns Feasible for the solutions found. If I give the solver 10 minutes vs. 60 minutes, a solution that achieved the same objective value is often found for both. In some cases I know it is the optimal solution.

To add a little more, because my problem aims to minimize the time to achieve a set of tasks problem (not job shop scheduling, but sort of similar), I can give it more vs. less time steps in my experiments. I have found for some problem sizes, it will find the optimal with a lower number of time steps given and just return Feasible for a higher number, even though it has returned a solution with the same objective value.

Essentially, in some cases it returns Feasible for cases where I know the solution is optimal. That's completely find and is a behavior that makes sense to me.

What I'm wondering is, do you have any documentation that describes the CP-SAT solver's algorithm and how it has proved it has reached optimality vs. believes it has a feasible solution only? I've found lots of documentation around the various code options, which is why I was experimenting with the time limit. I'd like to make some sort of educated decision for my customer's implementation on how to decide when to let the solver search further vs. stop because it has a "good enough" solution. Or is there an option to stop searching if the best solution found so far has not improve in X minutes.

Describe the solution you'd like
See the last paragraph of the section above this.

Additional context
I can't describe my problem in detail, but imagine that we don't know the min time to complete a set of tasks and the number of time steps is an input to the problem. In some instances, for example, suppose at 10 time steps the problem finds the optimal solution in 30 minutes and gives a solution where all tasks are completed in 8 times steps, so I know the optimal solution is 8. If I give the problem 15 time steps, at 60 minutes it will return a solution with objective value of 8, but return that it only knows it is feasible. These numbers are examples only to provide an idea of the problem. I just want to know how the algorithm works so I can decide the best course of action for my business context, possibly just stopping when I have a solution that meets my needs or is close to optimal.

Help Needed CP / CP-SAT Solver

Source

mlk621

Most helpful comment

Unfortunately, there are no good answers.

Let's state the obvious.

To understand how the solver works, read Search is dead

These are NP-Hard problems. So not tractable in general.
The solver does two things in parallel. Try to find good solutions, and try to improve the lower bound of the objective (let's assume we are minimizing).

Search will terminate itself one of the two conditions are met:

It has explored all the search space
or it has improved the lower bound enough such that it is the same as the best solution found.

this is a bit simplistic as the solver is continuously trying to reduce the search space, either by improving the lower bound, or by learning conflicts, while at the same time exploring the search space.

But it does not answer your question. The answer is I don't know.
Usually, I look at the rate of improvements, and when it slows down, it usually continues to slow down, thus stopping might be a good idea.

But there are counter examples. So my advice is to use a large data set of problems and build your understanding of the problem.

The key notions are the following:

how easy it is to find a solution ? Some problems have very few solutions, so finding improving ones are very rare. In that case, the above heuristics is not a good signal. Some problems like the pure jobshop have lot of solutions if the objective is not very constrained, and much fewer when close to the equality. It all depends.
how is the 'retro-propagation' from the objective to the problem. Usually, makespan objective demonstrates good retro-propagation as they compresses the schedule. On the other hard, early tardy costs have very bad one, because bounding the cost has little effect on the problem. In the MIP world, this roughly translates in the quality of the linear relaxation.

I hope this helps.

lperron on 16 Jun 2020

👍2

All 3 comments

Unfortunately, there are no good answers.

Let's state the obvious.

To understand how the solver works, read Search is dead

Search will terminate itself one of the two conditions are met:

It has explored all the search space
or it has improved the lower bound enough such that it is the same as the best solution found.

But there are counter examples. So my advice is to use a large data set of problems and build your understanding of the problem.

The key notions are the following:

how easy it is to find a solution ? Some problems have very few solutions, so finding improving ones are very rare. In that case, the above heuristics is not a good signal. Some problems like the pure jobshop have lot of solutions if the objective is not very constrained, and much fewer when close to the equality. It all depends.
how is the 'retro-propagation' from the objective to the problem. Usually, makespan objective demonstrates good retro-propagation as they compresses the schedule. On the other hard, early tardy costs have very bad one, because bounding the cost has little effect on the problem. In the MIP world, this roughly translates in the quality of the linear relaxation.

I hope this helps.

lperron on 16 Jun 2020

👍2

I have found for some problem sizes, it will find the optimal with a lower number of time steps given and just return Feasible for a higher number, even though it has returned a solution with the same objective value.

If I understand correctly, you expanded the domains of some decision variables. Something like, expand the timeframe jobs can be scheduled by +/- X time-steps. The result was the same solution but not proven to be optimal. Often when you increase domains in a problem, there will be less propagation (removing bad results) -> more stuff to search -> more time needed to find solutions/prove optimality. This is often referred to as overconstrained (few possible solutions), underconstrained (many possible solutions) and wellconstrained (just right) in CP literature.

Besides trivial problems, practical stop conditions are in my experience determined through a combination of testing things out, the resources at your disposal and the requirements of clients. For example, if you're scheduling something for tomorrow it needs to be done by then - this sets an upper time-limit. Maybe you'd like to run several problems before tomorrow on the same machine - the time-limit needs to be divided by the problems. After you've run enough instances you find that the solution is rarely improved after X time, then you can consider adapting the time-limit to conserve resources.

is there an option to stop searching if the best solution found so far has not improve in X minutes

I haven't done this in the OrTools CP-SAT solver but I imagine you can register a callback upon finding a solution that stores the solution and the time it was found. Then periodically check how much time has passed since the last solution and cancel the search after a specified amount of time.

I can also recommend Handbook of Constraint Programming if you're looking for a more general resource that explain CP concepts and terminology.

CervEdin on 16 Jun 2020

👍1

Thank you both @lperron and @CervEdin !! It is good to at least feel I'm on the right track. My background is in LP & MIP techniques. I have only recently been venturing into Constraint Programming, and found it performs much, much better for my particular problem, so have been trying to find learn the general techniques, etc. I had some across the Search is dead link previously, but had not read it in great detail.

I do know my problem is NP-Hard as just from one set is the Set partitioning constraint set, along with a few others. I was actually surprised for my context set though that I am able to get solutions for the size of problem I need. Of course we'd like to increase it though, so knowing some of these tricks of perhaps "right-sizing" the problem is good.

Luckily my problem has many solutions and a lot of symmetry, so I think that helps me here. As for increasing the time steps translating to increasing the domain, that is exactly true. Basically, I give the problem more time in which to possibly pick to do each task. Good to know I'm on the right track with the approach I was thinking, which is testing some various time limits and deciding the best given what I observe with the various problem setups. I wanted to make sure there weren't some things I was missing!

Thanks!!

mlk621 on 16 Jun 2020

Was this page helpful?

0 / 5 - 0 ratings