This feature request is mostly a performance optimisation. I don't know how CH scales with distance. Are shorter paths faster than longer paths?
With the table service, I'm running 1 source x 10000 destinations to sort the destinations by travel time ascending. I want to disregard all destinations that are further than x minutes away. Of course, I can do this with my own code. However, it seems rather wasteful since usually only 50 / 10000 will be in a "reasonable" travel time.
The purpose of CH is precisely to reduce computing times even for long-distance routes (whether for a route or table request). So the question is mostly how table requests scale based on the number of points, rather than scaling based on the underlying route distances.
Not providing all your destinations in input definitely makes sense if most of them are "too far anyway". One option would be to only submit the good candidates with a filtering on your side based on crow-fly distance. This would mean using an upper bound for travel speed and your own estimation of what is a "reasonable" travel time.
Thank you for answering! Does the table service scale linearally with the number of points?
Sadly, using the crow-fly distance as a heuristic fails on large bodies of water, elevation loops, etc. Any other suggestions?
On a side note, isn't crow-fly distance filtering the purpose of radiuses? If it is, does it improve performance? If not, what is the purpose of that request option?
Does the table service scale linearally with the number of points?
I don't know about the exact complexity but it's definitely more than linear. If you can, you'd probably be better off removing outlier destinations.
Sadly, using the crow-fly distance as a heuristic fails on large bodies of water, elevation loops
Sure, but I don't mean using this as a real estimate of travel time, but to provide a lower bound for travel time in order to filter out some destinations.
I want to disregard all destinations that are further than x minutes away
Say v_max is the max speed used on your profile and you want to rule out destinations that are further than T (your estimate of a "realistic" travel time). Then all destinations that have a crow-fly distance over v_max x T are guaranteed to require more than T in travel time, so you can rule them out.
I'll also note that it'd be worth simply experimenting. I've done quite a bit of benchmarking on various table sizes, and I can say that route calculation doesn't really dominate the computation time until the matrices are quite large (like 10000x10000). Prior to that, the constant factor for each location (coordinate lookup in the spatial index) tend to dominate request times. A 10k x 10k table request is approximately 100,000,000 routes calculated, and it does that in about 8 seconds on my laptop. For a 1x10000 matrix, that's only, well, 10,000 routes calculated, so you can expect the route calculation part to only really contribute about 0.0008 seconds to the total response time. Note that OSRM still has to do spatial lookups on the 10,000 coordinates, so that factor will likely dominate.
CH makes most path calculations extremely cheap (at most a few hundred edges explored even for cross-country routes). If it were me, I probably won't bother to pre-optimize this step in a system until it proved to be a major point of performance loss.
The same can't be said for other routing algorithms (A*, etc) where route calculations can get really expensive - but CH turns out to be insanely fast most of the time.
Very interesting! If my destinations rarely change, is it possible to amortize the long/lat lookups?
Yes. Take a read of the docs and look at the hints= parameter. In the API response, in the sources= and destinations= objects, will be a hint property. If you pass this value back in using the hints=<hint1>;<hint2>;<hint3> parameter, it'll skip the spatial lookup step and just use the result you supply. Note that the hints are designed to be opaque, and are only valid if retrieved from OSRM using exactly the same datafiles (i.e. if you regenerate your map, your hints are invalidated).
Ah so that's what the hints are for! Thank you very much :D