Osrm-backend: How to deal with large quadratic matrices?

Created on 25 Aug 2018  路  7Comments  路  Source: Project-OSRM/osrm-backend

I need to calculate large quadratic matrices like 6000 x 6000. I understand that computing uses a lot of memory. What is solution? Is there best practice?

Most helpful comment

@TimMcCauley That's in the ballpark, yes. It'll vary a little bit on the length of the routes in your matrix, but not too much. Note that a single request only uses a single CPU core, so you could in theory split the large request into 4 smaller ones and run them concurrently (you'd need to stitch them together again) to reduce the overall time.

Note that in the example above, the 5000x5000 request takes about 45 seconds on a single core. The 5000x5000 request starts at the uptick around the 1 minute mark, and ends when it drops back down again at around the 1:45 mark.

All 7 comments

I would recommend splitting the computation up into several requests. For example four 3000x3000 or sixteen 1500x1500. That way you can limit the amount of memory needed to process a single request. Otherwise the only solution is to provision the machine with enough memory to do it in one go.

Thanks for your response.
Can you please give me small example or reference of how to divide large quadratic matrices to several smalls?
How can approximately calculate the necessary memory? For example for matrix 1000 x 1000?
Thanks again for your helping.

@ifle I have some code here that does this for the Mapbox Matrix API. The strategy would be very similar for doing it against OSRM, feel free to adapt that code.

That said, I'm able to execute 10000x10000 matrix calls on my laptop with 16GB of RAM just fine - are you operating in a very memory constrained environment?

Thanks for the sample. Very helpful.
We run osrm backend on Azure cluster with 5 Virtual Machines, 16GB of RAM

I did a little memory profiling on some larger matrices.

Looks like you can estimate the required memory for a large table request by multiplying the number of expected cells in the response by approximately 100 bytes.

screen shot 2018-08-29 at 11 59 28 am

Here, I ran 3 1k x 1k requests, and memory usage increased up to 75MB over baseline during the request. For a 5k x 5k request, it was about 2GB of additional memory needed.

@danpat thanks for sharing this; just to make sure I understand this graph correctly: a 5000x5000 matrix would take under 1.5 minutes to compute on a 16GB RAM (+ 4 cores?) machine? Spectacular.

@TimMcCauley That's in the ballpark, yes. It'll vary a little bit on the length of the routes in your matrix, but not too much. Note that a single request only uses a single CPU core, so you could in theory split the large request into 4 smaller ones and run them concurrently (you'd need to stitch them together again) to reduce the overall time.

Note that in the example above, the 5000x5000 request takes about 45 seconds on a single core. The 5000x5000 request starts at the uptick around the 1 minute mark, and ends when it drops back down again at around the 1:45 mark.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

pat841 picture pat841  路  4Comments

stvno picture stvno  路  3Comments

AliKarami picture AliKarami  路  3Comments

grib0 picture grib0  路  4Comments

RajibChanda picture RajibChanda  路  4Comments