Osrm-backend: Cache-friendly data access during routing

Created on 3 Jun 2016  路  8Comments  路  Source: Project-OSRM/osrm-backend

Currently OSRM loads (almost) everything into RAM and accesses the memory structures via the Datafacade interface.

We're looking at what it might look like to operate OSRM in a limited-memory environment - reading data directly from disk rather than from in-memory structures. CH is great for this - we only need a small number of reads to find the route.

However, the remaining work that OSRM does is currently not very optimized - unpacking geometries, retrieving names, etc.

We should review the data access patterns when finding routes - cluster related data nearby where possible, and limit round-trips when fetching results.

This would improve RAM-based access as well - higher-speed, RAM-vs-CPU cache is the same kind of thing - data locality will improve performance.

Most helpful comment

@danpat i am trying to make 32-bit compilable osrm-. if it is interesting i will create a PR when i will finish, because at the moment some tests crash and i need more time to investigate

All 8 comments

osrm on mobile devices?

@emiltin I'm aiming for an Arduino.....jk!

@danpat i am trying to make 32-bit compilable osrm-. if it is interesting i will create a PR when i will finish, because at the moment some tests crash and i need more time to investigate

@oxidase Awseome, yes, that's on our radar too, I just haven't written up a ticket for it yet. Patches always appreciated!

@danpat the whole planet can be used for routing also with 32bit applications. It is possible by clustering of data with the r-tree in pages of "megabytes" size and make only memory maps with mmap64 for the route related data. It will also improve data locality in memory.

@oxidase data with locality, like the rtree, yeah, that should work just fine.

My biggest concern is the .hsgr file (the contracted graph file) - for large road networks, this file can be >4GB, and there is very poor data locality (traversing the contracted graph will lead to semi-random access to that file). I suspect we're just going to have to suffer the performance cost on that one.

@danpat one way we could do this is to split the contracted graph into a forward and a reverse graph. This will in total need more memory, but each part should be significantly smaller and we could do forward/reverse search in batches to increase locality and not need both parts in memory at the same time.

We should re-investigate the cache performance here. We added node ID renumbering in osrm-partition that should ensure (even for CH) that we mostly hit the lower IDs during a query.

Additionally we can investigate only putting parts of our data on disk by further segmenting the shared memory storage.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

MouadSb picture MouadSb  路  3Comments

ltsstar picture ltsstar  路  5Comments

JamesLawrenceGSI picture JamesLawrenceGSI  路  3Comments

koussaimb picture koussaimb  路  4Comments

grib0 picture grib0  路  4Comments