Currently OSRM loads (almost) everything into RAM and accesses the memory structures via the Datafacade interface.
We're looking at what it might look like to operate OSRM in a limited-memory environment - reading data directly from disk rather than from in-memory structures. CH is great for this - we only need a small number of reads to find the route.
However, the remaining work that OSRM does is currently not very optimized - unpacking geometries, retrieving names, etc.
We should review the data access patterns when finding routes - cluster related data nearby where possible, and limit round-trips when fetching results.
This would improve RAM-based access as well - higher-speed, RAM-vs-CPU cache is the same kind of thing - data locality will improve performance.
osrm on mobile devices?
@emiltin I'm aiming for an Arduino.....jk!
@danpat i am trying to make 32-bit compilable osrm-. if it is interesting i will create a PR when i will finish, because at the moment some tests crash and i need more time to investigate
@oxidase Awseome, yes, that's on our radar too, I just haven't written up a ticket for it yet. Patches always appreciated!
@danpat the whole planet can be used for routing also with 32bit applications. It is possible by clustering of data with the r-tree in pages of "megabytes" size and make only memory maps with mmap64 for the route related data. It will also improve data locality in memory.
@oxidase data with locality, like the rtree, yeah, that should work just fine.
My biggest concern is the .hsgr file (the contracted graph file) - for large road networks, this file can be >4GB, and there is very poor data locality (traversing the contracted graph will lead to semi-random access to that file). I suspect we're just going to have to suffer the performance cost on that one.
@danpat one way we could do this is to split the contracted graph into a forward and a reverse graph. This will in total need more memory, but each part should be significantly smaller and we could do forward/reverse search in batches to increase locality and not need both parts in memory at the same time.
We should re-investigate the cache performance here. We added node ID renumbering in osrm-partition that should ensure (even for CH) that we mostly hit the lower IDs during a query.
Additionally we can investigate only putting parts of our data on disk by further segmenting the shared memory storage.
Most helpful comment
@danpat i am trying to make 32-bit compilable osrm-. if it is interesting i will create a PR when i will finish, because at the moment some tests crash and i need more time to investigate