Graphene: DataLoader for graphene? Caching and query efficiency

Created on 8 Jun 2016 · 7 Comments · Source: graphql-python/graphene

Dan Schafer told us about DataLoader at React-Europe 16, but its implementation targets NodeJS. Would something like this be possible in graphene?

DataLoader caches loaded objects and, instead of issuing one request after another to the database, waits until it has collected the ids of all the objects you want. You can then run a single select over those ids instead of one select per id.
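To make the difference concrete, here is a minimal sketch using only the standard library's sqlite3 module. The `users` table, its rows, and the helper names `fetch_one_by_one` and `fetch_batched` are made up for illustration; they are not part of graphene or DataLoader.

```python
import sqlite3

# Hypothetical in-memory table, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(1, "ada"), (2, "grace"), (3, "linus")],
)


def fetch_one_by_one(conn, ids):
    """Issue one SELECT per id: len(ids) queries in total."""
    rows = []
    for user_id in ids:
        cursor = conn.execute(
            "SELECT id, name FROM users WHERE id = ?", (user_id,)
        )
        rows.append(cursor.fetchone())  # None when the id does not exist
    return rows


def fetch_batched(conn, ids):
    """Issue a single SELECT ... WHERE id IN (...) for all ids."""
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    query = f"SELECT id, name FROM users WHERE id IN ({placeholders})"
    by_id = {row[0]: row for row in conn.execute(query, list(ids))}
    # Return rows in the order the ids were requested, None for misses.
    return [by_id.get(user_id) for user_id in ids]
```

Both helpers return the same rows in the same order; the batched version just gets there with one query, which is the effect a DataLoader-style batch function is meant to have.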

All 7 comments

I feel that will be dependent on the backend you are using - the way this would work would be rather different for Django than for SQLAlchemy than for GAE. Django certainly provides alternative ways to optimise queries for relational databases.

This would be huge. Node makes it trivial thanks to the event loop and process.nextTick(), which is what allowed DataLoader to be built. Coming from writing Node graphql backends, I am terrified about performance because of the lack of DataLoader, but I'm hoping things like select_related in Django will help minimize this.

graphql-core already has implementations for executing fetches in parallel for things like gevent and asyncio. Many of these systems use an event loop. Maybe we could add batching support to these.
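As a rough idea of what batching on top of an event loop could look like, here is a small asyncio sketch. The `BatchLoader` class, `batch_get_users`, and `demo` are hypothetical names invented for this example; this is not graphql-core's or graphene's API, only an illustration of collecting keys and flushing them in one call.

```python
import asyncio


class BatchLoader:
    """Collects keys requested close together and resolves them in one call."""

    def __init__(self, batch_fn):
        # batch_fn: async callable taking a list of keys and returning
        # a list of values in the same order (an assumption of this sketch).
        self._batch_fn = batch_fn
        self._pending = []      # (key, future) pairs waiting for the next flush
        self._cache = {}        # key -> future, so repeated keys are fetched once
        self._flush_task = None

    def load(self, key):
        if key in self._cache:
            return self._cache[key]
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self._cache[key] = future
        if not self._pending:
            # First key of a new batch: the flush task only starts once the
            # currently running code yields back to the event loop.
            self._flush_task = loop.create_task(self._flush())
        self._pending.append((key, future))
        return future

    async def _flush(self):
        batch, self._pending = self._pending, []
        keys = [key for key, _ in batch]
        try:
            values = await self._batch_fn(keys)
        except Exception as exc:
            for _, future in batch:
                if not future.done():
                    future.set_exception(exc)
            return
        for (_, future), value in zip(batch, values):
            if not future.done():
                future.set_result(value)


async def demo():
    calls = []  # records every batch that reaches the "backend"

    async def batch_get_users(keys):
        calls.append(list(keys))
        return [f"user-{key}" for key in keys]

    loader = BatchLoader(batch_get_users)
    results = await asyncio.gather(
        loader.load(1), loader.load(2), loader.load(1)
    )
    return results, calls


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In `demo`, three loads turn into a single call to `batch_get_users` with keys `[1, 2]`, since the repeated key is served from the loader's cache. A real implementation would also need per-request cache scoping and error handling per key, which this sketch leaves out.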

I've been looking into this too, and it does seem very backend specific to me. SQLAlchemy handles this with eager loading strategies that batch everything into a single query using joined loads and similar:

http://docs.sqlalchemy.org/en/latest/orm/loading_relationships.html#using-loader-strategies-lazy-loading-eager-loading

@nickhudkins / @bigblind How would the event loop help here? You could do it in parallel, but you're still spending the handshake / connection management / data serialization overhead of talking to the database server n times, which in all likelihood would be causing most of the slowness, not to mention spamming your db server with connections, which it may have a cap on. That would in effect mean that even though the client is doing the queries in parallel, since the db server only handles so many at a time, it degrades into being serial-ish.

Is this issue solved now?

Not using graphene anymore, I figured that using NodeJS would be the best solution.
Feel free to close this issue if @makmanalp 's answer is satisfactory.

An implementation of Dataloader is available to use inside Graphene: http://docs.graphene-python.org/en/latest/execution/dataloader/
