Currently we have RPC state_getKeys which returns all keys with a same prefix, this may cause trouble for example trying to iterate all the account free balances.
Can we have state_nextKey that is basically sp_io::storage::next_key so SDK can control the iteration process.
And maybe we should eventually remove state_getKeys because it can be a trouble maker.
I'm not sure this is the right way, before we had huge latency problems because the ui needed to query each key. Now you are saying that query all at once is a problem? Why? Because the result set could be to big?
I think we should not introduce getNextKey. We should make getKeys iteratable, probably by returning batches of the final result.
Without enforce some limitation on state_getKeys, it can make the node returns every keys in the state, which I think will be an issue later.
Batch return will work, it will basically be state_getNextKeys that takes a key, a number, and return that many number of keys. Then the node can add a max keys to return to ensure the RPC method doesn't take forever to return. This plays nicely with paging because the last key of previous call will be the input of next call. Otherwise you will need a way to perform paging for getKeys.
With get keys, the UI have to wait for the node to return all the keys and then start displaying them, with getNextKey, the UI can start showing things much faster.
~I'm with @bkchr here,~ the idea of state_getNextKeys is quite nice, ~but it seems that the only way we can implement that is to iterate over all past keys to find the one we should start sending, which makes the whole method quite inefficient (i.e. The further you go, the slower the RPC will respond).~
Actually, NVM me. Looking at the PR I realised it's definitely possible to seek to a key in the trie.
@bkchr do you have an idea how the API could look like? We could have additional optional parameter from: Option<Key>, but it seems to me that a separate RPC method state_getKeysAfter()/state_getNextKeys() seems like a good idea. I'd consider deprecating state_geyKeys as indeed it may kill the node if a too large query is sent.
Can we agree a RPC interface before I start working on #4718 again?
My proposal is:
Change arguments of state_getKeys to key_prefix: StroageKey, hash: Option<Hash>, start_key: Option<StorageKey>, count: Option<u32>
key_prefixstart_key is None, key_prefix will be usedcount keys will be returned, None means no limit, later maybe we want to make it default to some sensible value and enforce a max value, but no need to happen in this PRI think state_getNextKey is still a useful primitive method, but let me know if I should keep this or not. I see no downside of having this other than more code to be maintained and documented.
@bkchr @tomusdrw
@xlc I'd keep only one method long term. I think the best option would be to:
state_getKeysPaged(prefix: StorageKey, count: u32, start_key: Option<StorageKey>, hash: Option<Hash>), reasoning for parameters order: count should be mandatory and we should return an error on big values (say above 10k), hash should stay as the last argument to be aligned with other methods.state_getNextKey to avoid endorsing iteration one-by-one. The same can be done via new state_getKeysPaged if really needed.state_getKeys, implement it using the new method and keep for couple of versions, then remove entirely.I'm not sold on state_getKeysPaged, but don't have a better idea for now. After we remove state_getKeys we can probably go back to the old name.
Sounds good, however I also don't like the idea of state_getKeysPaged. Ideally the RPC could be some stream that only returns a new page when being "polled". I think currently the jsonrpc-crate doesn't support Streams and I'm not sure if that even is feasible.
This isn't subscripting future events, so server can only return data upon request. Then the request is basically just calling state_getKeysPaged. I can't think a better way. Also it is good to keep the server as stateless as possible (i.e. do not store paging details on server side), so load balancing won't cause any troubles.
@bkchr Yeah, it's more a limitation of a stateless JSON-RPC protocol that we are using, we already bended the rules a bit to support subscriptions :)
I agree with @xlc that given current way we use the protocol (request-response HTTP especially) it's best to keep it stateless and within the bounds of what's currently possible and what users are used to.
Most helpful comment
@xlc I'd keep only one method long term. I think the best option would be to:
state_getKeysPaged(prefix: StorageKey, count: u32, start_key: Option<StorageKey>, hash: Option<Hash>), reasoning for parameters order:countshould be mandatory and we should return an error on big values (say above 10k),hashshould stay as the last argument to be aligned with other methods.state_getNextKeyto avoid endorsing iteration one-by-one. The same can be done via newstate_getKeysPagedif really needed.state_getKeys, implement it using the new method and keep for couple of versions, then remove entirely.I'm not sold on
state_getKeysPaged, but don't have a better idea for now. After we removestate_getKeyswe can probably go back to the old name.