My Parity instance runs with --tracing on --tx-queue-size=20000.
I'd like to watch transactions in the mempool. I can do this with eth_newPendingTransactionFilter, but there is no filter for removed transactions, so I decided to call parity_allTransactions in a loop (parity_pendingTransactions is very slow).
My problem is that parity_allTransactions is slow too. If I run Parity with the default tx queue size (8192), a call usually takes ~0.2-0.3s, but with a 20k mempool a call takes up to 1.5s. Is there any way to improve performance? Or maybe a better way to observe transactions in the mempool? :)
Command for testing:
$ time curl -s -H "Content-Type: application/json" -X POST --data '{"jsonrpc":"2.0","method":"parity_allTransactions","params":[],"id":0}' http://localhost:8545 | jq .result | jq length
18860
real 0m1.456s
user 0m1.269s
sys 0m0.137s
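The polling approach described above boils down to diffing consecutive pool snapshots. Here is a minimal sketch (not from the original discussion): it fetches parity_allTransactions with only the standard library, keeps just the hashes, and computes added/removed sets between polls. The endpoint URL and the helper names are assumptions for illustration.

```python
import json
import urllib.request

def diff_pool(prev_hashes, curr_hashes):
    """Return (added, removed) transaction hashes between two snapshots."""
    added = curr_hashes - prev_hashes
    removed = prev_hashes - curr_hashes
    return added, removed

def fetch_pool_hashes(rpc_url="http://localhost:8545"):
    # Hypothetical fetch: parity_allTransactions returns full transaction
    # objects, so we keep only the hashes to compare snapshots cheaply.
    payload = json.dumps({
        "jsonrpc": "2.0",
        "method": "parity_allTransactions",
        "params": [],
        "id": 0,
    }).encode()
    req = urllib.request.Request(
        rpc_url, data=payload,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)["result"]
    return {tx["hash"] for tx in result}
```

In a loop you would call fetch_pool_hashes, diff against the previous snapshot, and notify on the added/removed sets; this is exactly why a hashes-only RPC would help, since the diff never needs the full transaction bodies.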
Thank you!
Can you tell precisely what you wish to achieve?
I'd like to watch transactions in the mempool
What do you mean exactly?
I'm working on a service that provides an API to Ethereum wallets, and to notify users about new transactions (or removed transactions) I need to watch the mempool. I can proceed with the default mempool size (8192), but on the other hand I'd like a bigger mempool in Parity, because it's a big mess when our Parity node kicks a tx from the mempool while Etherscan still shows that this tx exists.
Initially I used eth_newPendingTransactionFilter, but switched to parity_allTransactions because I want to keep the service's view of the mempool up to date.
What you wish to do sounds like an edge case.
parity_allTransactions taking time with a 20K pool is probably expected.
and Etherscan still shows that this tx exists
This can and will happen even if the time to query the mempool is reduced.
@tomusdrw can you confirm that this order of performance is expected?
Unfortunately it's super heavy to serialize huge pools of transactions, so the performance will most likely stay like that. We could introduce parity_allTransactionsHashes or something similar, so that you can monitor hashes only.
If you only care about incoming transactions, use an eth_subscribe subscription; unfortunately the spec doesn't cover the whole lifetime of a transaction, just importing.
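For reference, the eth_subscribe flow mentioned above looks roughly like this sketch (the WebSocket transport is assumed and omitted; only the message construction and parsing are shown). The function names are illustrative, not part of any client library.

```python
import json

def subscribe_request(request_id=1):
    """Build the JSON-RPC request that subscribes to new pending tx hashes."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_subscribe",
        "params": ["newPendingTransactions"],
    })

def tx_hash_from_notification(message):
    """Extract the pending tx hash from an eth_subscription notification.

    Returns None for messages that are not subscription notifications
    (e.g. the initial response carrying the subscription id).
    """
    msg = json.loads(message)
    if msg.get("method") != "eth_subscription":
        return None
    return msg["params"]["result"]
```

As the comment above notes, this only covers transaction import; there is no corresponding notification when a transaction is dropped from the pool.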
That said it would be fairly easy to implement parity_watchPendingTransactions subscription, which would emit events for every part of transaction lifetime (we already have pool listener that has that info).
TL;DR:
- parity_allTransactionHashes RPC method
- parity_watchPendingTransactions Pub-Sub

This can and will happen even if the time to query the mempool is reduced.
Yes, but if I increase the pool size, the probability that my service is missing transactions that Etherscan still has will be lower. But I don't think I'll switch to a bigger mempool in the current situation.
@tomusdrw maybe I'm missing something, but shouldn't serialization be fast? The 20k performance degradation compared to 8k is not linear (instead of 2.5x, the degradation is about 5x).
I looked at the code a little and saw locks on the pool; could this be the reason? (Sorry, I have zero experience with Rust, so my assumption may be naive.)
parity_allTransactionHashes / parity_watchPendingTransactions sound interesting and it would be great to try them. If you need help with testing, I'll be glad to help.
I understand that this is not directly comparable to RPC requests to Parity, but for reference, Node.js performance on a file with 20k transactions (24MB) is: parse ~140ms, serialization ~100ms.
@fanatid The performance of all is ~n*log(n) (for every sender's transactions we do a binary search for the correct transaction), so a non-linear increase is expected.
Also the "serialization" in our case does a couple of additional things (like computing the contract address, which is a keccak256 hash), so it's a bit heavier than a plain to/from-JSON conversion (especially in JS, which is super fast at this).
While the performance could be improved (a bit) to make it linear, I think it would still be too slow for what you want, and other RPC methods would suit you better.
@tomusdrw thank you for the clear comment, this explains a lot. Any plans for parity_allTransactionHashes / parity_watchPendingTransactions as a Parity feature? :)
In any case, I think we can close this issue; it's clear now why parity_allTransactions is slow.
@Tbaut maybe let's just rename that issue and make it a request to implement these RPCs? @seunlanlege something you might be interested in?
@tomusdrw definitely
Probably now is the time for parity_watchPendingTransactions :laughing:
@tomusdrw can you clarify: should this be implemented as pub-sub or as a filter? (I've never used pub-sub with geth/parity and have no idea how it works.) As I understand from the code, when you said Parity already has a listener for every part of the transaction lifetime, you meant this? https://github.com/paritytech/parity-ethereum/blob/f8f8bf0feae1da017cd7d4864f5649d4f7f20ed4/parity/rpc_apis.rs#L309
@fanatid yes, a pub-sub would be cool. The listener I meant is here:
https://github.com/paritytech/parity-ethereum/blob/f8f8bf0feae1da017cd7d4864f5649d4f7f20ed4/miner/src/pool/listener.rs#L61
(below you can find an implementation for logger that has all cases handled)
But the code you've linked binds one specific kind of notification to the rest of the code. For pending transactions we'll need something similar, but I think the pub-sub should subscribe directly to the miner/pool stream of notifications (instead of working with callbacks like here).
The event structure could be something like this (HashMap&lt;TxHash, Status&gt;):
{
  "jsonrpc": "2.0",
  "result": {
    "0x..1": "added",
    "0x..2": "dropped"
  }
}
I think emitting hashes should be sufficient; emitting more might generate too much overhead, and with hashes it's easy to fetch the whole transaction with another RPC call.
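On the client side, consuming the event payload proposed above could look like the following sketch: the "result" field is a map of transaction hash to status, which we group by status so "added" and "dropped" hashes can be handled separately. The status names are taken from the example; the real set of lifecycle statuses was not finalized in this thread.

```python
from collections import defaultdict

def group_by_status(event_result):
    """Group a {tx_hash: status} event map into {status: [tx_hash, ...]}."""
    grouped = defaultdict(list)
    for tx_hash, status in event_result.items():
        grouped[status].append(tx_hash)
    return dict(grouped)
```

A wallet service could then fetch full transaction bodies (e.g. via eth_getTransactionByHash) only for the "added" bucket, which is exactly the overhead saving the hash-only proposal is after.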
@tomusdrw I think I've figured out how eth_subscribe works for newPendingTransactions. I also checked parity_subscribe and found that the underlying method is called every 1000ms.
So, a question regarding parity_watchPendingTransactions: it's not fully clear to me how this should work. Should we make a call like:
{ "jsonrpc": "2.0", "method": "parity_watchTransactionPool", "params": [], "id": 1 }
and send notifications like:
{ "jsonrpc": "2.0", "method": "parity_watchTransactionPool", "params": { "hash": "0x...", "action": "added/dropped/etc" } }
or should we do this in some other way?
@fanatid parity_subscribe is just a generalized way to subscribe to any RPC method. As you've noticed, it's basically polling under the hood (not on the client side, but on the server side).
The proper way to do it is to have an implementation similar to eth_subscribe, i.e. some component in the miner should expose a Stream of transaction pool events, and that stream should be pushed to the client. The stream should be populated via events in the transaction pool listener.
I think it's done now with #10558