Geth version: Geth/v1.8.3
OS & Version: Ubuntu 16.04
My block processing time is typically on the order of 250ms, but I have noticed occasional extended periods (sometimes lasting more than an hour) during which this time increases drastically, sometimes averaging 2-6 seconds on my machine. This almost always happens immediately after a log message indicating that the in-memory state trie has been persisted to disk.
If I understand it correctly, this makes sense: that part of the state trie is no longer located in RAM, and is therefore slower to retrieve when verifying a block. Is there a way to avoid this slowdown in processing time? Could it be avoided by always making sure that the cache retains a substantial portion of the recent state, rather than purging too much of it when it is persisted to disk?
The extra overhead may be introduced by leveldb compaction.
Geth persists part of the state data to disk according to certain rules. In other words, geth accumulates the state data generated over the past several blocks in memory, and batch-writes a portion of it at certain moments.
As the overall database grows, leveldb compaction occurs more and more frequently. When the compaction burden is heavy, normal database writes are blocked, which results in longer block processing times.
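To make the accumulate-then-batch-write pattern concrete, here is a minimal sketch using goleveldb (the leveldb library geth builds on) against a standalone database. The path, key names, and batch size are made up for illustration; this is not geth's actual write path.

```go
package main

import (
	"fmt"
	"log"

	"github.com/syndtr/goleveldb/leveldb"
)

func main() {
	// Open (or create) a standalone leveldb database. Geth's chain data
	// lives in a similar leveldb instance under its data directory.
	db, err := leveldb.OpenFile("/tmp/batch-demo", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Accumulate a number of key/value pairs in memory first, loosely
	// analogous to geth holding recent trie nodes in its in-memory cache.
	batch := new(leveldb.Batch)
	for i := 0; i < 1000; i++ {
		key := []byte(fmt.Sprintf("node-%04d", i))
		batch.Put(key, []byte("some trie node data"))
	}

	// Write the whole batch in one go. Large, infrequent batch writes are
	// what eventually push leveldb into background compaction.
	if err := db.Write(batch, nil); err != nil {
		log.Fatal(err)
	}
}
```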
OP's intuition is most probably correct on this one. When we flush the cache on mainnet, we push out about 256MB worth of trie data to disk. However, probably a lot of that will be read back in for the next blocks.
A good optimization would be to integrate some form of LRU cache and avoid flushing out everything, instead keeping the recently accessed nodes in memory. It's not a trivial thing to implement though, as flushing the data destroys the internal reference counters used by the garbage collector.
@rjl493456442 You are also right that compaction might influence it, but if we were to keep some of the flushed data in memory, then compaction would have less of an impact.
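A rough sketch of the retention idea above, using the hashicorp/golang-lru package that geth already depends on. The nodeCache type and its methods are hypothetical and only illustrate keeping recently touched nodes across a flush; they are not the actual trie.Database implementation.

```go
package main

import (
	"fmt"

	lru "github.com/hashicorp/golang-lru"
)

// nodeCache keeps the most recently accessed trie nodes in memory so that a
// flush to disk does not force them to be re-read immediately afterwards.
// Hypothetical sketch only.
type nodeCache struct {
	recent *lru.Cache // recently accessed nodes, retained across a flush
}

func newNodeCache(size int) (*nodeCache, error) {
	c, err := lru.New(size)
	if err != nil {
		return nil, err
	}
	return &nodeCache{recent: c}, nil
}

// touch records a node access so the node is kept around on the next flush.
func (nc *nodeCache) touch(hash string, blob []byte) {
	nc.recent.Add(hash, blob)
}

// get returns a node from the retained set, if it is still cached.
func (nc *nodeCache) get(hash string) ([]byte, bool) {
	if v, ok := nc.recent.Get(hash); ok {
		return v.([]byte), true
	}
	return nil, false
}

func main() {
	nc, _ := newNodeCache(4)
	nc.touch("0xabc", []byte("node data"))
	if blob, ok := nc.get("0xabc"); ok {
		fmt.Printf("still cached after flush: %s\n", blob)
	}
}
```

The hard part mentioned above is not the LRU itself but keeping the garbage collector's reference counters consistent when only part of the cache is flushed.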
I can look at some options for implementing this.
@rjl493456442 Is there a way to manually turn compaction on and off? I only see API methods to initiate a compaction.
@maxgillett Unfortunately, I think it is difficult for us to adjust leveldb's compaction strategy to avoid the overhead.
Leveldb performs compaction for the following reasons:
1. To remove redundant data. Leveldb is a typical LSM-tree implementation, so it keeps every version of an entry. To avoid wasting disk space, it cleans out the redundant versions during compaction.
2. To balance the difference between read and write speed. In leveldb, writing an entry is very fast, involving only an O(log n) in-memory insert and sequential file writes, while reads are much more expensive, especially when the write rate and the amount of data are large. Leveldb balances the two by merging the level-0 files and by slowing down, or even pausing, writes depending on the state of compaction.
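For reference on the "API methods to initiate a compaction" mentioned earlier: goleveldb exposes a manual CompactRange call. The sketch below runs it against a standalone database, since geth only wraps the same handle internally; the path and the write loop are illustrative only.

```go
package main

import (
	"log"

	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/util"
)

func main() {
	db, err := leveldb.OpenFile("/tmp/compact-demo", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Overwrite the same key repeatedly: leveldb keeps the stale versions
	// on disk until a compaction merges the tables and drops them.
	for i := 0; i < 10000; i++ {
		if err := db.Put([]byte("key"), []byte{byte(i)}, nil); err != nil {
			log.Fatal(err)
		}
	}

	// Manually compact the whole key range. An empty util.Range means the
	// entire database.
	if err := db.CompactRange(util.Range{}); err != nil {
		log.Fatal(err)
	}
}
```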
As for the compaction trigger, you can change the trigger configuration to postpone compaction, but you cannot avoid it.
Either way, the overhead of compaction is unavoidable for LSM-tree style databases.
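As a rough illustration of the trigger configuration mentioned above, these are the relevant options on a raw goleveldb handle. The values below are arbitrary examples, not recommendations, and geth would need to pass such options through its own database wrapper.

```go
package main

import (
	"log"

	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/opt"
)

func main() {
	// Raising these thresholds postpones level-0 compaction and the write
	// slowdown/pause it causes, at the cost of more stale data on disk.
	options := &opt.Options{
		CompactionL0Trigger:    8,  // goleveldb default 4: start compacting level 0 later
		WriteL0SlowdownTrigger: 16, // goleveldb default 8: throttle writes later
		WriteL0PauseTrigger:    24, // goleveldb default 12: pause writes later
	}

	db, err := leveldb.OpenFile("/tmp/tuned-demo", options)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
}
```

Note that postponing compaction only shifts the cost: the merges still have to happen eventually, so the write stalls become less frequent but potentially longer.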