Go-ethereum: Why is the block number used as part of the LevelDB key?

Created on 10 Jan 2018 · 3 Comments · Source: ethereum/go-ethereum

The block hash is unique, and the block number can be looked up from the block hash:

var blockHashPrefix     = []byte("H")   // blockHashPrefix + hash -> num (uint64 big endian)

blockHashPrefix + hash -> num

This also means that block hashes are treated as unique values. So why is the block number used as part of the LevelDB key? Thanks.

var (
    headerPrefix        = []byte("h")   // headerPrefix + num (uint64 big endian) + hash -> header
    tdSuffix            = []byte("t")   // headerPrefix + num (uint64 big endian) + hash + tdSuffix -> td
    numSuffix           = []byte("n")   // headerPrefix + num (uint64 big endian) + numSuffix -> hash
    blockHashPrefix     = []byte("H")   // blockHashPrefix + hash -> num (uint64 big endian)
    bodyPrefix          = []byte("b")   // bodyPrefix + num (uint64 big endian) + hash -> block body
    blockReceiptsPrefix = []byte("r")   // blockReceiptsPrefix + num (uint64 big endian) + hash -> block receipts
    lookupPrefix        = []byte("l")   // lookupPrefix + hash -> transaction/receipt lookup metadata
    preimagePrefix      = "secure-key-" // preimagePrefix + hash -> preimage
)
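
To make the schema above concrete, here is a minimal sketch of how such keys could be assembled and how a lookup by hash would go through the hash -> num entry first. The helper names (headerKey, hashToNumberKey, putHeader, getHeaderByHash) and the in-memory map standing in for the database are assumptions for illustration, not go-ethereum's actual code.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Prefixes copied from the schema quoted above.
var (
	headerPrefix    = []byte("h") // headerPrefix + num (uint64 big endian) + hash -> header
	blockHashPrefix = []byte("H") // blockHashPrefix + hash -> num (uint64 big endian)
)

// encodeBlockNumber returns num as 8 big-endian bytes.
func encodeBlockNumber(num uint64) []byte {
	enc := make([]byte, 8)
	binary.BigEndian.PutUint64(enc, num)
	return enc
}

// headerKey builds headerPrefix + num + hash (hypothetical helper).
func headerKey(num uint64, hash [32]byte) []byte {
	key := append([]byte{}, headerPrefix...)
	key = append(key, encodeBlockNumber(num)...)
	return append(key, hash[:]...)
}

// hashToNumberKey builds blockHashPrefix + hash (hypothetical helper).
func hashToNumberKey(hash [32]byte) []byte {
	return append(append([]byte{}, blockHashPrefix...), hash[:]...)
}

// memDB is an in-memory stand-in for the key-value store (assumption).
type memDB map[string][]byte

// putHeader writes both the hash -> num entry and the num+hash -> header entry.
func putHeader(db memDB, num uint64, hash [32]byte, header []byte) {
	db[string(hashToNumberKey(hash))] = encodeBlockNumber(num)
	db[string(headerKey(num, hash))] = header
}

// getHeaderByHash resolves hash -> num first, then reads num+hash -> header.
func getHeaderByHash(db memDB, hash [32]byte) ([]byte, bool) {
	enc, ok := db[string(hashToNumberKey(hash))]
	if !ok || len(enc) != 8 {
		return nil, false
	}
	num := binary.BigEndian.Uint64(enc)
	header, ok := db[string(headerKey(num, hash))]
	return header, ok
}

func main() {
	db := memDB{}
	hash := [32]byte{0xab, 0xcd} // placeholder hash, not a real block
	putHeader(db, 4000000, hash, []byte("header bytes"))

	header, ok := getHeaderByHash(db, hash)
	fmt.Println(string(header), ok) // prints "header bytes true"
}
```

So a lookup by hash alone costs one extra read (hash -> num) before the header can be fetched.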

ysqi

Most helpful comment

Often you want to access consecutive blocks (e.g. N-1, N, N+1). Originally we used only the hashes in the index, but leveldb sorts the database by key, so looking up blocks N-1, N, N+1 entailed pulling data from 3 random locations on disk (i.e. wherever the sorted hashes happened to land). Furthermore, leveldb doesn't store keys one by one; rather, it groups keys into blocks of about 4KB of data and stores those.

By prefixing the keys with the block number, the data for neighboring blocks gets stored close together, often in the same database block, so reading one of them already prefetches its close neighbors. If we are answering range queries, the data will already be in memory for the next query.
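
The ordering effect can be checked with a small sketch. It sorts keys byte-wise, the way a key-sorted store would, and prints the resulting order of block numbers for both key layouts. The function names are made up for illustration, and fakeHash (a SHA-256 of the block number) is only a stand-in for a real block hash.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"sort"
)

// fakeHash stands in for a real block hash (assumption: SHA-256 of the number).
func fakeHash(num uint64) [32]byte {
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], num)
	return sha256.Sum256(buf[:])
}

// numberedKey builds "h" + num (uint64 big endian) + hash.
func numberedKey(num uint64) []byte {
	var enc [8]byte
	binary.BigEndian.PutUint64(enc[:], num)
	h := fakeHash(num)
	key := append([]byte{'h'}, enc[:]...)
	return append(key, h[:]...)
}

// hashOnlyKey builds "h" + hash, i.e. the layout without the block number.
func hashOnlyKey(num uint64) []byte {
	h := fakeHash(num)
	return append([]byte{'h'}, h[:]...)
}

// sortedOrder returns the block numbers first..last in the order their keys
// would appear in a store that sorts keys byte-wise.
func sortedOrder(first, last uint64, makeKey func(uint64) []byte) []uint64 {
	type entry struct {
		num uint64
		key []byte
	}
	var entries []entry
	for n := first; n <= last; n++ {
		entries = append(entries, entry{n, makeKey(n)})
	}
	sort.Slice(entries, func(i, j int) bool {
		return bytes.Compare(entries[i].key, entries[j].key) < 0
	})
	order := make([]uint64, len(entries))
	for i, e := range entries {
		order[i] = e.num
	}
	return order
}

func main() {
	fmt.Println("number-prefixed:", sortedOrder(1000, 1007, numberedKey))
	fmt.Println("hash only:      ", sortedOrder(1000, 1007, hashOnlyKey))
}
```

With the number prefix the blocks come out as 1000 through 1007 in order, so neighbors sit next to each other; with hash-only keys the order is effectively shuffled. The big-endian encoding matters here: a little-endian number would not sort numerically under byte-wise comparison.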

karalabe on 10 Jan 2018

👍4

All 3 comments

It also helps with database compaction: new blocks get appended to the "end" of a block range, not inserted in the middle, so inserting 1000 new blocks keyed by number dirties only a few database storage slots, whereas inserting the same 1000 blocks keyed by hash would dirty 1000 (4KB each) database slots.
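
A rough way to picture this is to split a sorted run of existing keys into fixed-size chunks and count how many chunks the new keys would land in. This is a toy model only: keysPerChunk, fakeHash and chunksTouched are invented for this sketch, and it does not reproduce how leveldb actually lays out or compacts its files.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"sort"
)

// keysPerChunk is a toy stand-in for one on-disk storage block (assumption).
const keysPerChunk = 64

// fakeHash stands in for a real block hash (assumption: SHA-256 of the number).
func fakeHash(num uint64) [32]byte {
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], num)
	return sha256.Sum256(buf[:])
}

// numberedKey builds "h" + num (uint64 big endian) + hash.
func numberedKey(num uint64) []byte {
	var enc [8]byte
	binary.BigEndian.PutUint64(enc[:], num)
	h := fakeHash(num)
	key := append([]byte{'h'}, enc[:]...)
	return append(key, h[:]...)
}

// hashOnlyKey builds "h" + hash, i.e. the layout without the block number.
func hashOnlyKey(num uint64) []byte {
	h := fakeHash(num)
	return append([]byte{'h'}, h[:]...)
}

// chunksTouched sorts the keys of blocks [0, existing), then reports how many
// distinct chunks of that sorted run the next `added` blocks would fall into.
func chunksTouched(existing, added uint64, makeKey func(uint64) []byte) int {
	keys := make([][]byte, 0, existing)
	for n := uint64(0); n < existing; n++ {
		keys = append(keys, makeKey(n))
	}
	sort.Slice(keys, func(i, j int) bool {
		return bytes.Compare(keys[i], keys[j]) < 0
	})

	touched := map[int]bool{}
	for n := existing; n < existing+added; n++ {
		k := makeKey(n)
		// Position where k would be inserted to keep the run sorted.
		pos := sort.Search(len(keys), func(i int) bool {
			return bytes.Compare(keys[i], k) >= 0
		})
		touched[pos/keysPerChunk] = true
	}
	return len(touched)
}

func main() {
	fmt.Println("number-prefixed keys:", chunksTouched(100000, 1000, numberedKey))
	fmt.Println("hash-only keys:      ", chunksTouched(100000, 1000, hashOnlyKey))
}
```

In this model every number-prefixed key sorts after all existing ones, so all inserts hit the tail chunk, while the hash-only keys scatter across many chunks of the existing run.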

karalabe on 10 Jan 2018

@karalabe thanks a lot! I found documentation that supports your answer, in the leveldb docs:

Key Layout

Note that the unit of disk transfer and caching is a block. Adjacent keys (according to the database sort order) will usually be placed in the same block. Therefore the application can improve its performance by placing keys that are accessed together near each other and placing infrequently used keys in a separate region of the key space.

For example, suppose we are implementing a simple file system on top of leveldb. The types of entries we might wish to store are:

filename -> permission-bits, length, list of file_block_ids
file_block_id -> data

We might want to prefix filename keys with one letter (say '/') and the file_block_id keys with a different letter (say '0') so that scans over just the metadata do not force us to fetch and cache bulky file contents.

ysqi on 11 Jan 2018
