The block hash is unique, so the block number can be looked up from the block hash:
var blockHashPrefix = []byte("H") // blockHashPrefix + hash -> num (uint64 big endian)
blockHashPrefix + hash -> num
This also means that the block hash is treated as a unique value. So why is the block number used as part of the LevelDB key? Thanks.
var (
headerPrefix = []byte("h") // headerPrefix + num (uint64 big endian) + hash -> header
tdSuffix = []byte("t") // headerPrefix + num (uint64 big endian) + hash + tdSuffix -> td
numSuffix = []byte("n") // headerPrefix + num (uint64 big endian) + numSuffix -> hash
blockHashPrefix = []byte("H") // blockHashPrefix + hash -> num (uint64 big endian)
bodyPrefix = []byte("b") // bodyPrefix + num (uint64 big endian) + hash -> block body
blockReceiptsPrefix = []byte("r") // blockReceiptsPrefix + num (uint64 big endian) + hash -> block receipts
lookupPrefix = []byte("l") // lookupPrefix + hash -> transaction/receipt lookup metadata
preimagePrefix = "secure-key-" // preimagePrefix + hash -> preimage
)
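As a rough illustration of how the comments in the schema above translate into actual keys, here is a minimal sketch. The helper names (`encodeBlockNumber`, `headerKey`, `canonicalHashKey`, `headerNumberKey`) and the use of a plain `[32]byte` for the hash are assumptions made for this example, not necessarily what the codebase uses.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Prefixes copied from the schema above.
var (
	headerPrefix    = []byte("h")
	numSuffix       = []byte("n")
	blockHashPrefix = []byte("H")
)

// encodeBlockNumber encodes a block number as 8 big-endian bytes, so that
// byte-wise key order matches numeric block order.
func encodeBlockNumber(number uint64) []byte {
	enc := make([]byte, 8)
	binary.BigEndian.PutUint64(enc, number)
	return enc
}

// headerKey builds headerPrefix + num (uint64 big endian) + hash.
func headerKey(number uint64, hash [32]byte) []byte {
	key := append([]byte{}, headerPrefix...)
	key = append(key, encodeBlockNumber(number)...)
	return append(key, hash[:]...)
}

// canonicalHashKey builds headerPrefix + num (uint64 big endian) + numSuffix.
func canonicalHashKey(number uint64) []byte {
	key := append([]byte{}, headerPrefix...)
	key = append(key, encodeBlockNumber(number)...)
	return append(key, numSuffix...)
}

// headerNumberKey builds blockHashPrefix + hash.
func headerNumberKey(hash [32]byte) []byte {
	key := append([]byte{}, blockHashPrefix...)
	return append(key, hash[:]...)
}

func main() {
	var hash [32]byte // placeholder hash, for illustration only
	hash[0] = 0xab

	fmt.Printf("header key:         %x\n", headerKey(258, hash))
	fmt.Printf("canonical hash key: %x\n", canonicalHashKey(258))
	fmt.Printf("hash -> number key: %x\n", headerNumberKey(hash))
}
```

With this layout, a lookup that starts from only a hash first reads the `H` entry to recover the number, then builds the number-prefixed key for the header, body, or receipts.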
Often you want to access consecutive blocks (e.g. N-1, N, N+1). Originally we used only the hashes in the index, but leveldb sorts the database by key, so looking up blocks N-1, N, N+1 entailed pulling data from 3 random locations on disk (i.e. wherever the hashes happened to sort). Furthermore, leveldb doesn't store keys one by one; rather, it groups keys into blocks of 4K of data and stores those.
By prefixing the keys with the block number, the data for neighboring blocks gets stored close together, often in the same database block. Reading one of them already prefetches its close neighbors, so if we are answering range queries, the data will already be in memory for the next query.
It also helps with database compaction, since new blocks get appended to the "end" of a block range, not inserted in the middle. Inserting 1000 new blocks stored by number will make a few database storage slots dirty, whereas inserting the same 1000 blocks by hash would make 1000 (4KB each) database slots dirty.
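The locality argument can be seen without a database at all, by sorting keys the way leveldb's default byte-wise comparator would. The sketch below is only an illustration: `fakeHash` uses SHA-256 of the block number as a stand-in for a real block hash, and `numberKey`, `hashKey`, and `orderByKey` are names invented for this example.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"sort"
)

// fakeHash is a stand-in for a block hash (assumption: any
// uniformly distributed 32-byte value behaves the same for sorting).
func fakeHash(n uint64) [32]byte {
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], n)
	return sha256.Sum256(buf[:])
}

// numberKey builds 'h' + num (uint64 big endian) + hash.
func numberKey(n uint64) []byte {
	h := fakeHash(n)
	var num [8]byte
	binary.BigEndian.PutUint64(num[:], n)
	key := append([]byte{'h'}, num[:]...)
	return append(key, h[:]...)
}

// hashKey builds 'h' + hash, i.e. the older hash-only layout.
func hashKey(n uint64) []byte {
	h := fakeHash(n)
	return append([]byte{'h'}, h[:]...)
}

type entry struct {
	num uint64
	key []byte
}

// orderByKey returns the block numbers from..from+count-1 in the order
// their keys would appear in a byte-wise sorted key space.
func orderByKey(from, count uint64, keyFn func(uint64) []byte) []uint64 {
	entries := make([]entry, 0, count)
	for n := from; n < from+count; n++ {
		entries = append(entries, entry{n, keyFn(n)})
	}
	sort.Slice(entries, func(i, j int) bool {
		return bytes.Compare(entries[i].key, entries[j].key) < 0
	})
	out := make([]uint64, len(entries))
	for i, e := range entries {
		out[i] = e.num
	}
	return out
}

func main() {
	// Number-prefixed keys sort in block order, so neighbors are adjacent.
	fmt.Println("by number:", orderByKey(254, 8, numberKey))
	// Hash-only keys sort by hash, so neighbors end up scattered.
	fmt.Println("by hash:  ", orderByKey(254, 8, hashKey))
}
```

The big-endian encoding matters here: with a little-endian number, block 256 would sort before block 255 and the ordering would break at every byte boundary.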
@karalabe thanks a lot! I found documentation that supports your answer. See here:
Note that the unit of disk transfer and caching is a block. Adjacent keys (according to the database sort order) will usually be placed in the same block. Therefore the application can improve its performance by placing keys that are accessed together near each other and placing infrequently used keys in a separate region of the key space.
For example, suppose we are implementing a simple file system on top of leveldb. The types of entries we might wish to store are:
filename -> permission-bits, length, list of file_block_ids
file_block_id -> data
We might want to prefix filename keys with one letter (say '/') and the file_block_id keys with a different letter (say '0') so that scans over just the metadata do not force us to fetch and cache bulky file contents.
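The file system example from the quoted documentation can be sketched the same way. This is an in-memory illustration only, with no leveldb calls; `metaKey`, `dataKey`, and `scanPrefix` are hypothetical helpers, and the sorted slice stands in for the database's sorted key space.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// metaKey prefixes a filename with '/' so all metadata keys sort together.
func metaKey(filename string) []byte {
	return append([]byte{'/'}, filename...)
}

// dataKey prefixes a file block id with '0' so bulky contents live in a
// separate region of the key space.
func dataKey(blockID string) []byte {
	return append([]byte{'0'}, blockID...)
}

// scanPrefix returns every key that starts with prefix.
// The input must already be sorted in ascending byte order.
func scanPrefix(sorted [][]byte, prefix []byte) [][]byte {
	start := sort.Search(len(sorted), func(i int) bool {
		return bytes.Compare(sorted[i], prefix) >= 0
	})
	var out [][]byte
	for i := start; i < len(sorted) && bytes.HasPrefix(sorted[i], prefix); i++ {
		out = append(out, sorted[i])
	}
	return out
}

func main() {
	keys := [][]byte{
		dataKey("0001"), metaKey("a.txt"), dataKey("0002"), metaKey("b.txt"),
	}
	sort.Slice(keys, func(i, j int) bool {
		return bytes.Compare(keys[i], keys[j]) < 0
	})

	// A metadata-only scan touches one contiguous run of keys and never
	// walks over the file contents.
	for _, k := range scanPrefix(keys, []byte{'/'}) {
		fmt.Printf("%s\n", k)
	}
}
```

The block-number prefix discussed above applies the same idea: the key layout is chosen so that entries read together also sort together.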