Cosmos-sdk: gaiad: opens many files while synching and can exceed standard Linux limits

Created on 27 Jun 2018 · 7 Comments · Source: cosmos/cosmos-sdk

I set up a new node (4 GB RAM, 4 CPUs, 80 GB disk, on DigitalOcean) that syncs with gaia-6002. I ran it from block 0 until around block 410,000, when it crashed with "too many open files". Ports 46657 and 46658 are blocked, so it's not an RPC issue.

It seems that the database opens too many files.

I[06-26|22:21:11.469] Executed block                               module=state height=417269 validTxs=0 invalidTxs=0
I[06-26|22:21:11.683] Committed state                              module=state height=417269 txs=0 appHash=C1A2404F6318651FD9C4452B473EC2DDC8613DFD
I[06-26|22:21:11.683] Recheck txs                                  module=mempool numtxs=1 height=417269
I[06-26|22:21:11.684] Done rechecking txs                          module=mempool
I[06-26|22:21:11.692] Indexed block                                module=txindex height=417269
I[06-26|22:21:11.701] Stopping MConnection                         module=p2p peer=91.126.244.116:52306 impl=MConn{91.126.244.116:52306}
E[06-26|22:21:11.702] Stopping peer for error                      module=p2p peer="Peer{MConn{91.126.244.116:52306} 26d255c6901d1ba6dd230865c18b250d3a6d4984 in}" err="Error{`recovered panic in MConnection` (cause: open /root/.gaiad/data/blockstore.db/048960.ldb: too many open files)}"
I[06-26|22:21:11.702] Stopping Peer                                module=p2p peer=91.126.244.116:52306 impl="Peer{MConn{91.126.244.116:52306} 26d255c6901d1ba6dd230865c18b250d3a6d4984 in}"
panic: open /root/.gaiad/data/gaia.db/971996.ldb: too many open files

goroutine 58 [running]:
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tmlibs/db.(*GoLevelDB).Get(0xc42000e390, 0xc469ec0800, 0x37, 0x40, 0x37, 0xc469ec0800, 0xd)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tmlibs/db/go_level_db.go:50 +0x138
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tmlibs/db.(*prefixDB).Get(0xc420c9ffb0, 0xc45b31ba10, 0x2a, 0x30, 0x0, 0x0, 0x0)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tmlibs/db/prefix_db.go:60 +0x190
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*nodeDB).GetNode(0xc4209401e0, 0xc462281d00, 0x14, 0x14, 0x0)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/nodedb.go:77 +0x19d
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).getRightNode(0xc43bbaa4d0, 0xc471ad86e0, 0x20)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:441 +0x73
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc43bbaa4d0, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0x1d, 0x1, 0x14, 0x14)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:163 +0x20f
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc427e45ad0, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0x1d, 0x1, 0x14, 0x14)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:164 +0x24d
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc4398962c0, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0x1d, 0x1, 0x14, 0x20)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:164 +0x24d
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc439896210, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0x1d, 0x1, 0x14, 0x20)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:164 +0x24d
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc447104160, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0x1d, 0xffffffffffffffff, 0x14, 0x20)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:164 +0x24d
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc447104000, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0x1d, 0xffffffffffffffff, 0x14, 0x20)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:161 +0x1b9
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc439f73ef0, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0x1d, 0xffffffffffffffff, 0x14, 0x20)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:161 +0x1b9
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc439f73ce0, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0x1d, 0xffffffffffffffff, 0x14, 0x20)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:161 +0x1b9
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc439f73c30, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0x1d, 0xffffffffffffffff, 0xc4397d0188, 0x5070f8)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:161 +0x1b9
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc439f73b80, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0xc28520, 0x411e49, 0x411e49, 0xc475613f40)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:161 +0x1b9
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Tree).Get64(0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0xc42f52d080, 0x0, 0xc42005c800, 0xc42005c800)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/tree.go:131 +0x5a
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Tree).Get(0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0xc400000008, 0xc441f90680, 0xc427e66bdd, 0xc4397d0278)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/tree.go:123 +0x49
github.com/cosmos/cosmos-sdk/store.(*iavlStore).Get(0xc42ce811b0, 0xc427e66bc0, 0x1d, 0x20, 0x139afe0, 0xc475613d00, 0xc427e66bc0)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/store/iavlstore.go:104 +0x52
github.com/cosmos/cosmos-sdk/store.(*cacheKVStore).Get(0xc475613d80, 0xc427e66bc0, 0x1d, 0x20, 0x0, 0x0, 0x0)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/store/cachekvstore.go:50 +0x14c
github.com/cosmos/cosmos-sdk/store.(*gasKVStore).Get(0xc475613f40, 0xc427e66bc0, 0x1d, 0x20, 0xc427e66bc0, 0x1d, 0x20)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/store/gaskvstore.go:42 +0x8a
github.com/cosmos/cosmos-sdk/x/slashing.Keeper.getValidatorSigningBitArray(0xed86a0, 0xc42109fee0, 0xc420088900, 0xee28e0, 0xc420940050, 0xa, 0xedee80, 0xc42865ec00, 0xc4364ca040, 0x9, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/x/slashing/signing_info.go:33 +0x105
github.com/cosmos/cosmos-sdk/x/slashing.Keeper.handleValidatorSignature(0xed86a0, 0xc42109fee0, 0xc420088900, 0xee28e0, 0xc420940050, 0xa, 0xedee80, 0xc42865ec00, 0xc4364ca040, 0x9, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/x/slashing/keeper.go:70 +0x322
github.com/cosmos/cosmos-sdk/x/slashing.BeginBlocker(0xedee80, 0xc42865ec00, 0xc4364ca040, 0x9, 0xc427e669c0, 0x14, 0x20, 0xc440423a60, 0x9, 0x65df6, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/x/slashing/tick.go:40 +0x5a2
github.com/cosmos/cosmos-sdk/cmd/gaia/app.(*GaiaApp).BeginBlocker(0xc4209137c0, 0xedee80, 0xc42865ec00, 0xc4364ca040, 0x9, 0xc427e669c0, 0x14, 0x20, 0xc440423a60, 0x9, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/cmd/gaia/app/app.go:117 +0xc3
github.com/cosmos/cosmos-sdk/cmd/gaia/app.(*GaiaApp).BeginBlocker-fm(0xedee80, 0xc42865ec00, 0xc4364ca040, 0x9, 0xc427e669c0, 0x14, 0x20, 0xc440423a60, 0x9, 0x65df6, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/cmd/gaia/app/app.go:90 +0xa0
github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).BeginBlock(0xc4200fc0e0, 0xc427e669c0, 0x14, 0x20, 0xc440423a60, 0x9, 0x65df6, 0x5b2fa2d4, 0x0, 0x68f4, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/baseapp/baseapp.go:386 +0x157
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/abci/client.(*localClient).BeginBlockSync(0xc436c449c0, 0xc427e669c0, 0x14, 0x20, 0xc440423a60, 0x9, 0x65df6, 0x5b2fa2d4, 0x0, 0x68f4, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/abci/client/local_client.go:206 +0xab
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/proxy.(*appConnConsensus).BeginBlockSync(0xc460803950, 0xc427e669c0, 0x14, 0x20, 0xc440423a60, 0x9, 0x65df6, 0x5b2fa2d4, 0x0, 0x68f4, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/proxy/app_conn.go:69 +0x78
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/state.execBlockOnProxyApp(0xedf580, 0xc47507bd20, 0xee23a0, 0xc460803950, 0xc4858c2370, 0xc470e45680, 0xee5a00, 0xc462f2d990, 0x1, 0xc45fe50860, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/state/execution.go:190 +0x547
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/state.(*BlockExecutor).ApplyBlock(0xc436c45200, 0xc45430ec20, 0x9, 0x65df5, 0x68f4, 0xc45fe51d20, 0x14, 0x20, 0x1, 0xc45fe50860, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/state/execution.go:76 +0x12f
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/blockchain.(*BlockchainReactor).poolRoutine(0xc43fe8b800)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/blockchain/reactor.go:300 +0x426
created by github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/blockchain.(*BlockchainReactor).OnStart
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/blockchain/reactor.go:117 +0x86

I think we need a way to limit the number of files gaiad opens.
Currently, any user has to raise the system's open-file limit by following a guide such as this one: https://www.tecmint.com/increase-set-open-file-limits-in-linux/

bug

All 7 comments

More descriptive issue title

I fixed this problem by editing /etc/security/limits.conf and adding the following at the end of the file:
*user* hard nofile 500000
*user* soft nofile 450000
This may be more than I need, but it solves the problem.
This link explains the problem:
https://access.redhat.com/solutions/61334
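
The two limits.conf entries set a soft limit and a hard ceiling for the user. A process can also read these values and raise its own soft limit up to the hard limit without extra privileges. The following is a minimal sketch of that idea using Go's standard syscall package on Linux. The function name raiseNoFileLimit is hypothetical, and nothing in this thread says gaiad does this itself.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// raiseNoFileLimit (hypothetical helper) raises the soft RLIMIT_NOFILE of the
// current process to want, capped at the hard limit. It never lowers the
// limit, and it returns the soft limit in effect afterwards.
func raiseNoFileLimit(want uint64) (uint64, error) {
	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return 0, err
	}
	if want > lim.Max {
		want = lim.Max // an unprivileged process cannot exceed the hard limit
	}
	if lim.Cur >= want {
		return lim.Cur, nil // already high enough
	}
	lim.Cur = want
	if err := syscall.Setrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return 0, err
	}
	return lim.Cur, nil
}

func main() {
	// 450000 mirrors the soft value from the limits.conf entry above.
	soft, err := raiseNoFileLimit(450000)
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not adjust open-file limit:", err)
		os.Exit(1)
	}
	fmt.Println("soft open-file limit is now", soft)
}
```

If the hard limit is still low, the call simply settles at that ceiling, so the limits.conf change (or an equivalent setting in the service manager) is still needed to go higher.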

I was looking at https://godoc.org/github.com/syndtr/goleveldb/leveldb/opt#Options and it seems like each LevelDB instance we open might keep as many as 500 files open in its cache. This could explain why we are seeing so many open files.
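
A rough back-of-the-envelope check of that explanation: the log above names at least two LevelDB directories (blockstore.db and gaia.db), so at 500 files each the table files alone could approach a soft limit of 1024 before any sockets or log files are counted. The sketch below only does that arithmetic. The limit of 1024 and the 200 reserved descriptors are assumptions for illustration, and the helper names are made up.

```go
package main

import "fmt"

const (
	cachePerDB = 500  // per-instance figure quoted in the comment above
	softLimit  = 1024 // assumed soft nofile limit; check `ulimit -n` on the host
)

// worstCaseTableFiles returns an upper bound on the table files held open
// when numDBs instances each cache up to perDB open files.
func worstCaseTableFiles(numDBs, perDB int) int {
	return numDBs * perDB
}

// perDBBudget splits the descriptors left after reserving some for sockets,
// logs and other files evenly across numDBs databases.
func perDBBudget(limit, reserved, numDBs int) int {
	if numDBs <= 0 || limit <= reserved {
		return 0
	}
	return (limit - reserved) / numDBs
}

func main() {
	for n := 1; n <= 5; n++ {
		need := worstCaseTableFiles(n, cachePerDB)
		fmt.Printf("%d database(s): up to %d table files, exceeds limit %d: %v\n",
			n, need, softLimit, need > softLimit)
	}
	// Cache size per database that would fit under the assumed limit if
	// 200 descriptors are kept free for everything else.
	fmt.Println("per-database budget:", perDBBudget(softLimit, 200, 2))
}
```

Under these assumptions, either the per-database cache has to shrink or the process limit has to grow, which are the two directions discussed in this thread.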

People may want to be able to specify higher levels of caching for validators etc. and less caching for standard full nodes.

This should go in a Gaia config file (https://github.com/cosmos/cosmos-sdk/issues/1662).

Closing; this is believed to have been fixed upstream. Please reopen if it can be reproduced.
