We should write a new tool, call it tendermint-debug, with some options for collecting debug data for tendermint.
At least two uses come to mind:
tendermint-debug kill takes a snapshot of the latest state by fetching info from /status, /net_info, and /dump_consensus_state, then running kill -6 (SIGABRT) against the node and capturing the goroutine dump. This makes it easy for users to collect everything with a single command when they run into an issue with their node or network. It is the same set of info we always end up asking for anyway. The command can collect all of it, along with the config and the WAL, and package it in a zip file that users can attach when reporting a problem.
tendermint-debug dump --frequency=30s fetches various debug info at a fixed interval (every 30s in this example) and collects it in one place so we can see how it evolves over time. This should also include information from the pprof server, such as the goroutine stacks and heap profile. This replaces #3117.
I would like to tackle this, if you don't mind, @ebuchman.
Should this new tool be included in the tendermint binary (cmd/tendermint) as tendermint debug? I think it should. Otherwise users will be forced to install yet another binary just to debug.
Note that we should also make consensus reactor data more accessible in dumps: https://github.com/tendermint/tendermint/issues/3302#issuecomment-462743194