While testing to send a MsgConnectionOpenInit transaction to a Cosmos SDK chain (v0.40.0-rc0), I've encountered an Internal Error with the RPC HTTP Handler and potentially might cause additional errors.
Relayer (Stargate-4)
Tendermint v0.34.0-rc4
Cosmos SDK v0.40.0-rc0
dev-env scriptpublic_key property of a SignerInfo to None (which I believe might set it to an empty string when decoded in Cosmos). let signer_info = SignerInfo {
public_key: None,
mode_info: mode,
sequence: 0,
};
The protobuf file where SignerInfo is says the public_key is optional so that's why I set it to None. Other frameworks might set it to a null or empty value.
message SignerInfo {
// public_key is the public key of the signer. It is optional for accounts
// that already exist in state. If unset, the verifier can use the required \
// signer address for this position and lookup the public key.
google.protobuf.Any public_key = 1;
// mode_info describes the signing mode of the signer and is a nested
// structure to support nested multisig pubkey's
ModeInfo mode_info = 2;
// sequence is the sequence of the account, which describes the
// number of committed transactions signed by a given address. It is used to
// prevent replay attacks.
uint64 sequence = 3;
}
broadcast_tx_sync to the chain. Probably you can reproduce the same using this command below (the tx hash is for a MsgConnectionOpenInit)curl localhost:26657/broadcast_tx_sync?tx=0x0a9f010a9c010a2d2f6962632e636f72652e636f6e6e656374696f6e2e76312e4d7367436f6e6e656374696f6e4f70656e496e6974126b0a0d6962637a65726f636c69656e74120b6962637a65726f636f6e6e1a1c0a0c6962636f6e65636c69656e74120a6962636f6e65636f6e6e1a002205312e302e302a283235454635364341373935313335453430393336384536444238313130463232413442453035433212080a0612040a0208011a40ac3342993e25da936cddc7be3d8f603ca6e9661518d536d0c482e18a0154aa096e438a6b9bcadfcfc2f0d689dccaf55b96399d67a8361b70f5da13091e2f929b
Internal Error message. Internal error: runtime error: invalid memory address or nil pointer dereference (code: -32603)
E[2020-10-16|16:57:54.538] Panic in RPC HTTP handler module=rpc-server err="runtime error: invalid memory address or nil pointer dereference" stack="goroutine 1086398 [running]:\nruntime/debug.Stack(0x1890fc0, 0x1880780, 0x2941c40)\n\truntime/debug/stack.go:24 +0x9f\ngithub.com/tendermint/tendermint/rpc/jsonrpc/server.RecoverAndLogHandler.func1.2(0xc006312a20, 0x1ebd880, 0xc0028dee80, 0xbfdaa034a00d6792, 0x12b18964a9e4, 0x29e0ae0, 0xc004b94100)\n\tgithub.com/tendermint/[email protected]/rpc/jsonrpc/server/http_server.go:185 +0x175\npanic(0x1880780, 0x2941c40)\n\truntime/panic.go:969 +0x175\ngithub.com/cosmos/cosmos-sdk/codec/types.(*interfaceRegistry).UnpackAny(0xc0000c2200, 0x0, 0x17b65c0, 0xc006727340, 0xc0028e0f60, 0x4113e5)\n\tgithub.com/cosmos/[email protected]/codec/types/interface_registry.go:149 +0x3a\ngithub.com/cosmos/cosmos-sdk/types/tx.(*SignerInfo).UnpackInterfaces(...)\n\tgithub.com/cosmos/[email protected]/types/tx/types.go:163\ngithub.com/cosmos/cosmos-sdk/types/tx.(*AuthInfo).UnpackInterfaces(0xc006312b40, 0x1e822e0, 0xc0000c2200, 0x7f08901a2a38, 0xc006312b40)\n\tgithub.com/cosmos/[email protected]/types/tx/types.go:153 +0xa9\ngithub.com/cosmos/cosmos-sdk/codec/types.UnpackInterfaces(...)\n\tgithub.com/cosmos/[email protected]/codec/types/interface_registry.go:205\ngithub.com/cosmos/cosmos-sdk/codec.(*ProtoCodec).UnmarshalBinaryBare(0xc0000c2210, 0xc006fe3230, 0x8, 0x8, 0x1ecb720, 0xc006312b40, 0x0, 0x0)\n\tgithub.com/cosmos/[email protected]/codec/proto_codec.go:70 +0x114\ngithub.com/cosmos/cosmos-sdk/x/auth/tx.DefaultTxDecoder.func1(0xc000278b40, 0xee, 0x1e0, 0x2d2846dfb9e8e46, 0x78a5636f748f82ee, 0x8cc7020884c87814, 0xbad52375509a6be6)\n\tgithub.com/cosmos/[email protected]/x/auth/tx/decoder.go:48 +0x274\ngithub.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).CheckTx(0xc000583500, 0xc000278b40, 0xee, 0x1e0, 0xc000000000, 0x0, 0x0, 0x0, 0x0, 0x0, ...)\n\tgithub.com/cosmos/[email protected]/baseapp/abci.go:216 +0x197\ngithub.com/tendermint/tendermint/abci/client.(*localClient).CheckTxAsync(0xc0010391a0, 0xc000278b40, 0xee, 0x1e0, 0xc000000000, 0x0)\n\tgithub.com/tendermint/[email protected]/abci/client/local_client.go:98 +0xdc\ngithub.com/tendermint/tendermint/proxy.(*appConnMempool).CheckTxAsync(0xc0005deaf0, 0xc000278b40, 0xee, 0x1e0, 0x0, 0x4133b0)\n\tgithub.com/tendermint/[email protected]/proxy/app_conn.go:126 +0x59\ngithub.com/tendermint/tendermint/mempool.(*CListMempool).CheckTx(0xc0010549c0, 0xc000278b40, 0xee, 0x1e0, 0xc006727310, 0x17d0000, 0x0, 0x0, 0x0, 0x0)\n\tgithub.com/tendermint/[email protected]/mempool/clist_mempool.go:288 +0x477\ngithub.com/tendermint/tendermint/rpc/core.BroadcastTxSync(0xc006312a60, 0xc000278b40, 0xee, 0x1e0, 0x0, 0x0, 0x0)\n\tgithub.com/tendermint/[email protected]/rpc/core/mempool.go:36 +0xd3\nreflect.Value.call(0x185cb40, 0x1cd7498, 0x13, 0x1a8a19b, 0x4, 0xc002e18f00, 0x2, 0x2, 0xc002e18f18, 0xc006312aa0, ...)\n\treflect/value.go:475 +0x8c7\nreflect.Value.Call(0x185cb40, 0x1cd7498, 0x13, 0xc002e18f00, 0x2, 0x2, 0x1, 0x2, 0x1cd9500)\n\treflect/value.go:336 +0xb9\ngithub.com/tendermint/tendermint/rpc/jsonrpc/server.makeHTTPHandler.func2(0x1eb6ac0, 0xc006312a20, 0xc004b94100)\n\tgithub.com/tendermint/[email protected]/rpc/jsonrpc/server/http_uri_handler.go:56 +0x439\nnet/http.HandlerFunc.ServeHTTP(0xc002deacf0, 0x1eb6ac0, 0xc006312a20, 0xc004b94100)\n\tnet/http/server.go:2042 +0x44\nnet/http.(*ServeMux).ServeHTTP(0xc000698d40, 0x1eb6ac0, 0xc006312a20, 0xc004b94100)\n\tnet/http/server.go:2417 +0x1ad\ngithub.com/tendermint/tendermint/rpc/jsonrpc/server.maxBytesHandler.ServeHTTP(0x1e842a0, 0xc000698d40, 0xf4240, 0x1eb6ac0, 0xc006312a20, 0xc004b94100)\n\tgithub.com/tendermint/[email protected]/rpc/jsonrpc/server/http_server.go:234 +0xd4\ngithub.com/tendermint/tendermint/rpc/jsonrpc/server.RecoverAndLogHandler.func1(0x1eb7680, 0xc0006ae000, 0xc004b94100)\n\tgithub.com/tendermint/[email protected]/rpc/jsonrpc/server/http_server.go:207 +0x39a\nnet/http.HandlerFunc.ServeHTTP(0xc002788cc0, 0x1eb7680, 0xc0006ae000, 0xc004b94100)\n\tnet/http/server.go:2042 +0x44\nnet/http.serverHandler.ServeHTTP(0xc00015f500, 0x1eb7680, 0xc0006ae000, 0xc004b94100)\n\tnet/http/server.go:2843 +0xa3\nnet/http.(*conn).serve(0xc003a7d5e0, 0x1ebd340, 0xc004b19f80)\n\tnet/http/server.go:1925 +0x8ad\ncreated by net/http.(*Server).Serve\n\tnet/http/server.go:2969 +0x36c\n"
if any.TypeUrl == "" {
// if TypeUrl is empty return nil because without it we can't actually unpack anything
return nil
}
{
"jsonrpc": "2.0",
"id": -1,
"error": {
"code": -32603,
"message": "Internal error",
"data": "tx already exists in cache"
}
}
My impression is that sending this through the RPC it triggers a panic which is caught by Tendermint's RPC panic handler and so doesn't cause any real problems. However if a malicious proposer included this tx directly in a block I think it would crash all the nodes. I would suspect there's similar such bugs to this lurking (effectively empty proto fields deserializing to nil and not being checked before access) so we should probably try to suss them all out.
Actually the SDK has a panic handler doesnt it that would catch a panic like this and just cause the tx to be invalid, but wouldn't crash anything ?
Yes, the SDK has a recover handler during tx/message execution.
This is happening in TxDecoder. Maybe that doesn't have a panic handler??
So two things to do:
UnpackAny should properly handle the case where *Any is nil gracefullyWe can also address the call to UnpackAny with nil for public key, but the above change to UnpackAny should allow that to be okay generally.
/cc @amaurymartiny @blushi
CheckTx and DeliverTx decode the tx before calling BaseApp#runTx -- I don't think that makes the most sense and the function signature of BaseApp#runTx is weird.
So I propose we move the tx decoding to BaseApp#runTx (after recover block) and we get panic recovery for free.
CheckTxandDeliverTxdecode the tx before callingBaseApp#runTx-- I don't think that makes the most sense and the function signature ofBaseApp#runTxis weird.So I propose we move the tx decoding to
BaseApp#runTx(afterrecoverblock) and we get panic recovery for free.
@alexanderbez The function signature of BaseApp#runTx is indeed weird but it seems like there's a reason for it. In fact, it's used in BaseApp#Check and BaseApp#Deliver methods with decoded tx only.
So if we remove tx sdk.Tx from runTx's arguments, that means that we would need to re-encode it so it then can be decoded again...
That being said, @amaurymartiny noticed BaseApp#Check and BaseApp#Deliver are just used in tests and simulation for now, so maybe that's actually not really an issue.
That being said, @amaurymartiny noticed BaseApp#Check and BaseApp#Deliver are just used in tests and simulation for now, so maybe that's actually not really an issue.
Yes, these are solely used for tests. I don't know how much of a lift it would be, but I think it's the better solution.