Node panics and shuts down when the cosmos.tx.v1beta1.Msg/TxsByEvents method is queried over grpc without pagination parameters.
Similarly, querying grpc-gateway against /cosmos/tx/v1beta1/txs endpoint without pagination parameters handles the panic, but still returns a nil-pointer error to clients.
master (reproduced with 1b3ce2c)
make localnet-start
curl "localhost:1317/cosmos/tx/v1beta1/txs?event=message.action%3Dsend"
Error:
{
"jsonrpc": "2.0",
"id": -1,
"error": {
"code": -32603,
"message": "Internal error",
"data": "runtime error: invalid memory address or nil pointer dereference"
}
}%
Running the same query with pagination flags gives a correct response:
curl "localhost:1317/cosmos/tx/v1beta1/txs?event=message.action%3Dsend&pagination.offset=0&pagination.limit=1"
I also was able to reproduce with grpcurl, verifying that this is not a bug in grpc gateway.
grpcurl -d '{"event": "message.action=send"}' -plaintext -proto proto/cosmos/tx/v1beta1/service.proto -import-path proto -import-path third_party/proto localhost:9090 cosmos.tx.v1beta1.Service.GetTxsEvent
causes the node to panic and shut down, with the following stack trace:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x4ea00be]
goroutine 473 [running]:
github.com/cosmos/cosmos-sdk/x/auth/tx.txServer.GetTxsEvent(0x0, 0x0, 0x0, 0x5b57dc0, 0xc0004fb240, 0x0, 0x0, 0x5b37080, 0xc000e800c0, 0x5b42fa0, ...)
github.com/cosmos/cosmos-sdk/x/auth/tx/service.go:52 +0x3e
github.com/cosmos/cosmos-sdk/types/tx._Service_GetTxsEvent_Handler.func1(0x5b36f40, 0xc0003aaed0, 0x5657f80, 0xc000578680, 0x0, 0xc0000cf200, 0x5b36f40, 0xc0003aaed0)
github.com/cosmos/cosmos-sdk/types/tx/service.pb.go:542 +0x89
github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).RegisterGRPCServer.func1(0x5b36f40, 0xc0003aaed0, 0x5657f80, 0xc000578680, 0xc0005786a0, 0xc0005786c0, 0xc000061200, 0x0, 0x58, 0x6c7e7d0)
github.com/cosmos/cosmos-sdk/baseapp/grpcserver.go:63 +0x365
github.com/cosmos/cosmos-sdk/types/tx._Service_GetTxsEvent_Handler(0x55ef0a0, 0xc0000e78c0, 0x5b36f40, 0xc0001777a0, 0xc000e64e40, 0xc000e80270, 0x55ed4a0, 0xc0001777a0, 0x54e3c20, 0x66d35e8)
github.com/cosmos/cosmos-sdk/types/tx/service.pb.go:544 +0x150
github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).RegisterGRPCServer.func2(0x55ef0a0, 0xc0000e78c0, 0x5b36f40, 0xc0001777a0, 0xc000e64e40, 0x0, 0x5b36f40, 0xc0001777a0, 0xc000153200, 0x16)
github.com/cosmos/cosmos-sdk/baseapp/grpcserver.go:77 +0x67
google.golang.org/grpc.(*Server).processUnaryRPC(0xc000e7ac40, 0x5b48b20, 0xc000d1cf00, 0xc000c32200, 0xc00002b020, 0xc0010fa3a0, 0x0, 0x0, 0x0)
google.golang.org/[email protected]/server.go:1210 +0x522
google.golang.org/grpc.(*Server).handleStream(0xc000e7ac40, 0x5b48b20, 0xc000d1cf00, 0xc000c32200, 0x0)
google.golang.org/[email protected]/server.go:1533 +0xd05
google.golang.org/grpc.(*Server).serveStreams.func1.2(0xc000fe2220, 0xc000e7ac40, 0x5b48b20, 0xc000d1cf00, 0xc000c32200)
google.golang.org/[email protected]/server.go:871 +0xa5
created by google.golang.org/grpc.(*Server).serveStreams.func1
google.golang.org/[email protected]/server.go:869 +0x1fd
Meanwhile, the following command returns successfully:
grpcurl -d '{"pagination": {"offset": 0, "limit": 1}, "event": "message.action=send"}' -plaintext -proto proto/cosmos/tx/v1beta1/service.proto -import-path proto -import-path third_party/proto localhost:9090 cosmos.tx.v1beta1.Service.GetTxsEvent
Found while investingating this behiavor for #7281
cc @aleem1314 @anilCSE
As discussed during a Regen internal call, we thought it would be a good idea to add a panic handler on all queries: https://github.com/cosmos/cosmos-sdk/issues/8038
This is a sign that the recent (gRPC, etc...) client-facing code wasn't thoroughly tested enough when merging to master.
This is also a sign that panics are bad. Moreover the panic output is not user friendly and always requires to be caught in a place where we don't have much context.
Yes, but it seems @amaurymartiny is mainly concerned about non-explicit panics (e.g. improperly handled pagination). I would be very concerned if we had _explicit_ panics in client-facing code -- those errors should be bubbled upstream as much as possible.
Most helpful comment
Yes, but it seems @amaurymartiny is mainly concerned about non-explicit
panics (e.g. improperly handled pagination). I would be very concerned if we had _explicit_panics in client-facing code -- those errors should be bubbled upstream as much as possible.