Cosmos-sdk: Combining data fields with multiple messages is unusable

Created on 9 Jun 2020  Â·  6Comments  Â·  Source: cosmos/cosmos-sdk

Summary of Bug

Currently, if there are multiple messages in one tx and some return a data section, it is impossible to determine which msg returned this.

We see a few places that it is important that messages length-prefix the result.Data:

However, the majority of messages simply return Data: nil, for example x/bank messages. In fact, a quick search did not find any modules that do return non-nil.

In wasmd, we use Data to return the address of a newly created contract. While working on a client parser, we realized it is impossible to correlate the data with the mesage if there is another message with no data in the same tx. eg. [MsgSend, MsgInstantiateContract] and [MsgInstantiateContract, MsgSend] would return the exact same data blob - appending a nil doesn't have any impact.

Version

At least 0.38.x and master (and probably everything for a long time)

Proposal to fix

One approach is to demand that all message handlers return a length-prefixed array, which requires they return eg. []byte{0} instead of nil everywhere. However, I think this is a very hard migration now.

I think the simplest is to do just what is done for combining logs and add the prefix in runMsg.

If we decide in fixed-length two byte prefixes (simple and let's us use more than 256 byte data blobs), we could replace data = append(data, msgResult.Data...) with something like:

    l := len(data)
    buf := make([]byte, l+2)
    binary.BigEndian.PutUint16(buf, uint16(l))
    if l > 0 {
        copy(buf[2:], data)
    }
    data = append(data, buf...)

This will ensure that all fields (even nil) are properly length-prefixed. I doubt many people are properly handling their data blobs now so this shouldn't be too breaking, but it can easily be added to 0.39 (but not 0.38.x)


For Admin Use

  • [ ] Not duplicate issue
  • [ ] Appropriate labels applied
  • [ ] Appropriate contributors tagged
  • [ ] Contributor assigned/self-assigned
baseapp dev-ux

Most helpful comment

&sdk.MsgData{MsgType: msg.Type(), Data: msgResult.Data}

Ah, now I get it. This stores the original type of the message to help make it clearer which message causes which. Kind of like how you use the uint16(i) in the next list.

Fair enough, I am happy with more info on the client side. Not sure if this is extra state bloat, but I'll let someone else argue that.

It is easy to parse and captures all info we need. :+1:

All 6 comments

@alexanderbez could you take a look at this? seems like an easy fix, but I would like your opinion if I am missing something

Totally agree here in the poor UX @ethanfrey. I think the proposal also makes sense -- you're just ensuring even nil values get length-prefixed, correct?

Since we're going to change the encoding of data here, why not just make data a proto encoding of a richer type:

message MessageData {
   string msg_type = 1;
   bytes data = 2;
}

message TxData {
  repeated MessageData data = 1;
}

Then data is just an encoded TxData. This might be overkill or not provide any additional context, so I'm ok with either solution.

Would you like to take this on or should I?

I'd be happy for someone else to take this on.

Any solution is fine with me. But the least-breaking one would just keep Data []byte in the sdk.Result and in the abci.Result. These are opaque bytes, so any way of combining them into one is fine. My bikeshedded 2-byte prefix is one example. I am fine using protobuf as well, maybe that is easier. But if so, a simpler way is to just combine them.

message TxData {
  repeated bytes data = 1;
}

We can then set [][]byte in TxData and serialize it to []byte (and later deserialize). Adding string types is a different change and affects sdk.Result as well, so I would make another issue/pr for that one.

I'm happy to review if you take either the simple byte prefix, or the protobuf repeated encoding

I think the proposal also makes sense -- you're just ensuring even nil values get length-prefixed, correct?

That is the main point, ensure everything is length-prefixed. I just move length-prefixing into baseapp so no handler can forget to do it.

But the least-breaking one would just keep Data []byte in the sdk.Result and in the abci.Result.

Right, this doesn't change in my proposal. The types are still the same, just the representation changes.

diff --git a/baseapp/baseapp.go b/baseapp/baseapp.go
index 83d32a0b5..e8805cbe2 100644
--- a/baseapp/baseapp.go
+++ b/baseapp/baseapp.go
@@ -6,6 +6,7 @@ import (
        "strings"

        "github.com/gogo/protobuf/grpc"
+       "github.com/golang/protobuf/proto"

        abci "github.com/tendermint/tendermint/abci/types"
        "github.com/tendermint/tendermint/crypto/tmhash"
@@ -605,8 +606,10 @@ func (app *BaseApp) runTx(mode runTxMode, txBytes []byte, tx sdk.Tx) (gInfo sdk.
 // Result is returned. The caller must not commit state if an error is returned.
 func (app *BaseApp) runMsgs(ctx sdk.Context, msgs []sdk.Msg, mode runTxMode) (*sdk.Result, error) {
        msgLogs := make(sdk.ABCIMessageLogs, 0, len(msgs))
-       data := make([]byte, 0, len(msgs))
        events := sdk.EmptyEvents()
+       txData := &sdk.TxData{
+               Data: make([]*sdk.MsgData, len(msgs)),
+       }

        // NOTE: GasWanted is determined by the AnteHandler and GasUsed by the GasMeter.
        for i, msg := range msgs {
@@ -638,10 +641,15 @@ func (app *BaseApp) runMsgs(ctx sdk.Context, msgs []sdk.Msg, mode runTxMode) (*s
                // separate each result.
                events = events.AppendEvents(msgEvents)

-               data = append(data, msgResult.Data...)
+               txData.Data[i] = &sdk.MsgData{MsgType: msg.Type(), Data: msgResult.Data}
                msgLogs = append(msgLogs, sdk.NewABCIMessageLog(uint16(i), msgResult.Log, msgEvents))
        }

+       data, err := proto.Marshal(txData)
+       if err != nil {
+               return nil, sdkerrors.Wrap(err, "failed to marshal tx data")
+       }
+
        return &sdk.Result{
                Data:   data,
                Log:    strings.TrimSpace(msgLogs.String()),

WDYT?

&sdk.MsgData{MsgType: msg.Type(), Data: msgResult.Data}

Ah, now I get it. This stores the original type of the message to help make it clearer which message causes which. Kind of like how you use the uint16(i) in the next list.

Fair enough, I am happy with more info on the client side. Not sure if this is extra state bloat, but I'll let someone else argue that.

It is easy to parse and captures all info we need. :+1:

Was this page helpful?
0 / 5 - 0 ratings

Related issues

fedekunze picture fedekunze  Â·  3Comments

johnmcdowall picture johnmcdowall  Â·  3Comments

kevlubkcm picture kevlubkcm  Â·  3Comments

fedekunze picture fedekunze  Â·  3Comments

rigelrozanski picture rigelrozanski  Â·  3Comments