Go-ipfs: Interop tests are failing against master

Created on 29 May 2020  Â·  18Comments  Â·  Source: ipfs/go-ipfs

It looks like CI ignores failing interop tests:

https://app.circleci.com/pipelines/github/ipfs/go-ipfs/2908/workflows/0b7a5ebf-cb6c-4960-b19e-fd4c1b61983c/jobs/34278

This doesn't seem great because without passing tests there's no guarantee the code does what it claims to do and can then prevent the release of downstream projects that do not ignore failing interop tests.

kinbug neetriage

All 18 comments

It looks like an issue with the mocha reporter we were using. I'm working on a fix in https://github.com/ipfs/go-ipfs/pull/7395.

Reopening because #7395 doesn't solve the problem and the interop tests are still failing against master.

Following on from the discussion in #7394 running the tests with pubsub logging turned on shows this sort of thing (js is prefixed with libp2p:gossipsub, go with DEBUG and WARN):

Successful:

DEBUG   pubsub  [email protected]/gossipsub.go:304    JOIN pubsub-non-ascii
WARN    pubsub  [email protected]/log.go:180    bootstrap: error providing rendezvous for pubsub-non-ascii: failed to find any peer in table
libp2p:gossipsub rpc from QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn
libp2p:gossipsub JOIN [ 'pubsub-non-ascii' ]
libp2p:gossipsub JOIN: Add mesh link to QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn in pubsub-non-ascii
DEBUG   pubsub  [email protected]/gossipsub.go:216    GRAFT: Add mesh link from QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb in pubsub-non-ascii
Found QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn on attempt 1 in 22ms
Found QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb on attempt 1 in 3ms
---> sending message
libp2p:gossipsub publish pubsub-non-ascii <Buffer e4 bd a0 e5 a5 bd e4 b8 96 e7 95 8c>

      ✓ should exchange non ascii data (41ms)

Failure:

DEBUG   pubsub  [email protected]/gossipsub.go:304    JOIN pubsub-ascii
WARN    pubsub  [email protected]/log.go:180    bootstrap: error providing rendezvous for pubsub-ascii: failed to find any peer in table
libp2p:gossipsub JOIN [ 'pubsub-ascii' ]
DEBUG   pubsub  [email protected]/gossipsub.go:84 PEERUP: Add new peer QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb using /meshsub/1.0.0
libp2p:gossipsub rpc from QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn
libp2p:gossipsub rpc from QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn
libp2p:gossipsub HEARTBEAT: Add mesh link to QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn in pubsub-ascii
WARN    pubsub  [email protected]/log.go:175    unexpected message from QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb
DEBUG   pubsub  [email protected]/gossipsub.go:459    HEARTBEAT: Add mesh link to QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb in pubsub-ascii
libp2p:gossipsub rpc from QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn
2020-06-01T21:31:39.729Z libp2p:gossipsub GRAFT: Add mesh link from QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn in pubsub-ascii

Found QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn on attempt 2 in 1034ms
Found QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb on attempt 1 in 10ms
---> sending message
libp2p:gossipsub publish pubsub-ascii <Buffer 68 65 6c 6c 6f 20 77 6f 72 6c 64>
WARN    pubsub  [email protected]/log.go:175    unexpected message from QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb
libp2p:gossipsub connected QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn

The key here seems to be the unexpected message warning go prints after js has printed libp2p:gossipsub publish indicating it's publishing a message:

WARN    pubsub  [email protected]/log.go:175    unexpected message from QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb

The warning seems to come from here. From what I can see it's reading a protobuf message out of the stream, then printing the 'unexpected message' warning and ignoring the message.

Incoming messages are handled by the handleNewStream method, handlePeerEOF is called from handleNewPeer which does open a stream to the new peer - should it be calling handleNewStream internally with that stream instead?

See https://github.com/libp2p/go-libp2p-pubsub/issues/133. go-libp2p-pubsub uses unidirectional streams. It _looks_ like js-libp2p-pubsub isn't. Is that correct?

cc @vyzo

Yes, go-libp2p-pubsub uses unidirectional streams.

See libp2p/go-libp2p-pubsub#133. go-libp2p-pubsub uses unidirectional streams. It _looks_ like js-libp2p-pubsub isn't. Is that correct?

Yes, js-libp2p-pubsub is using the same stream to receive and send pubsub messages: https://github.com/libp2p/js-libp2p-pubsub/blob/master/src/index.js#L201-L203

@vasco-santos Sounds like that might be incorrect? Can it be changed to implement the protocol properly and use unidirectional streams?

Taking into account that an implementation must have this into consideration, we should add a reference for this in the spec.

@vasco-santos Sounds like that might be incorrect? Can it be changed to implement the protocol properly and use unidirectional streams?

We have been using bidirectional streams, at least since the async await refactor. I am not sure on how this was implemented with pull streams.
I was not aware of go using unidirectional streams, but I get it after reading all the discussions referenced in libp2p/go-libp2p-pubsub#133.

I have created libp2p/js-libp2p-pubsub#45 to use unidirectional streams. This is keeping the outbound stream open for all the messages for now. It will need more changes, if we want to open one every time we want to push a message (which is what is being done in go as far as I understood).

This PR is on top of new breaking releases of the pubsub subsystem + routers, which need [email protected] in js-ipfs. I have tested it with both JS routers (needed to update the tests as they were expecting bidirectional streams), js-libp2p tests, and libp2p interop tests and this change seems to not have a negative impact anywhere.

I will now create a branch from the old versions of js-libp2p-pubsub and js-libp2p-gossipsub with this change, so that you can test it with ipfs interop tests. If this fixes the issue, I will be creating a release for the older versions so that this can be fixed without updating to the new libp2p.

@achingbrain can you retry installing the following branches in ipfs?

Let me know if this works, so that we can handle the reviews and releases

@vasco-santos running interop with js-ipfs depending on these PRs makes the ipns-pubsub and pubsub tests pass. Ship it!

[email protected] is shipped! You can go back to use the previous versions of libp2p-floodsub and libp2p-gossipsub. Their PRs just fix the tests

there is the larger issue that the latest version of js-pubsub is _not_ interoperable with go-pubsub; what were you guys thinking?

what were you guys thinking?

Interesting question. What I'm currently thinking is there's clearly no meaningful interop testing being done between go-libp2p and js-libp2p which is a gap that needs to be filled.

I can see there's an interop repo and it has an open issue about doing automated testing. Maybe this is something that go-libp2p and js-libp2p could add to their respective CI builds sooner rather than later.

Where automated testing is being done is at the ipfs level and the failing interop tests have so far prevented this PR being merged and js-ipfs claiming compatibility with go-ipfs 0.5.x.

The interop tests have been failing in go-ipfs master for months but for whatever reason they haven't been breaking builds which seems to just be a bug which has since been squashed.

we should add a reference for [unidirectional streams] to the spec.

Also interesting - stream setup seems pretty fundamental - is it documented anywhere outside of the go source code or GitHub comments? I can't see anything relevant in the spec.

I agree with @achingbrain

Just have a few thoughts to add:

  • I have been trying to keep up with manual interop tests in libp2p land for the time being

    • this has been difficult to achieve because go-libp2p release process does not include updating the daemon.

    • js-libp2p has a CI job for running the interop tests, but in go-libp2p is harder to get this, since we need to generate binaries for each environment beforehand, update the libp2p/npm-go-libp2p-dep, as a consequence of our current setup.

  • The full interop tests automation got deprioritized for now.

    • the main reason for this is the intention to use testground for the interop tests, meaning this would be duplicated work

I think that if we have go-libp2p releases coming with the go-libp2p-daemon updated would help a lot preventing these events. I currently watch the go-libp2p-daemon releases and once that happens, I manually get it to the interop. While this is not perfect and is currently a single point of failure (only me), I think this initial step would be helpful before we get the interop tests in testground.

We definitely need to add a reference to the spec about unidirectional streams.

@vyzo in the absence of testground being useable for interop tests, it'd also be incredibly helpful if you could work with @vasco-santos to figure out how to run the existing suite in CI for go & js libp2p as it sounds like he has to jump through a lot of hoops to do this at the moment. That would reduce the likelihood of this sort of bug reappearing.

Closing this issue as the interop tests are passing again.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

zignig picture zignig  Â·  3Comments

JesseWeinstein picture JesseWeinstein  Â·  4Comments

magik6k picture magik6k  Â·  3Comments

jonchoi picture jonchoi  Â·  3Comments

0x6431346e picture 0x6431346e  Â·  3Comments