It looks like CI ignores failing interop tests:
This doesn't seem great because without passing tests there's no guarantee the code does what it claims to do and can then prevent the release of downstream projects that do not ignore failing interop tests.
It looks like an issue with the mocha reporter we were using. I'm working on a fix in https://github.com/ipfs/go-ipfs/pull/7395.
Reopening because #7395 doesn't solve the problem and the interop tests are still failing against master.
Following on from the discussion in #7394 running the tests with pubsub logging turned on shows this sort of thing (js is prefixed with libp2p:gossipsub, go with DEBUG and WARN):
Successful:
DEBUG pubsub [email protected]/gossipsub.go:304 JOIN pubsub-non-ascii
WARN pubsub [email protected]/log.go:180 bootstrap: error providing rendezvous for pubsub-non-ascii: failed to find any peer in table
libp2p:gossipsub rpc from QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn
libp2p:gossipsub JOIN [ 'pubsub-non-ascii' ]
libp2p:gossipsub JOIN: Add mesh link to QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn in pubsub-non-ascii
DEBUG pubsub [email protected]/gossipsub.go:216 GRAFT: Add mesh link from QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb in pubsub-non-ascii
Found QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn on attempt 1 in 22ms
Found QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb on attempt 1 in 3ms
---> sending message
libp2p:gossipsub publish pubsub-non-ascii <Buffer e4 bd a0 e5 a5 bd e4 b8 96 e7 95 8c>
✓ should exchange non ascii data (41ms)
Failure:
DEBUG pubsub [email protected]/gossipsub.go:304 JOIN pubsub-ascii
WARN pubsub [email protected]/log.go:180 bootstrap: error providing rendezvous for pubsub-ascii: failed to find any peer in table
libp2p:gossipsub JOIN [ 'pubsub-ascii' ]
DEBUG pubsub [email protected]/gossipsub.go:84 PEERUP: Add new peer QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb using /meshsub/1.0.0
libp2p:gossipsub rpc from QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn
libp2p:gossipsub rpc from QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn
libp2p:gossipsub HEARTBEAT: Add mesh link to QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn in pubsub-ascii
WARN pubsub [email protected]/log.go:175 unexpected message from QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb
DEBUG pubsub [email protected]/gossipsub.go:459 HEARTBEAT: Add mesh link to QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb in pubsub-ascii
libp2p:gossipsub rpc from QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn
2020-06-01T21:31:39.729Z libp2p:gossipsub GRAFT: Add mesh link from QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn in pubsub-ascii
Found QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn on attempt 2 in 1034ms
Found QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb on attempt 1 in 10ms
---> sending message
libp2p:gossipsub publish pubsub-ascii <Buffer 68 65 6c 6c 6f 20 77 6f 72 6c 64>
WARN pubsub [email protected]/log.go:175 unexpected message from QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb
libp2p:gossipsub connected QmQ5mRojZzj5SefRYKjWTxXbdT4SzPdHv38WZjP2yCFyTn
The key here seems to be the unexpected message warning go prints after js has printed libp2p:gossipsub publish indicating it's publishing a message:
WARN pubsub [email protected]/log.go:175 unexpected message from QmSym7KNssvhzV2daVzCNgWttLx2MxBmjW8y7EgcPqrosb
The warning seems to come from here. From what I can see it's reading a protobuf message out of the stream, then printing the 'unexpected message' warning and ignoring the message.
Incoming messages are handled by the handleNewStream method, handlePeerEOF is called from handleNewPeer which does open a stream to the new peer - should it be calling handleNewStream internally with that stream instead?
See https://github.com/libp2p/go-libp2p-pubsub/issues/133. go-libp2p-pubsub uses unidirectional streams. It _looks_ like js-libp2p-pubsub isn't. Is that correct?
cc @vyzo
Yes, go-libp2p-pubsub uses unidirectional streams.
See libp2p/go-libp2p-pubsub#133. go-libp2p-pubsub uses unidirectional streams. It _looks_ like js-libp2p-pubsub isn't. Is that correct?
Yes, js-libp2p-pubsub is using the same stream to receive and send pubsub messages: https://github.com/libp2p/js-libp2p-pubsub/blob/master/src/index.js#L201-L203
@vasco-santos Sounds like that might be incorrect? Can it be changed to implement the protocol properly and use unidirectional streams?
Taking into account that an implementation must have this into consideration, we should add a reference for this in the spec.
@vasco-santos Sounds like that might be incorrect? Can it be changed to implement the protocol properly and use unidirectional streams?
We have been using bidirectional streams, at least since the async await refactor. I am not sure on how this was implemented with pull streams.
I was not aware of go using unidirectional streams, but I get it after reading all the discussions referenced in libp2p/go-libp2p-pubsub#133.
I have created libp2p/js-libp2p-pubsub#45 to use unidirectional streams. This is keeping the outbound stream open for all the messages for now. It will need more changes, if we want to open one every time we want to push a message (which is what is being done in go as far as I understood).
This PR is on top of new breaking releases of the pubsub subsystem + routers, which need [email protected] in js-ipfs. I have tested it with both JS routers (needed to update the tests as they were expecting bidirectional streams), js-libp2p tests, and libp2p interop tests and this change seems to not have a negative impact anywhere.
I will now create a branch from the old versions of js-libp2p-pubsub and js-libp2p-gossipsub with this change, so that you can test it with ipfs interop tests. If this fixes the issue, I will be creating a release for the older versions so that this can be fixed without updating to the new libp2p.
@achingbrain can you retry installing the following branches in ipfs?
Let me know if this works, so that we can handle the reviews and releases
@vasco-santos running interop with js-ipfs depending on these PRs makes the ipns-pubsub and pubsub tests pass. Ship it!
[email protected] is shipped! You can go back to use the previous versions of libp2p-floodsub and libp2p-gossipsub. Their PRs just fix the tests
there is the larger issue that the latest version of js-pubsub is _not_ interoperable with go-pubsub; what were you guys thinking?
what were you guys thinking?
Interesting question. What I'm currently thinking is there's clearly no meaningful interop testing being done between go-libp2p and js-libp2p which is a gap that needs to be filled.
I can see there's an interop repo and it has an open issue about doing automated testing. Maybe this is something that go-libp2p and js-libp2p could add to their respective CI builds sooner rather than later.
Where automated testing is being done is at the ipfs level and the failing interop tests have so far prevented this PR being merged and js-ipfs claiming compatibility with go-ipfs 0.5.x.
The interop tests have been failing in go-ipfs master for months but for whatever reason they haven't been breaking builds which seems to just be a bug which has since been squashed.
we should add a reference for [unidirectional streams] to the spec.
Also interesting - stream setup seems pretty fundamental - is it documented anywhere outside of the go source code or GitHub comments? I can't see anything relevant in the spec.
I agree with @achingbrain
Just have a few thoughts to add:
go-libp2p release process does not include updating the daemon.js-libp2p has a CI job for running the interop tests, but in go-libp2p is harder to get this, since we need to generate binaries for each environment beforehand, update the libp2p/npm-go-libp2p-dep, as a consequence of our current setup.I think that if we have go-libp2p releases coming with the go-libp2p-daemon updated would help a lot preventing these events. I currently watch the go-libp2p-daemon releases and once that happens, I manually get it to the interop. While this is not perfect and is currently a single point of failure (only me), I think this initial step would be helpful before we get the interop tests in testground.
We definitely need to add a reference to the spec about unidirectional streams.
@vyzo in the absence of testground being useable for interop tests, it'd also be incredibly helpful if you could work with @vasco-santos to figure out how to run the existing suite in CI for go & js libp2p as it sounds like he has to jump through a lot of hoops to do this at the moment. That would reduce the likelihood of this sort of bug reappearing.
Closing this issue as the interop tests are passing again.