Js-ipfs: PubSub discovery not working?

Created on 13 Oct 2019 · 11 Comments · Source: ipfs/js-ipfs

  • Master (pulled today)
  • WSL

Type:

Question

Severity:

Medium

Description:

Steps to reproduce the error:

I have been following this project for a while as I've wanted to write a fully decentralized application using OrbitDB, and I've recently noticed that the DHT has been merged into master and the gossipsub protocol has been implemented, so I thought this might finally be feasible.

Instead of using OrbitDB right away I tried to use the js-ipfs pubsub feature directly, as it's the underlying system, to see if things were working the way I expected, in the simplest possible case.
I installed from master using npm on different computers and wrote a small script which runs on Node.js.

I ran this on two different computers in my home network. Both are subscribed to a topic, only one is publishing messages. The two nodes find each other automatically and the second node is receiving the messages sent by the first. They're in each other's peers lists, which means they are connected (I assume). Everything is fine.
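
The script itself is not reproduced in this thread. A minimal sketch of such a test could look like the following. It assumes an already-started js-ipfs instance is passed in, and that `ipfs.pubsub.subscribe(topic, handler)` and `ipfs.pubsub.publish(topic, data)` behave as in the js-ipfs pubsub API of the time. The helper names `subscribeAndLog` and `publishMessages` and the message format are illustrative, not the original code.

```javascript
// Sketch only: `ipfs` is assumed to be a started js-ipfs node.
// `subscribeAndLog` and `publishMessages` are hypothetical helper names.

// Subscribe to a topic and record every message that arrives.
async function subscribeAndLog (ipfs, topic) {
  const received = []
  await ipfs.pubsub.subscribe(topic, (msg) => {
    received.push({
      from: String(msg.from),
      text: Buffer.from(msg.data).toString()
    })
  })
  return received
}

// Publish `count` numbered messages to the topic, one after another.
async function publishMessages (ipfs, topic, count) {
  for (let i = 0; i < count; i++) {
    await ipfs.pubsub.publish(topic, Buffer.from(`message ${i}`))
  }
}
```

In the setup described here, both nodes would call `subscribeAndLog` and only one of them would call `publishMessages`.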

I tried the same through the internet.
The subscriber/consumer is on a remote computer, it doesn't work. I figure it might be related to firewalls or something. So I run the subscriber on a server, it still won't find the publisher.

Then I try making the publisher the server, and my local computer the subscriber, and that still doesn't seem to work.
They actually have mutual peers in their list (at least in the last case), but they're not connecting to each other.

My understanding is that they should be using the DHT to find other nodes subscribed to the same topic. It works on my local network but not through the internet, even after waiting for 15 minutes.
I have set the discovery: true option in the subscribe function, but it has no effect.

Am I doing something wrong? How do I accomplish this without needing a central server (which seems to be the whole point of IPFS)?
I want my various arbitrary clients to be able to connect to each other automatically, and to subscribe to arbitrary topics (those being OrbitDB databases) and of course this is going to be running in the browser (but that should not be a problem in itself).

Thank you for your help.

All 11 comments

The DHT isn't enabled by default and if you enable it you'll likely run into #2438

libp2p is finishing off its pull-stream -> async iterator refactor, once that's done there will be some time to finish off the DHT implementation.

Update: I have tried to see if the nodes can find each other's addresses using the DHT, given the node IDs. That seems to work.
Using ipfs.dht.findPeer(peerID) on the ipfs instance running on my server, or the one running on my local PCs, I get results with no issues. I can find the server from the PCs and vice versa. Which means the DHT is working correctly on all of them.
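
A check like this can be scripted. The sketch below assumes `ipfs.dht.findPeer(peerId)` resolves either to a PeerInfo-style object (with `multiaddrs.toArray()`) or to a plain `{ id, addrs }` object, depending on the js-ipfs release. `lookupAddrs` is a hypothetical helper name.

```javascript
// Sketch only: `ipfs` is assumed to be a started js-ipfs node with the DHT enabled.
// Returns the multiaddrs the DHT reports for a peer, as strings.
async function lookupAddrs (ipfs, peerId) {
  const info = await ipfs.dht.findPeer(peerId)
  // Assumption: newer releases expose `addrs`, older ones a PeerInfo with `multiaddrs`.
  const addrs = Array.isArray(info.addrs) ? info.addrs : info.multiaddrs.toArray()
  return addrs.map((addr) => addr.toString())
}
```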

However it seems for pubsub they still can't find each other automatically, unless they're in the same local network.

@achingbrain thank you for that piece of information. Regarding the DHT not being enabled, I figured this out between making my first post, and the one I just made, and I tried enabling it, but it didn't fix my issue.

Regarding the issue with it crashing, I understand this is all very rough code, but even if it's not stable, if I can get it to work I can proceed with writing my own app while bugs are fixed upstream.

On the local network your nodes are using mDNS to discover each other. This is why things work here - it is not because of pubsub discovery.

I tried the same through the internet.
The subscriber/consumer is on a remote computer, it doesn't work. I figure it might be related to firewalls or something. So I run the subscriber on a server, it still won't find the publisher.

A remote subscriber simply won't be able to connect to your local node regardless of whether it can discover it or not.

On your local network your node is likely behind a NAT, and if so, external nodes won't be able to dial and connect to your node. So even if your local node were using the DHT and were discoverable on it, the external node would not be able to connect to it (since its address is likely something like /ip4/192.168.x.x/tcp/5002/ipfs/QmHash).
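
To see whether a node is only advertising addresses of this kind, the addresses can be screened with a small helper. This is a sketch working on multiaddr strings only: `isPrivateIp4Multiaddr` is a hypothetical function, the sample addresses are placeholders, and a non-private address is merely a candidate for dialling, not a guarantee that the node is reachable.

```javascript
// Hypothetical helper: true when a multiaddr string starts with a private
// (RFC 1918), loopback or link-local IPv4 address.
function isPrivateIp4Multiaddr (addr) {
  const match = /^\/ip4\/(\d+)\.(\d+)\.\d+\.\d+(\/|$)/.exec(addr)
  if (!match) return false // not an /ip4 address; this sketch does not classify it
  const a = Number(match[1])
  const b = Number(match[2])
  return a === 10 ||
    a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
}

// Placeholder addresses for illustration only.
const sampleAddrs = [
  '/ip4/192.168.1.20/tcp/5002/ipfs/QmHash',
  '/ip4/127.0.0.1/tcp/4002/ipfs/QmHash',
  '/ip4/203.0.113.7/tcp/4002/ipfs/QmHash'
]

// Addresses an external node could at least attempt to dial.
const publicCandidates = sampleAddrs.filter((addr) => !isPrivateIp4Multiaddr(addr))
console.log(publicCandidates)
```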

Then I try making the publisher the server, and my local computer the subscriber, and that still doesn't seem to work.
They actually have mutual peers in their list (at least in the last case), but they're not connecting to each other.

The subscriber would have to connect to the publisher in this case due to the aforementioned NAT issue.

I'm not _super_ familiar with the ins and outs of gossipsub but in floodsub the mutual peers would need to be running pubsub and be subscribed to the same topic(s) for messages to be passed. Provided those conditions are true then it is a bug if messages _aren't_ being passed.
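
One way to check those conditions is to compare the peers a node is connected to with the peers it knows to be subscribed to the topic. The sketch below assumes `ipfs.swarm.peers()` resolves to an array of objects whose `peer` field is either a string or a PeerId with `toB58String()`, and that `ipfs.pubsub.peers(topic)` resolves to an array of peer ID strings. `diagnoseTopic` is a hypothetical helper name.

```javascript
// Sketch only: `ipfs` is assumed to be a started js-ipfs node with pubsub enabled.
// Splits connected peers into those known to be subscribed to `topic` and the rest.
async function diagnoseTopic (ipfs, topic) {
  const swarmPeers = await ipfs.swarm.peers()
  // Assumption: `peer` is a string in some releases and a PeerId object in others.
  const connected = swarmPeers.map((p) =>
    typeof p.peer === 'string' ? p.peer : p.peer.toB58String())

  const subscribed = new Set((await ipfs.pubsub.peers(topic)).map(String))

  return {
    connectedAndSubscribed: connected.filter((id) => subscribed.has(id)),
    connectedOnly: connected.filter((id) => !subscribed.has(id))
  }
}
```

If the other node shows up under `connectedOnly`, the connection exists but that peer is not known to be subscribed to the topic.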

My understanding is that they should be using the DHT to find other nodes subscribed to the same topic. It works on my local network but not through the internet, even after waiting for 15 minutes.
I have set the discovery: true option in the subscribe function, but it has no effect.

I don't know the status of pubsub discovery in floodsub/gossipsub - by the looks of it the discovery: true option isn't being passed to subscribe which might be a bug, or it might be out of date documentation and maybe it actually needs to be passed to the pubsub module constructor. The latter feels the most likely to me.

@vasco-santos / @jacobheun can you clarify the situation please? Should @reasv open an issue in https://github.com/libp2p/js-libp2p or https://github.com/ChainSafe/gossipsub-js or somewhere else?

Am I doing something wrong? How do I accomplish this without needing a central server (which seems to be the whole point of IPFS)?

I don't think so, provided you understand the limitations I've outlined here.

I want my various arbitrary clients to be able to connect to each other automatically, and to subscribe to arbitrary topics (those being OrbitDB databases) and of course this is going to be running in the browser (but that should not be a problem in itself).

Well, ha, in the browser it's even more of a problem because of same origin policy as well as secure context which will prevent you making non-secure websocket connections (WS not WSS) from HTTPS sites. i.e. the nodes you connect to have to have an SSL cert if your web application is accessed via HTTPS.

@vasco-santos / @jacobheun can you clarify the situation please? Should @reasv open an issue in https://github.com/libp2p/js-libp2p or https://github.com/ChainSafe/gossipsub-js or somewhere else?

Currently, there is no automatic discovery of topics, the DHT would need to be queried directly. This also requires nodes to register as a provider of their topic with the DHT, which also does not currently happen automatically. https://github.com/libp2p/js-libp2p would be the best place to submit an issue for this. We are looking at improving the interactions between the subsystems and improving automatic discovery, but it's not quite there yet.

Currently, there is no automatic discovery of topics, the DHT would need to be queried directly. This also requires nodes to register as a provider of their topic with the DHT, which also does not currently happen automatically. https://github.com/libp2p/js-libp2p would be the best place to submit an issue for this. We are looking at improving the interactions between the subsystems and improving automatic discovery, but it's not quite there yet.

Cool, so basically I just need to manually register my nodes in the DHT, and then query for the topic manually, and then connect to them I suppose? That seems simple enough.
Any pointers on how that should be done?
I don't have a problem with getting the other nodes' information from the DHT given their ID, which I don't know in advance (of course), but how do I register as a subscriber of a particular topic, so a completely unrelated node who wants to participate in the same topic can find my ID?

The second part of the problem is actually connecting to the nodes once I have "found" them:

Well, ha, in the browser it's even more of a problem because of same origin policy as well as secure context which will prevent you making non-secure websocket connections (WS not WSS) from HTTPS sites. i.e. the nodes you connect to have to have an SSL cert if your web application is accessed via HTTPS.

My understanding was that WebRTC-star could solve this issue. Of course WebSockets is not really gonna work for this, but isn't WebRTC functional already in IPFS (https://github.com/libp2p/js-libp2p-webrtc-star)?
I am not sure about how this can be used in IPFS, but if two nodes both connect to the same signalling server, can't they just directly connect with each other using WebRTC, and pubsub would automatically work through it?

The main problem is that individual users can create their own arbitrary topics (by creating OrbitDB databases), so using a centralized node that is subscribed to the specific topics and relays all messages is not really an option, as the topics would be added and removed all the time.
Plus it is centralized, since it requires everything to go through a particular server node, or a set of them.

Ideally each browser client is independent and can find the others through IPFS' DHT by connecting to any ("real") IPFS node, and then connect to them directly with WebRTC, and make PubSub work that way.
Is this possible at all currently?
Alternatively, would it be possible to make a compromise where each client can connect to an arbitrary IPFS node, such as the ones in the bootstrap set, then find the IDs of other nodes subscribed to the topic through the DHT, then connect to them using /p2p-circuit/ID even if they're connected to different parts of the network (i.e. no common peer), and then messages are relayed that way? This seems like the most feasible option right now, although a direct connection would be faster and more decentralized.

Cool, so basically I just need to manually register my nodes in the DHT, and then query for the topic manually, and then connect to them I suppose? That seems simple enough.
Any pointers on how that should be done?
I don't have a problem with getting the other nodes' information from the DHT given their ID, which I don't know in advance (of course), but how do I register as a subscriber of a particular topic, so a completely unrelated node who wants to participate in the same topic can find my ID?

An approach to this would be that anytime you subscribe to a topic, you do a DHT provide with that topic name. You would also do a findProviders call on the topic as well, so you get back a list of all peers that are currently registered as providers for the topic.
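
The approach above could be sketched roughly as follows. This is not an official recipe: it targets the promise-based js-ipfs API of late 2019 and assumes `ipfs.block.put`, `ipfs.dht.provide` and `ipfs.dht.findProvs` are available with the DHT enabled. Storing the topic name as a block is an assumption made here so that there is a CID to provide, and the `pubsub-topic:` prefix, `collect` and `announceAndFindSubscribers` are hypothetical names.

```javascript
// Gather results whether the call returned a promise of an array,
// an (async) iterable, or nothing at all.
async function collect (result) {
  const resolved = await result
  const items = []
  if (resolved == null) return items
  for await (const item of resolved) items.push(item)
  return items
}

// Sketch only: `ipfs` is assumed to be a started js-ipfs node with the DHT enabled.
async function announceAndFindSubscribers (ipfs, topic) {
  // Store the topic name as a block so there is a CID to announce.
  // The 'pubsub-topic:' prefix is a hypothetical convention.
  const putResult = await ipfs.block.put(Buffer.from(`pubsub-topic:${topic}`))
  const cid = putResult.cid || putResult // assumption: Block object or bare CID

  // Register this node as a provider of the topic's CID.
  await collect(ipfs.dht.provide(cid))

  // Ask the DHT who else has registered as a provider.
  const providers = await collect(ipfs.dht.findProvs(cid))
  const peerIds = providers.map((p) =>
    typeof p.id === 'string' ? p.id : p.id.toB58String())

  return { cid, peerIds }
}
```

The returned list may include the calling node itself, so a caller would probably want to filter out its own ID before trying to connect to anyone.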

My understanding was that WebRTC-star could solve this issue. Of course WebSockets is not really gonna work for this, but isn't WebRTC functional already in IPFS (https://github.com/libp2p/js-libp2p-webrtc-star)?

WebRTC star isn't currently bundled with IPFS, as it had some issues in the past. It will also be sunset in the not too distant future, as we are working on a distributed signaling spec for it. You could customize your libp2p node to use webrtc-star in the interim.

An approach to this would be that anytime you subscribe to a topic, you do a DHT provide with that topic name. You would also do a findProviders call on the topic as well, so you get back a list of all peers that are currently registered as providers for the topic.

Thanks, that's more or less what I thought I needed to do. I didn't know you could "provide" an arbitrary thing though, I thought it had to be the CID of a file. I'm gonna do this.

Originally I was thinking of doing a kind of hack where every node that's participating in the topic publishes a file on IPFS which contains the name of the topic, then searches for all other providers of that file, and connects to them. Basically a roundabout way of doing what you suggested.

WebRTC star isn't currently bundled with IPFS, as it had some issues in the past. It will also be sunset in the not too distant future, as we are working on a distributed signaling spec for it. You could customize your libp2p node to use webrtc-star in the interim.

I see. That's fine. I won't use WebRTC-star for now then. As an alternative to that, my nodes could just connect to any IPFS relay node, find the IDs of other subscribers, and connect to them using /p2p-circuit/ipfs/ID style addresses like I had planned. I think that should work, and bypasses the need for a direct connection, while preserving decentralization, as my nodes don't need to connect to any particular IPFS server node.
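
A rough sketch of that plan, assuming the node is already connected to a relay-capable peer and that `ipfs.swarm.connect` accepts a multiaddr string. `connectViaCircuit` is a hypothetical helper, and whether the dial succeeds depends on relay support in the nodes involved.

```javascript
// Sketch only: `ipfs` is assumed to be a started js-ipfs node that is already
// connected to at least one peer acting as a circuit relay.
async function connectViaCircuit (ipfs, peerIds) {
  const results = { connected: [], failed: [] }
  for (const id of peerIds) {
    // Address form taken from the discussion above.
    const addr = `/p2p-circuit/ipfs/${id}`
    try {
      await ipfs.swarm.connect(addr)
      results.connected.push(id)
    } catch (err) {
      // Keep going: one unreachable subscriber should not stop the rest.
      results.failed.push({ id, reason: err.message })
    }
  }
  return results
}
```

Once a connection to another subscriber exists, pubsub messages for the shared topic should be exchanged over it without any further per-peer handling.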

@reasv can we consider this issue resolved for now?

Yeah, I haven't been able to try this solution yet, but this discussion has solved all my doubts about how to implement my application. I hope others facing a similar problem will find it useful as well.

Hey @jacobheun @reasv
This was a very useful discussion.

An approach to this would be that anytime you subscribe to a topic, you do a DHT provide with that topic name. You would also do a findProviders call on the topic as well, so you get back a list of all peers that are currently registered as providers for the topic.

There is one small issue: when I start a second node on my system, it takes a long time to provide the topic name from the second node. findProviders is also very slow.

Why does this happen, and how can I make it faster?

Now we have the list of peer IDs that have subscribed to a topic.
I want to send messages to all of them. How should I do that? I am thinking of somehow opening a connection between the publisher node and each node in the list, so that messages can be sent to each peer directly.

How can this be done, and is there a better way?
