Go-ipfs: `swarm connect` timeout

Created on 4 Jun 2019 · 15 Comments · Source: ipfs/go-ipfs

Version information:

ipfs version --all
go-ipfs version: 0.4.21-
Repo version: 7
System version: amd64/linux
Golang version: go1.12.5
uname -a
Linux ubuntucloud185 4.15.0-33-generic #36-Ubuntu SMP Wed Aug 15 16:00:05 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Description:

I have two machines; let me call them A and B. I started the daemon on each of them like this:

root@B:~# ipfs daemon --init
Initializing daemon...
go-ipfs version: 0.4.21-
Repo version: 7
System version: amd64/linux
Golang version: go1.12.5
initializing IPFS node at /root/.ipfs
generating 2048-bit RSA keypair...done
peer identity: QmcjiJNZ5JTKsBZNaoRStLyP7oLLyHzZVGFKTychxhHHQX
to get started, enter:

    ipfs cat /ipfs/QmS4ustL54uo8FzR9455qaxZwuMiUhyvMcX9Ba8nUH4uVv/readme

10:06:02.041 ERROR    p2pnode: mdns error:  could not determine host IP addresses for ubuntucloud185. discovery.go:46
Swarm listening on /ip4/127.0.0.1/tcp/4001
Swarm listening on /ip4/172.17.0.1/tcp/4001
Swarm listening on /ip4/(public IP)/tcp/4001
Swarm listening on /ip6/::1/tcp/4001
Swarm listening on /p2p-circuit
Swarm announcing /ip4/127.0.0.1/tcp/4001
Swarm announcing /ip4/172.17.0.1/tcp/4001
Swarm announcing /ip4/223.111.147.185/tcp/4001
Swarm announcing /ip6/::1/tcp/4001
API server listening on /ip4/127.0.0.1/tcp/5001
WebUI: http://127.0.0.1:5001/webui
Gateway (readonly) server listening on /ip4/127.0.0.1/tcp/8080
Daemon is ready

I try to connect to A from B, but it always times out:

root@B:~# ipfs swarm connect /ip4/(public IP of A)/tcp/4001/ipfs/QmZwfRydFZrL9ARqpqKLjP8b8EYQcDPYD6DxyxKmxbYGsd
Error: connect QmZwfRydFZrL9ARqpqKLjP8b8EYQcDPYD6DxyxKmxbYGsd failure: failed to dial : all dials failed
  * [/ip4/(public IP of A)/tcp/4001] dial tcp4 0.0.0.0:4001->(public IP of A):4001: i/o timeout

So I checked the network environment like this:

root@B:~# curl  (public of A):5001/api/v0/id
{"ID":"QmZwfRydFZrL9ARqpqKLjP8b8EYQcDPYD6DxyxKmxbYGsd","PublicKey":"CAASpgIwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDSmNgJeYVcNRI8Kl1ZUM2/vHKJ5X1pELzOpUx4KOd2z1nDEw70Cq4Y+8+wOMEgovx+ed+XZn+ai9Olc8ssHOrlR+r67WUhe2UnV7vn01hvJxFjXlqljSN3vcP/KgBYHgO8Su4/P1Rel4c1eNHYYH2F92BGm3rQn10HQyA2WROTuFFGDcIz5yt8/oWOubpYUJTiJxCTQtCy81PQDVH2JP7T4w9uIXGtvPb4d5vSxzjHlHwiWULLl+6wt7EhuJ4fuuR2cU2Zqv5jyOUXDr0Usct/5lRe0PJXCPT2XO4pIXJygVUWu+iyh1d+aen8kuK8u91E59hl624AR4tDbNCqzMErAgMBAAE=","Addresses":...}

root@B:~# telnet (public of A) 4001
Trying  (public of A)...
Connected to  (public of A).
Escape character is '^]'.
/multistream/1.0.0

Then I tried connecting to B from A, and it times out too. Why?

kind/bug kind/question

All 15 comments

I try to connect to A from B, but it always times out:

That looks like a connection attempt from B to A. I assume that was a mistake?

So I checked the network environment like this:
curl (public of A):5001/api/v0/id

The API isn't exposed on the public interface, are you sure you're not invoking these commands from machine A?


Note: I can connect to your machine A so it's definitely working.

That looks like a connection attempt from B to A. I assume that was a mistake?

Yes, all my test code is on machine B.

The API isn't exposed on the public interface, are you sure you're not invoking these commands from machine A?

I have exposed port 5001 on 0.0.0.0 on machine A, so I can invoke these commands from machine B.

Yes, all my test code is on machine B.

Ah, got it, I misread your examples.

I'm not sure what's happening. Are both of these machines running go-ipfs 0.4.21?

I have exposed port 5001 on 0.0.0.0 on machine A, so I can invoke these commands from machine B.

Off topic: I wouldn't expose the API to the public internet. It's designed to be private.

Yes. To make sure the problem is reproducible, I am using the latest release.

Also, I found that the problem is mainly with machine B, because I tried connecting C to A, and that works fine.

What should I do? This problem has been bothering me for several days.

Approximately how many seconds does it take for nc PUBLIC_IP_OF_A 4001 to return the /multistream/1.0.0? (trying to figure out the latency).
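
A rough way to put a number on that, sketched under assumptions: the `probe_banner` helper below is hypothetical (it is not part of go-ipfs), it relies on bash's `/dev/tcp` redirection plus GNU `timeout` and `date +%s%N`, and `PUBLIC_IP_OF_A` is the same placeholder used above. It opens a TCP connection, waits for the first line the peer sends, and prints the elapsed time in milliseconds.

```shell
# Hypothetical helper, not part of go-ipfs: time how long a peer takes to
# send its first line (for a libp2p listener, the multistream banner).
# Usage: probe_banner HOST PORT [TIMEOUT_SECONDS]
probe_banner() {
  local host="$1" port="$2" limit="${3:-10}"
  local start end line
  start=$(date +%s%N)   # nanoseconds since the epoch (GNU date)
  # Run the connection in a child bash so `timeout` can bound it.
  # /dev/tcp/HOST/PORT is a bash redirection feature, not a real file.
  # tr drops the non-printable length-prefix byte that precedes the banner.
  line=$(timeout "$limit" bash -c \
    'exec 3<>"/dev/tcp/$0/$1" && IFS= read -r -u 3 l && printf "%s" "$l" | tr -cd "[:print:]"' \
    "$host" "$port" 2>/dev/null) || {
    echo "no banner from $host:$port (refused, or nothing within ${limit}s)" >&2
    return 1
  }
  end=$(date +%s%N)
  printf '%s ms: %s\n' "$(( (end - start) / 1000000 ))" "$line"
}

# Example (placeholder address):
# probe_banner PUBLIC_IP_OF_A 4001
```

A non-zero exit means the connection was refused or no banner arrived before the timeout; the two cases are not distinguished here.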

If you can tell me, what country is B in? Some countries perform deep packet inspection to mess with IPFS connections.

Approximately how many seconds does it take for nc PUBLIC_IP_OF_A 4001 to return the /multistream/1.0.0? (trying to figure out the latency).

I tested this on B, and it never returns. I also tested it on C, and it returns very quickly.

If you can tell me, what country is B in? Some countries perform deep packet inspection to mess with IPFS connections.

Machines A, B, and C are all in China.

Was this not running on machine B?

root@B:~# telnet (public of A) 4001
Trying  (public of A)...
Connected to  (public of A).
Escape character is '^]'.
/multistream/1.0.0

Machines A, B, and C are all in China.

I've heard rumors that China is performing deep packet inspection to block IPFS connections. That's likely the issue here.

Was this not running on machine B?

I ran this on B, and it works fine.

I've heard rumors that China is performing deep packet inspection to block IPFS connections. That's likely the issue here.

If that's the case, it's not something I can solve. But this situation rarely occurs.

I ran this on B, and it works fine.

... and ...

I tested this on B, and it never returns.

Does it work or not?

I mean that on machine B, nc PUBLIC_IP_OF_A 4001 never returned, but telnet (public of A) 4001 was OK.

Also, the swarm connect timeout issue still exists.

How long does it take _telnet_ to return (trying to measure latency)? That's really odd as telnet versus nc shouldn't make a difference.

I am really sorry for my mistake. Neither nc nor telnet works on machine B.

root@ipfs47188:~# udfs swarm connect /ip4/PUBLIC_IP_OF_A/tcp/4001/ipfs/QmYtMoTxTMCETEwJgB3JAz8Ysx8D9pScH9HojPCMK6KPwz
Error: connect QmYtMoTxTMCETEwJgB3JAz8Ysx8D9pScH9HojPCMK6KPwz failure: dial attempt failed: <peer.ID Qm*yoCpae> --> <peer.ID Qm*K6KPwz> dial attempt failed: context deadline exceeded
root@ipfs47188:~# ls
udfs-installer  ulordblocks71.tar.gz
root@ipfs47188:~# nc PUBLIC_IP_OF_A 4001
^C
root@ipfs47188:~# telnet PUBLIC_IP_OF_A 4001
Trying PUBLIC_IP_OF_A...
^C

Ok, this looks like something in the network is blocking the traffic. You could _try_ running IPFS on a higher port, you could _also_ try listening on the websocket protocol instead to see if that works.

Try changing the addresses in Addresses.Swarm as follows:

  1. Replace the port with a high-numbered port (pick a random number, e.g., 49192).
  2. Append /ws to these addresses. This will force libp2p to use the websocket transport.
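
The two steps above can be sketched as a small rewrite of each listen address. The helper name `to_high_port_ws` is hypothetical, 49192 is just the example port from the list, and the two input addresses are assumed to be the stock go-ipfs `Addresses.Swarm` defaults; check your own config before applying anything.

```shell
# Hypothetical helper: apply both steps to one swarm multiaddr.
# Usage: to_high_port_ws MULTIADDR NEW_PORT
to_high_port_ws() {
  local addr="$1" port="$2"
  # Step 1: swap the TCP port for the new high-numbered one.
  addr=$(printf '%s' "$addr" | sed -E "s#/tcp/[0-9]+#/tcp/${port}#")
  # Step 2: append /ws unless the address already ends with it.
  case "$addr" in
    */ws) ;;
    *) addr="$addr/ws" ;;
  esac
  printf '%s\n' "$addr"
}

# Assumed default listen addresses; prints the rewritten form of each.
for a in /ip4/0.0.0.0/tcp/4001 /ip6/::/tcp/4001; do
  to_high_port_ws "$a" 49192
done

# Applying the result (left commented out; needs the ipfs binary):
# ipfs config --json Addresses.Swarm '["/ip4/0.0.0.0/tcp/49192/ws","/ip6/::/tcp/49192/ws"]'
# Port-only variant, keeping the plain TCP transport:
# ipfs config --json Addresses.Swarm '["/ip4/0.0.0.0/tcp/49192","/ip6/::/tcp/49192"]'
```

The daemon has to be restarted for a new `Addresses.Swarm` value to take effect, and peers then need to dial the new address (including `/ws` if it was added).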

Unfortunately, if that doesn't work, there's not much I can do.

Thanks, running IPFS on a higher port solved this issue,
although I don't know exactly why.

GFW, probably.
