Go-ipfs: Peer discovery regression on Windows

Created on 22 Jun 2018  路  21Comments  路  Source: ipfs/go-ipfs

Version information:

Regression between 0.4.13 and 0.4.14/0.4.15

Type:

Bug

Description:

On Windows, two nodes spawned on the same machine do not discover each other in 0.4.14 or 0.4.15, but they used to in 0.4.13.

The following demonstrates this error:

Install ipfsd-ctl and go-ipfs 0.4.13:

npm i ipfsd-ctl [email protected]

Create file test-peer-discovery.js and add the following:

const IpfsFactory = require('ipfsd-ctl')
const factory = IpfsFactory.create()

;[1, 2].forEach(id => {
  console.log(`node ${id} spawning...`)
  factory.spawn((err, ipfsd) => {
    if (err) throw err
    waitForPeers(id, ipfsd.api)
  })
})

function waitForPeers (id, node) {
  console.log(`node ${id} checking for peers...`)
  node.swarm.peers((err, peers) => {
    if (err) throw err
    console.log(`node ${id} has ${peers.length} peers`)
    if (!peers.length) return setTimeout(() => waitForPeers(id, node), 1000)
    peers.forEach(p => console.log(p.peer.toB58String()))
    node.stop()
  })
}

Run:

node test-peer-discovery.js

Output:

$ node test-peer-discovery.js
node 1 spawning...
node 2 spawning...
node 1 checking for peers...
node 2 checking for peers...
node 1 has 2 peers
QmSoLSafTMBsPKadTEgaXctDQVcqN88CNLHXMkTNwMKPnu
QmSoLer265NRgSp2LA3dPaeykiS1J6DifTC88f5uVQKNAd
node 2 has 2 peers
QmSoLPppuBtQSGwKDZT2M73ULpjvfd3aZ6ha4oFGL1KrGM
QmSoLer265NRgSp2LA3dPaeykiS1J6DifTC88f5uVQKNAd

Now install go-ipfs 0.4.14 or 0.4.15:

npm i [email protected]

Re-run, and output is now:

$ node test-peer-discovery.js
node 1 spawning...
node 2 spawning...
node 2 checking for peers...
node 1 checking for peers...
node 2 has 0 peers
node 1 has 0 peers
node 2 checking for peers...
node 1 checking for peers...
node 1 has 0 peers
node 2 has 0 peers
node 1 checking for peers...
node 2 checking for peers...
node 1 has 0 peers
node 2 has 0 peers
node 1 checking for peers...
node 2 checking for peers...
node 1 has 0 peers
node 2 has 0 peers
node 1 checking for peers...
node 2 checking for peers...
node 2 has 0 peers
node 1 has 0 peers
node 2 checking for peers...
node 1 checking for peers...
node 2 has 0 peers
node 1 has 0 peers
node 2 checking for peers...
node 1 checking for peers...
node 2 has 0 peers
node 1 has 0 peers
node 2 checking for peers...
node 1 checking for peers...
node 1 has 0 peers
node 2 has 0 peers
# ...forever

This seems to be the reason why our js-ipfs-api tests requiring connected nodes are failing.

kinbug

All 21 comments

Does this not happen on Linux? On a hunch... try waiting a minute first.

My guess is that this is happening because:

  1. The DHT waits 5m before it actually bootstraps: https://github.com/libp2p/go-libp2p-kad-dht/issues/163
  2. In 0.4.14, I modified the IPNS republisher to wait 1 minute before the initial republish: https://github.com/ipfs/go-ipfs/pull/4440

I believe this initial IPNS republish may have acted like a DHT bootstrap, making your node findable in the DHT (usually).

I haven't tried this exact example on linux, but on macOS it works on 0.4.15. Our js-ipfs-api tests are passing on linux and macOS also.

I have just tried leaving the example for over 10 minutes now and it's still reporting 0 peers.

I have Windows Defender turned _off_ btw.

I'll try the InitialRebroadcastDelay change when I can but I'm not too hopeful about that one now...

Hm. Damn. The usual rule here is: "when in doubt, blame reuseport" but that shouldn't be the issue here. However, just to check, try running with the environment variable IPFS_REUSEPORT set to false. That probably won't fix it, but we should rule it out.

@Stebalien this is an mdns issue, not a DHT issue

however, it looks like our mdns lib hasnt been updated in over a year.

@alanshaw in your example above for 0.4.13, the two nodes your nodes are getting connections to are not eachother, its two bootstrap nodes.

Looking into the js-ipfsd-ctl code, it seems that it replaces the config file uses with this one: https://github.com/ipfs/js-ipfsd-ctl/blob/master/src/defaults/config.json

This explicitly does not have mdns enabled, which requires a section like:

"Discovery": {
    "MDNS": {
      "Enabled": true,
      "Interval": 10
    }
  },

The weird bit though, is that it also doesnt have any bootstrap nodes. So i'm not sure why the 0.4.13 daemon would be connecting to them...

Does go-ipfs-dep use a precompiled IPFS or does it recompile it? Our MDNS library has some memory safety issues so the version of go it's compiled with may make a difference.

To address some of the comments above:

in your example above for 0.4.13, the two nodes your nodes are getting connections to are not each other, its two bootstrap nodes.

True! They do sometimes find each other though, but this can take many minutes. The other thing is that with 0.4.13 the nodes do accumulate peers over time (hundreds) after a minute:

screen shot 2018-06-25 at 13 30 35

With 0.4.15 no peers, nada, forever:

screen shot 2018-06-25 at 13 27 12

When I run this on macOS the output from the script is basically this every time:

screen shot 2018-06-25 at 13 25 28

Note: I modified the script a little, see below for updated version

This explicitly does not have mdns enabled, which requires a section like:

Yes, but when inspecting the config using ipfs.config.get() it does have that setting. Perhaps this is just a default that gets gets applied if you don't specify anything? Same for bootstrap nodes presumably...

Does go-ipfs-dep use a precompiled IPFS or does it recompile it?

It'll download a precompiled version if it has one for you system. Most node modules with a native aspect will do that and fallback to compiling on your system if not. I'm not sure if go-ipfs-dep does that. I'd hope so. In this case, a precompiled version IS being downloaded and used.


I modified the script a little for anyone who wants to try this. Adds peer IDs and colorizes the IDs for the two nodes we're spawning in red/blue.

// npm i async chalk ipfsd-ctl go-ipfs-dep
const auto = require('async/auto')
const IPFSFactory = require('ipfsd-ctl')
const chalk = require('chalk')
const factory = IPFSFactory.create()

console.log('spawning 2 nodes...')

auto({
  ipfsd0: (cb) => factory.spawn(cb),
  ipfsd1: (cb) => factory.spawn(cb),
  id0: ['ipfsd0', ({ ipfsd0 }, cb) => ipfsd0.api.id(cb)],
  id1: ['ipfsd1', ({ ipfsd1 }, cb) => ipfsd1.api.id(cb)]
}, (err, res) => {
  if (err) throw err

  res.ipfsd0.peerId = res.id0
  res.ipfsd1.peerId = res.id1

  res.ipfsd0.color = chalk.red
  res.ipfsd1.color = chalk.blue

  console.log(`spawned node with ID ${chalk.red(res.ipfsd0.peerId.id)}`)
  console.log(`spawned node with ID ${chalk.blue(res.ipfsd1.peerId.id)}`)

  waitForPeer(res.ipfsd0, res.ipfsd1)
  waitForPeer(res.ipfsd1, res.ipfsd0)
})

function waitForPeer (ipfsd, otherIpfsd) {
  const coloredId = (id) => {
    if (id === ipfsd.peerId.id) return ipfsd.color(id)
    if (id === otherIpfsd.peerId.id) return otherIpfsd.color(id)
    return id
  }

  ipfsd.api.swarm.peers((err, peers) => {
    if (err) throw err

    console.log(`node ${coloredId(ipfsd.peerId.id)} has ${peers.length} peers:`)
    peers.forEach((p, i) => {
      console.log(`  ${i + 1}. ${coloredId(p.peer.toB58String())}`)
    })

    const found = peers.some(p => p.peer.toB58String() === otherIpfsd.peerId.id)

    if (!found) return setTimeout(() => waitForPeer(ipfsd, otherIpfsd), 1000)
    ipfsd.stop()
  })
}

Another data point:

If I run ipfs daemon in Windows (go-ipfs 0.4.15) with all the defaults it's fine - I get a bunch of swarm peers really quickly. The only difference I can tell between doing it like this and using ipfsd-ctl is that daemons are spawned with swarm address /ip4/127.0.0.1/tcp/0. i.e. pick a port. Could that be an issue?

@alanshaw port zero shouldnt make a difference. The other difference i see is that ipfsd-ctl doesnt add any bootstrap nodes. No bootstrap nodes means no connections. Unless you are explicitly adding bootstrap nodes, seeing no connections is the expected behaviour. The behavior on 0.4.13 is the weird one to me

I've logged out the response from ipfs.config.get('Bootstrap') for both nodes:

0.4.13

screen shot 2018-06-25 at 21 03 19

0.4.15

screen shot 2018-06-25 at 21 01 49

I'm going to explicitly start the nodes with those bootstrap servers just incase ipfs.config.get is giving me back some defaults because nothing is set...will report back.

Explicitly adding those bootstrap addrs doesn't change anything.

If I set Bootstrap: [] then I see 0.4.13 behave exactly the same as 0.4.15.

Ok I think I have found the issue:

On Windows in 0.4.14 and 0.4.15 if you set Address.API to _any port other than 5001, including 0_, then the node will _never_ get any peers.

If you set Address.API: '/ip4/127.0.0.1/tcp/5001' then the peer with that API address _will_ acquire peers.

ipfsd-ctl sets this to /ip4/127.0.0.1/tcp/0 (https://github.com/ipfs/js-ipfsd-ctl/blob/master/src/defaults/config.json#L16) which is why nodes that it spawns never get any peers.

Try logging the commands on the IPFS daemon, running on both 5001 and on a different port. I'm guessing you'll see something run a command on the 5001 daemon but not the other. There's probably some test tool using the default port.

That is, try the command ipfs diag cmds.

Is there any workaround to get peers with API port different than 5001? Same issue on Linux.

@negamaxi this is probably not your issue.

@alanshaw have you made any progress on this?

Yes! I believe this is fixed in go-ipfs 0.4.16 馃帀

With 0.4.15 you can see the behaviour on windows if you take the following steps (I'm using the git bash so it's unix like):

  1. ipfs init
  2. vi ~/.ipfs/config and change Addresses config to:
    "Addresses": { "Swarm": [ "/ip4/127.0.0.1/tcp/0" ], "Announce": [], "NoAnnounce": [], "API": "/ip4/127.0.0.1/tcp/0", "Gateway": "/ip4/127.0.0.1/tcp/0" },
  3. ipfs daemon &
  4. ipfs swarm peers - no peers - ever!

By default ipfs init will give you a swarm address in your config like /ip4/0.0.0.0/tcp/4001 - if you change the config in the steps above to use 0.0.0.0 instead of 127.0.0.1 then it works.

If you do the exact same steps using go-ipfs 0.4.16 it works, so I believe this is fixed.

Ah. That makes more sense. It has nothing to do with the API port, right? Looks like https://github.com/libp2p/go-libp2p/blob/master/NEWS.md#dialing-and-source-addresses.

Closing as that issue has been fixed.

Was this page helpful?
0 / 5 - 0 ratings