Go-ethereum: geth not connecting to bootnodes on private test network

Created on 19 Aug 2015 · 18 Comments · Source: ethereum/go-ethereum

Hi, I have noticed that for geth on a private net, setting bootnodes works fine at first, but eventually (a few hours later) setting bootnodes no longer does anything. Peers added manually can still sync fine, though.

I see the same issue with bootnodes running on both AWS and DO instances. Perhaps it's a byproduct of a network that is too small?

All 18 comments

We are running our private net on GCE and I have similar issues too. I am running geth 1.0.1. My private net had been running fine, but I just checked: my mining node is at a different block and my other, non-mining node has stopped syncing. It ran for a few days without issue.

After killing the nodes and restarting them, block synchronization resumed and now both nodes are synced up.

@ckeenan Please explain what you mean by "setting bootnodes". It is important to note that geth does not maintain a persistent connection to the bootstrap nodes. They are only used as the first point of contact for the node discovery protocol.

@fjl geth --bootnodes="enode://..." .... Discovery indeed goes fine when shutting down & starting that client on occasion (testing out the problem) for the first hour or so with the bootnodes parameter set. Eventually a shutdown and startup will not connect to the bootnode anymore even though the parameter is set. Any ideas?

Having this same problem until I add the peer via admin.addPeer. Here's what I'm putting into geth

geth --port 30304 --rpc --rpcaddr 127.0.0.1 --rpcport 8101 --rpccorsdomain http://127.0.0.1:8000 --genesis config/development/genesis.json --datadir datadir/development/ --networkid 5473 --bootnodes="enode://3ede10cd6a5e382db43804f3267c5a6eab6021e245b6eb28d1d9b11720638490e876ce5e61bd76992befa89935c5a34f139df7dda82b00d5ea09da6f96f77839@40.117.36.75:30303" --unlock 42c35b6c220d570bd8898a540406e0b026479f7b --password config/development/password console

I'm able to work around it with

admin.addPeer('enode://3ede10cd6a5e382db43804f3267c5a6eab6021e245b6eb28d1d9b11720638490e876ce5e61bd76992befa89935c5a34f139df7dda82b00d5ea09da6f96f77839@40.117.36.75:30303')

Geth
Version: 1.3.3
Protocol Versions: [63 62 61]
Network Id: 1
Go Version: go1.5.3
OS: darwin
GOPATH=
GOROOT=/usr/local/Cellar/go/1.5.3/libexec
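
A malformed enode URL is one thing worth ruling out before digging into discovery itself, for example a stray typographic quote pasted from a rich-text editor or a truncated node ID. The following is a minimal Python sketch of such a check, assuming the common form enode://<128 hex characters>@<IPv4>:<port> with an optional ?discport= suffix. The helper names parse_enode and is_valid_enode are made up for illustration and are not part of geth.

```python
import re

# Assumed shape: enode://<128 hex chars>@<IPv4>:<port>[?discport=<port>]
ENODE_RE = re.compile(
    r"enode://(?P<node_id>[0-9a-fA-F]{128})"
    r"@(?P<ip>[0-9]{1,3}(?:\.[0-9]{1,3}){3})"
    r":(?P<port>[0-9]{1,5})"
    r"(?:\?discport=(?P<discport>[0-9]{1,5}))?"
)


def parse_enode(url):
    """Return (node_id, ip, tcp_port) or raise ValueError (hypothetical helper)."""
    match = ENODE_RE.fullmatch(url)
    if match is None:
        raise ValueError("not a well-formed enode URL: %r" % (url,))
    ip = match.group("ip")
    if any(int(octet) > 255 for octet in ip.split(".")):
        raise ValueError("invalid IPv4 address: %s" % ip)
    port = int(match.group("port"))
    if not 0 < port <= 65535:
        raise ValueError("invalid TCP port: %d" % port)
    return match.group("node_id").lower(), ip, port


def is_valid_enode(url):
    """True if parse_enode accepts the URL, False otherwise."""
    try:
        parse_enode(url)
    except ValueError:
        return False
    return True
```

The check is purely syntactic: it says nothing about whether the node is reachable or whether its clock is in sync.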

@aakilfernandes the clock on one of the nodes could be out of sync. Try sudo ntpdate -s time.nist.gov. Or see the Frontier Guide Connecting to the Network: Common Problems With Connectivity

@ckeenan that was it. Thanks!

Actually @ckeenan, and all interested: it looks like the problem is back, unfortunately. I think the issue is that bootnodes aren't automatically used as peers. When the problem went away, there was a third node in the mix. So I think what's happening is that geth asks the bootnodes for peers, and if the bootnode doesn't have any peers, geth doesn't connect to anything. However, geth should use the bootnode itself as a peer if available.

Just a random thought: does it do this to avoid overloading the bootnode (which may be an important, well-known public node)?

With verbosity=6 I noticed that a bootnode does not serve one private network specifically; instead, it serves all networks, including the testnet and the main chain. So eventually private nodes are not able to find each other via this bootnode.

Geth 1.6 has a feature that if the node cannot find any good peers for 30 seconds, it will try to connect to the bootnode itself. This should help short term. Long term we need to enable discovery v5 on the eth protocol for test/private networks, which should solve the issue properly.

@karalabe Well, the problem is that it just doesn't connect, even after 30 seconds.
@immartian Well, are you talking about the official bootnodes? I have this problem with my own nodes as bootnodes.

I can't connect to any peer using the --bootnodes flag, and I also tried a static-nodes.json file. No use: net.peerCount returns 0.
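
For reference when checking such a file, geth versions of this era expect static-nodes.json to be a plain JSON array of enode URL strings. Below is a small Python sketch that renders and writes one. The location of the file inside the data directory varies between geth versions, so the path used here is an assumption, and the function names are hypothetical.

```python
import json
from pathlib import Path


def render_static_nodes(enode_urls):
    """Render the JSON text for static-nodes.json: an array of enode URL strings."""
    return json.dumps(list(enode_urls), indent=2) + "\n"


def write_static_nodes(datadir, enode_urls):
    """Write static-nodes.json into datadir and return its path.

    Assumption: the file sits directly in the data directory. Some geth
    versions look in a subdirectory instead, so check the docs for yours.
    """
    path = Path(datadir) / "static-nodes.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_static_nodes(enode_urls))
    return path
```

A file that is not valid JSON, or that contains a malformed enode URL, is a common reason for static peers to be silently ignored.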

I'm having the same problem. Except that the first peer could actually connect to the boot node (both see each other as peers). I'm testing using docker, but I don't think this would add any additional problem to the mix.

Can someone please confirm whether this is an issue or a "feature" (that is, a well-known issue that for some reason is the expected behavior)? Thanks in advance.

@pablochacin Please describe your exact setup. We don't see any issues on either mainnet or testnets (which are for all intents and purposes "private" networks).

@karalabe Never mind. I left it running for some time (> 30 min) and eventually the peers found each other. Now I need to figure out how to be sure this has happened before I test anything (like whisper messaging). Any suggestion here would be appreciated.

As a side note, to my surprise, I found an external node (with a public IP) also listed as a peer. How could this happen? I'm using network id 15.
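
One way to be sure the peers have found each other before starting a test is to poll the node's peer count until it reaches a threshold. Here is a minimal Python sketch using the standard net_peerCount JSON-RPC method over HTTP. It assumes the node was started with its HTTP-RPC interface enabled and reachable at the URL below (the port shown is an assumption). The function names, timeout, and polling interval are arbitrary choices for illustration.

```python
import json
import time
import urllib.request


def parse_peer_count(result):
    """net_peerCount returns a hex string such as "0x2"; convert it to an int."""
    return int(result, 16)


def peer_count(url="http://127.0.0.1:8545"):
    """Ask a node for its current peer count (the URL is an assumption)."""
    payload = json.dumps(
        {"jsonrpc": "2.0", "method": "net_peerCount", "params": [], "id": 1}
    ).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return parse_peer_count(json.load(response)["result"])


def wait_for_peers(min_peers=1, timeout=3600, interval=10, get_count=peer_count):
    """Poll until at least min_peers are connected; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if get_count() >= min_peers:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A test script could call wait_for_peers(min_peers=2) first and abort if it returns False. The sketch does not handle a node that is not yet accepting connections; in that case urlopen raises an exception.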

172.17.0.2:50294 <- docker container (bootnode)
178.238.233.123:30303 <- external node
172.17.0.4:59714 <- docker container (peer)
172.17.0.5:30303 <- docker container (peer)

--nodiscover
Use this to make sure that your node is not discoverable by people who do not manually add you. Otherwise, there is a chance that your node may be inadvertently added to a stranger's node if they have the same genesis file and network id.

Looks like adding nodes manually is the best way to go, both to ensure they are deterministically added and to prevent the issue of spooky nodes.

@pablochacin Unless you add the --nodiscover flag when you launch your geth instance, any other node using the same genesis file and network ID can peer with the nodes on your private testnet. Since 15 isn't a large, random number, I'd say that there are some folks out there that are probably using that network ID and likely have the same settings configured in their custom genesis file.

Edit: Sorry, didn't see your last comment, but yeah. XD
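
To make an accidental clash with someone else's network ID less likely, the ID can be drawn at random from a large range instead of picking a small number such as 15. A short Python sketch follows; the bounds are arbitrary assumptions (the upper bound is kept below 2**31 to stay conservative), and random_network_id is a hypothetical helper name.

```python
import secrets


def random_network_id(low=10_000, high=2**31 - 1):
    """Return a random integer in [low, high] to pass to --networkid."""
    return low + secrets.randbelow(high - low + 1)


if __name__ == "__main__":
    print(random_network_id())
```

A random ID only reduces the chance of a collision; it does not stop strangers from discovering the node, which is what --nodiscover is for.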
