Go-ethereum: Synchronisation failed: Dropping peer

Created on 31 Aug 2017 · 30 Comments · Source: ethereum/go-ethereum

System information

Geth version: 1.6.7
OS & Version: Linux (Ubuntu 17.04)

Actual behaviour

After several hours of running, geth fails to sync any further:

INFO [08-31|21:29:22] Imported new chain segment blocks=1 txs=87 mgas=6.720 elapsed=310.079ms mgasps=21.670 number=4224370 hash=a3660d…b7b5fb
INFO [08-31|21:29:41] Imported new chain segment blocks=1 txs=117 mgas=3.595 elapsed=467.968ms mgasps=7.683 number=4224371 hash=5ae6f1…edc294
INFO [08-31|21:30:10] Imported new chain segment blocks=1 txs=61 mgas=6.711 elapsed=175.206ms mgasps=38.301 number=4224372 hash=5abb29…2a0891
WARN [08-31|21:30:15] message loop peer=d8cb8306a528cf96 err=EOF
WARN [08-31|21:31:05] Synchronisation failed, dropping peer peer=11fdde20fc7831ef err="retrieved hash chain is invalid"
WARN [08-31|21:31:45] Synchronisation failed, dropping peer peer=c300581e16c7d233 err=timeout
WARN [08-31|21:32:30] Synchronisation failed, dropping peer peer=938199d61038ff42 err="retrieved hash chain is invalid"
WARN [08-31|21:35:01] Synchronisation failed, dropping peer peer=0fc5fe924314d328 err="retrieved hash chain is invalid"

After a client restart, geth manages to sync the latest blocks, but after a few hours the issue repeats.
I'm running geth with these flags:
--rpc --shh --maxpeers 100 --lightserv 90 --cache 2048

The issue appeared today. I had been running the client in the background for days (or even weeks) without any problems before.

mcgravier

👍4

All 30 comments

update:
Getting this when closing client:

ERROR[08-31|22:27:00] Failed to close database database=/home/adam/.ethereum/geth/chaindata err="leveldb/table: corruption on data-block (pos=1849286): checksum mismatch, want=0x66919f5e got=0x875cb029 [file=5152408.ldb]"

mcgravier on 31 Aug 2017

Same on ubuntu 16.04 LTS

geth 1.7.0-stable-6c6c7b2a

arbach on 22 Sep 2017

Have a look at this: https://github.com/ethereum/go-ethereum/issues/15001... probably related

wtfiwtz on 23 Sep 2017

Same on Ubuntu 16.04, with geth-linux-amd64-1.7.0-6c6c7b2a

It turned out that my problem was at block 4370000, so updating to 1.7.3 can solve the problem.
The reason is explained here: https://github.com/ethereum/go-ethereum/issues/15265

yi-ji on 26 Nov 2017

Ubuntu 17.10 (latest, fresh install)
geth 1.7.3-stable-4bb3c89d (latest)

Problem with "Synchronisation failed, dropping peer": I sync to the latest block, and 20-30 minutes later I get this error. For 5-10 minutes geth tries to reconnect, then I have to sync again, and I am 100-200 blocks behind because of the time it spent retrying. This error repeats every 20-30 minutes. When I don't get this error I have the latest block; when I do, it always leaves me 100-200 blocks behind because the node wasn't connected. It's annoying.

INFO [12-05|17:09:16] Imported new chain segment blocks=1 txs=119 mgas=6.713 elapsed=18.631s mgasps=0.360 number=4680097 hash=99e572…37a955
INFO [12-05|17:09:28] Imported new chain segment blocks=1 txs=160 mgas=6.712 elapsed=12.155s mgasps=0.552 number=4680098 hash=c6ca8f…20651f
WARN [12-05|17:09:28] Synchronisation failed, retrying err="block body download canceled (requested)"
WARN [12-05|17:09:38] Synchronisation failed, dropping peer peer=409d9b45abc2a211 err=timeout
WARN [12-05|17:09:55] Synchronisation failed, dropping peer peer=45a8a36e755912da err=timeout
WARN [12-05|17:10:04] Synchronisation failed, dropping peer peer=4ae3f639e2ada120 err=timeout
INFO [12-05|17:10:37] Imported new chain segment blocks=1 txs=97 mgas=6.727 elapsed=14.629s mgasps=0.460 number=4680099 hash=f897ce…5da529
INFO [12-05|17:10:56] Imported new chain segment blocks=1 txs=137 mgas=6.727 elapsed=18.696s mgasps=0.360 number=4680100 hash=28cb99…5c82e9
WARN [12-05|17:10:56] Synchronisation failed, retrying err="block body download canceled (requested)"
WARN [12-05|17:11:03] Synchronisation failed, dropping peer peer=45a8a36e755912da err=timeout
WARN [12-05|17:11:14] Synchronisation failed, dropping peer peer=6cce04b224a22d57 err=timeout
WARN [12-05|17:11:26] Synchronisation failed, retrying err="block body download canceled (requested)"
WARN [12-05|17:11:56] Synchronisation failed, dropping peer peer=8af1d7b7928e93bb err=timeout
WARN [12-05|17:12:14] Synchronisation failed, dropping peer peer=7ec5e61d504cce17 err=timeout
WARN [12-05|17:14:04] Synchronisation failed, dropping peer peer=b7a2cafcbbb6d497 err=timeout
WARN [12-05|17:14:46] Synchronisation failed, retrying err="block download canceled (requested)"
WARN [12-05|17:14:56] Synchronisation failed, dropping peer peer=4bfd5c539119fba6 err=timeout
WARN [12-05|17:15:04] Synchronisation failed, dropping peer peer=d87932c26878f725 err=timeout
WARN [12-05|17:15:12] Synchronisation failed, dropping peer peer=3b6c09a5391927d1 err=timeout
WARN [12-05|17:15:54] Synchronisation failed, dropping peer peer=0b78929ee2b1db7a err=timeout
WARN [12-05|17:16:38] Synchronisation failed, retrying err="block download canceled (requested)"
WARN [12-05|17:16:48] Synchronisation failed, dropping peer peer=500c559573dcb6df err=timeout
WARN [12-05|17:16:59] Synchronisation failed, dropping peer peer=d87932c26878f725 err=timeout
WARN [12-05|17:17:06] Synchronisation failed, dropping peer peer=25dcd2766622c1d8 err=timeout
WARN [12-05|17:17:26] Synchronisation failed, retrying err="block body download canceled (requested)"
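
When comparing reports like this one, it can help to tally which `err=` value dominates (timeouts versus invalid hash chains, for example). The following is a minimal sketch, not part of geth: the function names `split_entries` and `tally_sync_failures` are made up for illustration, and it assumes the log format shown in this thread (a level, a `[MM-DD|HH:MM:SS]` timestamp, a message, then `key=value` pairs). It also copes with pasted logs where several entries ended up on one line.

```python
import re
from collections import Counter

# Start of a geth log entry as seen in this thread, e.g. "WARN [12-05|17:09:38] ".
ENTRY_START = re.compile(r"(?:INFO|WARN|ERROR)\s*\[\d\d-\d\d\|\d\d:\d\d:\d\d\]\s*")
# key=value pairs; the value is either double-quoted or a bare token.
KEY_VALUE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def split_entries(text):
    """Split raw log text into entries, even if several share one line."""
    starts = [m.start() for m in ENTRY_START.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[s:e].strip() for s, e in zip(starts, ends)]


def tally_sync_failures(text):
    """Count 'Synchronisation failed' entries by (action, error)."""
    counts = Counter()
    for entry in split_entries(text):
        if "Synchronisation failed" not in entry:
            continue
        fields = {k: v.strip('"') for k, v in KEY_VALUE.findall(entry)}
        action = "dropping peer" if "dropping peer" in entry else "retrying"
        counts[(action, fields.get("err", "unknown"))] += 1
    return counts


def print_report(text):
    """Print one line per (action, error) pair, most frequent first."""
    for (action, err), n in tally_sync_failures(text).most_common():
        print(f"{n:4d}  {action:14s} {err}")
```

Calling `print_report(open("geth.log").read())` on a saved log (the file name is just an example) prints the most frequent failure reasons first.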

dimcoderx on 5 Dec 2017

👍9

Same problem, ubuntu 16.04

doexclusive on 10 Dec 2017

Same issue on Ubuntu 16.04. Is there a solution for this? Geth version 1.7.3 for me.

raahil190 on 14 Dec 2017

Same issue here Geth 1.7.3 on Windows 10

Spacefish on 17 Dec 2017

Same issue here: Ubuntu 16.04, version 1.7.3-stable.
Is there a solution for this?

MarkWh1te on 2 Jan 2018

Debian 8, Geth 1.7.3-stable, the same issue. It first occurred yesterday. Geth had been running continuously for about two weeks before the problem occurred.

maxvgi on 5 Jan 2018

Ubuntu 16.04, Geth 1.7.3-stable (commit 4bb3c89d44e372e6a9ab85a8be0c9345265c763a)

Same issue. If a peer has to be dropped because of a timeout, the whole blockchain sync freezes for ~1-2 minutes, messing up the whole process (as opposed to gently dropping one bad peer and downloading from others).

bogatyy on 5 Jan 2018

Ubuntu 16.04, running inside Docker on AWS with geth 1.7.3-stable, having the same problem. We have now moved our geth node to a different datacenter (not AWS) and syncing has been stable for the last few days. Are you maybe also running inside AWS when you hit this issue?

DZDomi on 6 Jan 2018

Ubuntu 16.04, geth 1.7.3-stable same issue. I'm running it on GCP, but it throws the same error on my Ubuntu 16.04 workstation at home.

userpasta on 9 Jan 2018

Debian 8 64-bit, geth 1.7.3-stable.
Same issue: I can't catch up to the latest block and am always about a hundred blocks behind.

greensea on 10 Jan 2018

Have same issue in Docker ethereum/client-go:v1.7.2

nikashitsa on 13 Jan 2018

Same issue on macOS High Sierra on Ropsten.

t=25f1f73de08fa4e5ebaac8cfe7b18fbfdba4598048543e01ca957f180ca98609
INFO [01-17|03:42:32] Finished upgrading chain index           type=cht
WARN [01-17|03:42:43] message loop                             peer=9e99e183b5c71d51 err=EOF
INFO [01-17|03:43:08] Block synchronisation started 
WARN [01-17|03:43:11] Ancestor below allowance                 peer=c2f9fdd74dd62c55 number=92421  hash=000000…000000 allowance=92421
WARN [01-17|03:43:11] Synchronisation failed, dropping peer    peer=c2f9fdd74dd62c55 err="retrieved ancestor is invalid"

agrcrobles on 17 Jan 2018

Same thing here, Ubuntu 16.04 geth 1.8.0-unstable go1.9.2
It is 100-200 blocks behind and only sometimes gets the actual highest block after restarting geth; even then it does not keep up and ends up stuck 100-200 blocks behind again.

Zumili on 17 Jan 2018

Same issue Geth 1.8 Ubuntu 16.04
It synced ~24M tries, then stopped. I restarted geth, but the state reset to zero. I had been waiting for three days. What a pain!

sleimana on 12 Mar 2018

Please look at this detailed description of the issue: https://github.com/ethereum/go-ethereum/issues/15001#issuecomment-370732526

wtfiwtz on 14 Mar 2018

Same issue on Geth 1.8.6. I get

geth[21609]: WARN [04-23|18:52:21] Synchronisation failed, retrying         err="block download canceled (requested)"
Apr 23 18:52:39 ip-xxx-xx-xx-xx geth[21609]: WARN [04-23|18:52:39] Synchronisation failed, dropping peer"

after which the blockchain falls out of sync by 10-100 blocks. This happens every time sync catches up.

Downgrading to Geth 1.8.3 solved my problem for 3 weeks, after which the same issue reappeared.

GeeeCoin on 23 Apr 2018

@wtfiwtz this is not explained by @karalabe's explanation in https://github.com/ethereum/go-ethereum/issues/15001#issuecomment-370732526. I would love to see an optimization that addresses the problem you succinctly laid out here: https://github.com/ethereum/go-ethereum/issues/14647#issuecomment-378203501. There has to be a suitable built-in alternative to hosting numerous independent nodes in order to have robustness. Either a node waits too long for a response or it drops a peer too quickly. I haven't dug into the code enough (nor am I a golang developer) to figure out which is the case.

There should be a solution that helps prevent settling into a degraded network state where most peers in a subgroup are all behind together. This echo-chamber situation should trigger a clearing of peers and a reboot from boot/static nodes. Maybe this is too difficult to implement, but it would go a very long way if possible. Perhaps it involves a nearest-neighbor analysis, which should be possible given that all your peers' connections are discoverable.

GeeeCoin on 18 May 2018

Experiencing the same issue on Ubuntu 18.04 LTS (Bionic Beaver) with a brand new SSD + 8GB RAM. I've been looping around block 1.3M for the last 48 hours (nearly 80 hours total sync time so far and after many other previous attempts) using: geth --syncmode "fast" --cache 2048. For whatever reason, I also can't sync Parity due to similar issues.

It's been nearly a year now since this issue was posted and there doesn't appear to be a working solution. I don't believe it can be safely ignored as a byproduct of inefficient HDDs. Very many Ubuntu/Windows users have reported similar cases with consumer-grade Laptops/PCs + SSDs.

Given the hardware centralization risks, security compromises, and UX nightmare this implies, shouldn't more dev. resources be allocated to find a solution? I've run Geth and Parity nodes in 2016/2017 and never faced this level of obstruction with syncing the chain. Please fix this.

aliensyntax on 28 May 2018

Having the same issue as OP. I have a DApp running in production. People pay, but I can't detect the payment because my geth stops syncing (with the Sync Failed error).
I have to check daily whether my geth is running; if not, simply restarting it makes it fine.
But this is frustrating and I am worried about how long this can go on. If I forget to check my geth node, I wake up to emails about canceled txs (due to a timeout in the DApp).

Is there any proper solution to this? It seems everyone gets stuck in this hell at some point.
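
The daily manual check described above could be replaced by a small watchdog that polls the node and flags a head block that has stopped advancing. This is only a sketch under assumptions, not a fix for the underlying problem: it assumes geth's HTTP RPC is reachable at `http://127.0.0.1:8545` (the usual default when `--rpc` is enabled), the 10-minute threshold is arbitrary, and the names `StallDetector` and `watch` are hypothetical. It uses the standard `eth_blockNumber` JSON-RPC method and leaves the reaction to a stall (alert, restart) as a hook.

```python
import json
import time
import urllib.request

RPC_URL = "http://127.0.0.1:8545"  # assumption: local geth with HTTP RPC enabled
STALL_SECONDS = 600                # assumption: 10 minutes without a new block is a stall


def fetch_block_number(url=RPC_URL):
    """Ask the node for its current head via the eth_blockNumber JSON-RPC method."""
    payload = json.dumps(
        {"jsonrpc": "2.0", "method": "eth_blockNumber", "params": [], "id": 1}
    ).encode()
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return int(json.load(response)["result"], 16)  # the result is a hex string


class StallDetector:
    """Tracks the head block and reports when it has not advanced for too long."""

    def __init__(self, stall_seconds=STALL_SECONDS):
        self.stall_seconds = stall_seconds
        self.last_block = None
        self.last_change = None

    def update(self, block_number, now):
        """Record an observation; return True if the head looks stalled."""
        if self.last_block is None or block_number > self.last_block:
            self.last_block = block_number
            self.last_change = now
            return False
        return now - self.last_change >= self.stall_seconds


def watch(poll_seconds=60):
    """Poll forever and print a message whenever the head looks stalled."""
    detector = StallDetector()
    while True:
        try:
            stalled = detector.update(fetch_block_number(), time.monotonic())
        except (OSError, ValueError, KeyError) as exc:
            print(f"RPC check failed: {exc}")
        else:
            if stalled:
                # Hook point: send an alert or restart the service here.
                print(f"head stuck at block {detector.last_block}")
        time.sleep(poll_seconds)
```

Calling `watch()` starts the loop. Note that this only detects a head that stops moving; a node that keeps importing blocks but stays 100-200 blocks behind would need a comparison against an independent reference height.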

prashantprabhakar on 31 May 2018

Also experiencing this issue (on Windows + SSD (!)). Has any progress been made on finding the cause of this issue?

Daz0k on 12 Jun 2018

Related to https://github.com/ethereum/go-ethereum/issues/16825 with temp solution

Atrides on 12 Jun 2018

It seems that the problem can be caused by memory corruption. In my case, after tuning RAM to more conservative settings and resyncing from scratch, everything finally started working fine.

mcgravier on 12 Jun 2018

@mcgravier what was the reasoning behind your conclusion that it was caused by memory corruption?
Could you also explain exactly what you changed when "tuning RAM to more conservative settings"?
It might help me and others.

mathieumagalhaes on 4 Jul 2018

@mathieumagalhaes

I have a PC with a Ryzen 7 processor and 3200 MHz memory. However, this particular processor is only guaranteed to work with 2666 MHz memory; anything beyond that is considered overclocking and may not be stable. I ran memtest for 24 hours and it reported memory errors, so I reduced the memory clock from 3200 MHz to 2666 MHz and all issues disappeared. I can now run a Geth node for an extended amount of time without hitting the issue.

mcgravier on 4 Jul 2018

I have this same issue with GCP. I have tried firewall settings (allowing UDP and TCP traffic), rebooting, changing RAM/CPUs, etc. I believe memory corruption would be unlikely, since I assume GCP gives me different hardware every time I reboot and/or change the amount of memory allocated to my VM.

Has anyone looked into firewall TCP session timeouts? I am wondering if there are some common firewall settings (either on the local firewall or on a peer's firewall) that cause connections to be dropped if they are inactive for a certain length of time (10 minutes on GCP), and whether it is possible that Geth would run into that in the course of normal operations. I am currently testing my theory on GCP by changing keepalive settings, but given the intermittent nature of this issue it seems like it would be difficult to be sure I am on the right track. I am wondering if anyone else has looked into this.
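
For readers unfamiliar with what "keepalive settings" control, here is a minimal socket-level illustration. It is an assumption-laden sketch and not how geth is configured: geth manages its own sockets, and the helper name `enable_keepalive` and the 60/15/4 values are made up for the example. The idea is simply that probes sent after less idle time than the firewall's inactivity timeout (10 minutes in the GCP case above) keep the session from being seen as idle. The tuning options are platform specific, so each one is guarded.

```python
import socket


def enable_keepalive(sock, idle=60, interval=15, probes=4):
    """Turn on TCP keepalive and, where the platform allows, tune its timing."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Seconds of idleness before the first probe (available on Linux).
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    # Seconds between successive probes.
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    # Number of unanswered probes before the connection is considered dead.
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock
```

With these example values, an otherwise silent connection sends its first probe after 60 seconds, well inside a 10-minute inactivity window.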

bbeeley on 8 Jul 2018

Sorry, closing this because the report isn't actionable. There is no single bug in geth that causes sync failures. We are aware that sync may sometimes fail for networking reasons.