Every time lightningd starts, it says this:
2018-04-07T14:43:44.998Z lightning_gossipd(32734): STATUS_FAIL_INTERNAL_ERROR: Announcement 0100be25012839603159d89806034f4b4ac40bcd05bf70a8753887d513820099f0ed13d9c44083a8ed1999f79b7abaa279d099ade31c78ffefdac5495a0fc652cd87e052e7f31261023db8953cb030444ed7124eb9b66d512d544b37f8246c86f6897fee91d0be2b588a081e4334d1976d90ee1e14edd2443368760f076aa16f12a7b112d5fa7e93f37edf0e65b359a00ebb0b8d105a907e4ebfabd62c10ac21333a542832137c0b58b2afe3317c46d517f4aac9b23697fe357cc7a95e914ac10961f2c549fbd8564da62d1e282a3342855e95be99c2709c4a8bbd7c769cd9c55d070788cf8887e5fff4cc967bd0b9c8935644b79955f122b476443e87e95ae5b60f00006fe28c0ab6f1b372c1a6a246ae63f74f931e8365e15a089c68d619000000000007c70100029800000371190acfb2e92bd1faa6ce4d12ff248798515a92ac903ac14d31b5172d9b2917037b7cd33aada9dd91ccb95857d8f71c34fd31c426057016f28f48b0633ae844cd028ad3936869eb28a6296ee18611621937b895e4b696bfa62e430e3746ea06ae5b023cdaad34e871ab8f3dbbc5ad9a806ae56d27937dfa3903634312e84970ce7bd2 was replaced?
This happened first on 2018-04-06T09:12:50.115Z, with the following at the top of the git log file. By the way, why is my name there? And why does it make me edit a file every time I issue the command "git pull origin master"? Who wrote GIT, a bunch of upside down people? It is very confusing. I understand version control perfectly, but the words GIT uses are backwards and confusing.
commit 35b0d1f28eaaa7c195cfd4e5aa637e1e1b5b907c
Merge: fc54a8e 9ab28d1
Author: Ulmo
Date: Fri Apr 6 09:07:49 2018 +0000
Merge branch 'master' of https://github.com/ElementsProject/lightning
commit 9ab28d1659359eb38c8e508de95c5473eb6d8227
Author: Christian Decker decker.christian@gmail.com
Date: Thu Apr 5 18:09:46 2018 +0200
topology: Two off-by-one error when catching up with the blockchain
There are two very hard problems in software engineering:
1. Off-by-one errors
In this case we were rolling back further than needed and we were starting the
catchup one block further than expected.
Signed-off-by: Christian Decker <[email protected]>
getinfo outputHere's the start of the log output:
lightningd v0.5.2-2016-11-21-2496-g35b0d1f
I have the same bug now, with latest git merge, with this:
"version" : "v0.5.2-2016-11-21-2501-g09fdab8"
While looking at that, the block height was 191108, 191127, 191139, 191147, 191165, 191173, then crashed. My other node shows block height 517074. The latter seems more correct.
The prior working version I used had latest log entries of:
commit fc54a8e82b806ca47518b2f5cd32d1a3e5b3d857
Merge: 117fc91 e41d565
Author: Ulmo
Date: Sat Mar 31 00:21:35 2018 +0000
Merge branch 'master' of https://github.com/ElementsProject/lightning
commit e41d5651031c7e23caf9db8c33b14208eb3c41da
Author: Christian Decker decker.christian@gmail.com
Date: Wed Mar 28 13:02:09 2018 +0200
pytest: Add a test for the gossip perstistence
Signed-off-by: Christian Decker <[email protected]>
Apparently this is duplicate of #1334. He called it duplicate, but I called it by the name of the crash error: "Announcement ... was replaced?"
So, what logs can I look up?
I copied .lightning (including gossip_store) and deleted gossip_store to restart, since I want my node up.
Now, lightningd is running, but block height is wrong.
Here's the bitcoind block height: "blocks": 517077,
lightningd starts with the first block added in the logfile being block 188112.
Here's my lightningd: "blockheight" : 188123
Then some minutes later: "blockheight" : 188621,
Wait, is it starting fresh from an early block whence Lightning Network was already up and running long ago? Shouldn't it restart with a later blockheight?
Anyway, now I checked again, and it's at 190512. It will get to 517077 eventually, I suppose. Uh, no: the latest block it checked, 191205, was "Timestamp | 2012-07-28 13:16:28"; that would take forever to catch up. Why is lightningd starting way back in the 188 thousands?
Repeated restarts can result in the rescan moving backwards on each start. You can reset it to something reasonable by doing the following:
sqlite3 $HOME/.lightning/lightningd.sqlite3 "UPDATE vars SET val= 516000 WHERE name='last_processed_block';"
That exact shell command with the UPDATE statement included worked (I just copy pasted it after a STOP command).
It would be lovely if block start point:
A simple hard-coded switch statement should suffice, with some comments about the rigorousness of the start block number, so that someone who knows the first manual test transaction can go ahead and update the code to explicitly specify that block number.
If someone told me those block #s and the right variables to check and the right subroutine to do the check in, I'd be happy to do the code, but then again, that's like 10 lines of code, so ... not that hard.
Shit. Getting;
2018-04-09T03:41:22.067Z lightning_gossipd(9296): STATUS_FAIL_INTERNAL_ERROR: Announcement 0100b5a73369bda6f1189f7495498ce64a6d10ec420dcea8dd8728f034abffc1dcbf7aea571115454af1d554e5f6f78843d6b4b4ea5ef78b7275242b5f04b7306711a352d75524af4c75206016ce2c90416073670fbe1dea6733756f55a25374098d6c94e440d05482463d3f4c8babdcccbcc58851479456d5c0f892e1b1aff4fff4a128a693ea891dc932b2eb45dff20bba207f56b0521dc7171780aa719ff867b13a755feabb455216039980122fa7518e88da7cde6deff1d7da8eb9f7ff14bb940a0fe5da9680f4e882efdf1a463612a419021844091cbb285a8863254dd6d29722b62fe182e7bbb23b979d92d46afa09f4a24a6a4e3842ecc586d561c97c2ebc00006fe28c0ab6f1b372c1a6a246ae63f74f931e8365e15a089c68d619000000000007d5470000fb0000034fd0be2b82f609f30593d2c5b7f27eb33c5e84f660446e4bf1501c838ce052d403cbf298b068300be33f06c947b9d3f00a0f0e8089da3233f5db37e81d3a596fe103ca9a9a722c59f846c58b2e2e81ae3f34bcdea340cb2c662044e52846e24a579303c031552ccff16445d669f6bc33dfc6b7d45997987613b1dcecfa1f1d3cbea15a was replaced?
Ran the sqlite3 command.
Pulled and compiled master on Apr 8th
@Xian001, as workaround, for now please remove gossip_store file in lightningdir, that should let your node recover.
Just did that, and seems to be loading up fine now. Thanks !
node_id_1=0371190acfb2e92bd1faa6ce4d12ff248798515a92ac903ac14d31b5172d9b2917
node_id_2=037b7cd33aada9dd91ccb95857d8f71c34fd31c426057016f28f48b0633ae844cd
Are either of these your own node? We're trying to figure this one out: we should never try to put duplicate announcements into the broadcast map, which is why this error occurred.
@rustyrussell not sure if you're asking myself, or OP, but neither of those two nodes are my own.
BTW @Ulmo you can probably blow away that old merge by resetting your git tree using git reset --hard origin/master.
@Xian001 Sorry, was asking OP ( @Ulmo ). For you the question is:
node_id_1=034fd0be2b82f609f30593d2c5b7f27eb33c5e84f660446e4bf1501c838ce052d4
node_id_2=03cbf298b068300be33f06c947b9d3f00a0f0e8089da3233f5db37e81d3a596fe1
Are either of these you?
@rustyrussell Oops. Apologies for muddying the waters. I can confirm neither of those two nodes belong to me.
PR #1352 was just merged, which should have addressed this issue. Could you update and check if this has been fixed?
Inactive for 15 days, as far as we know the issue has been fixed, closing...
Most helpful comment
@Xian001, as workaround, for now please remove
gossip_storefile in lightningdir, that should let your node recover.