Lnd: synced_to_chain flips to false after initial sync w/ neutrino

Created on 23 Dec 2017  ยท  23Comments  ยท  Source: lightningnetwork/lnd

Shortly after running lnd for the first time, completing initial sync with neutrino, and connecting to a peer, the synced_to_chain value from getinfo shows false and I'm unable to accept incoming openchannel requests. I was also able to observe that synced_to_chain was actually true for a while before switching to false. If I restart lnd, synced_to_chain immediately shows true and I'm able to accept incoming channels without issue.

Here are two getinfo calls performed minutes apart from each other. Between the two calls I had performed connect and openchannel from another node to this one. It's worth noting that the block height increased by 1, I'd imagine it's possible that after initial neutrino sync a new block could erroneously set synced_to_chain to false?

mcnallydp@instance-4:~$ lncli getinfo
{
    "identity_pubkey": "02f4f1bd135d049b420f3283a896135db8a89d57800d141db249fde01ae55a6993",
    "alias": "",
    "num_pending_channels": 0,
    "num_active_channels": 0,
    "num_peers": 1,
    "block_height": 1255728,
    "block_hash": "000000004f3e3486cac07bf46a72021295d15bbfc850bce3b9d559ada6bdf527",
    "synced_to_chain": true,
    "testnet": true,
    "chains": [
        "bitcoin"
    ]
}
mcnallydp@instance-4:~$ lncli getinfo
{
    "identity_pubkey": "02f4f1bd135d049b420f3283a896135db8a89d57800d141db249fde01ae55a6993",
    "alias": "",
    "num_pending_channels": 0,
    "num_active_channels": 0,
    "num_peers": 2,
    "block_height": 1255729,
    "block_hash": "00000000311a1d36f0946c071083d35a08862d5bdd919a89759ef0a196208434",
    "synced_to_chain": false,
    "testnet": true,
    "chains": [
        "bitcoin"
    ]
}

Here's the error I got from my other instance when trying to open a channel.

mcnallydp@instance-1:~$ lncli openchannel 02f4f1bd135d049b420f3283a896135db8a89d57800d141db249fde01ae55a6993 100000 50000
[lncli] rpc error: code = Code(102) desc = Synchronizing blockchain

And lastly here is the relevant lnd output on the affected instance:

2017-12-23 07:14:40.014 [INF] LNWL: Done catching up block hashes
2017-12-23 07:14:40.014 [INF] LNWL: Finished rescan for 1 address (synced to block 000000004f3e3486cac07bf46a72021295d15bbfc850bce3b9d559ada6bdf527, height 1255728)
2017-12-23 07:14:40.014 [INF] CMGR: Server listening on [::]:9735
2017-12-23 07:14:40.158 [INF] DISC: Obtained 3 addrs to bootstrap network with
2017-12-23 07:14:40.260 [ERR] SRVR: unable to connect to 02293770106161698b573f575f4fe05bd89ffbc41972da43fbe9b7f1b978cca4dc@98.253.181.104:9735: EOF
2017-12-23 07:14:40.317 [ERR] SRVR: unable to connect to 038e1c0dee8a07504e9870e16e22162a8c3c51698ba051e5d1ca14b0f3cc2afee8@45.63.87.131:9735: EOF
2017-12-23 07:14:40.355 [ERR] SRVR: unable to connect to 02f88685f00151c172a83c6aa82ba145a6fa714ddf25914d21c0b2ad89d62abc16@52.18.61.50:9735: EOF
2017-12-23 07:14:55.159 [INF] DISC: Attempting to bootstrap with: Authenticated Channel Graph
2017-12-23 07:14:55.160 [INF] DISC: Attempting to bootstrap with: BOLT-0010 DNS Seed: [[nodes.lightning.directory soa.nodes.lightning.directory]]
2017-12-23 07:14:55.302 [INF] DISC: Obtained 6 addrs to bootstrap network with
2017-12-23 07:14:55.330 [ERR] SRVR: unable to connect to 03c6250c815a1647175e33e12ab648bd5ef3fd14298ccee549ee0bedbd842d3bba@45.77.115.33:9735: EOF
2017-12-23 07:14:55.487 [ERR] SRVR: unable to connect to 03ecbc773fa731ace170d57cbba1874c2a8e8637f53ffd30a5990da2d1eccc1d0f@45.63.87.131:9735: EOF
2017-12-23 07:14:55.519 [ERR] SRVR: unable to connect to 02f88685f00151c172a83c6aa82ba145a6fa714ddf25914d21c0b2ad89d62abc16@52.18.61.50:9735: EOF
2017-12-23 07:14:55.520 [ERR] SRVR: unable to connect to 038a43a313bed89b858e95241fdfe9a884001f78c8db251a742ae72633b243ede3@52.18.61.50:9735: EOF
2017-12-23 07:14:55.520 [INF] SRVR: Established connection to: 136.144.149.95:9735
2017-12-23 07:14:55.546 [ERR] SRVR: unable to connect to 0209a91b81315b3d9abb2b0a02f34039e1f6e4129476fcfa59bac0f509fe5b29aa@195.181.243.196:9735: EOF
2017-12-23 07:15:10.203 [INF] DISC: Attempting to bootstrap with: Authenticated Channel Graph
2017-12-23 07:15:10.217 [INF] DISC: Attempting to bootstrap with: BOLT-0010 DNS Seed: [[nodes.lightning.directory soa.nodes.lightning.directory]]
2017-12-23 07:15:10.585 [INF] DISC: Obtained 4 addrs to bootstrap network with
2017-12-23 07:15:10.786 [ERR] SRVR: unable to connect to 033bc8e9c5f56677d6e6c7a0338908a9bfb9e47b2d7bce9c08ab3c25341023bcac@45.63.87.131:9735: EOF
2017-12-23 07:15:10.787 [ERR] SRVR: unable to connect to 039757e4466c57948cdbd8f5e9cbb05266ee421638cb4a82db5b8ae363fd4c3a3a@45.63.87.131:9735: EOF
2017-12-23 07:15:10.887 [ERR] SRVR: unable to connect to 02499de5a4d2b3bf9fa4b0d0705480eb81600502d19fcd3eb7c273bc1d107349c0@195.181.243.196:9735: EOF
2017-12-23 07:15:10.907 [ERR] SRVR: unable to connect to 0209a91b81315b3d9abb2b0a02f34039e1f6e4129476fcfa59bac0f509fe5b29aa@195.181.243.196:9735: EOF
2017-12-23 07:16:30.162 [INF] SRVR: New inbound connection from 35.185.13.214:53490
2017-12-23 07:16:34.172 [INF] BTCN: Processed 7721 blocks in the last 1m57.8s (height 1255729, 2017-12-23 07:16:08 +0000 UTC)
2017-12-23 07:17:10.403 [ERR] FNDG: Failing funding flow: (*lnwire.Error)(0xc430a0bc00)({
 ChanID: (lnwire.ChannelID) (len=32 cap=32) db2883c8fa1c6d8401fdead963c9301fbd6e67d287296993d381ac709a294406,
 Data: (lnwire.ErrorData) (len=1 cap=1) {
  00000000  02                                                |.|
 }
})

2017-12-23 07:17:10.404 [INF] SRVR: Attempting to send msgs 1 to: 023c1fc1ef2d9a555693cf0fdc39b66d934dfc07bc0b1ee050fd3f4aeb773b9438
2017-12-23 07:17:10.404 [INF] FNDG: Cancelling funding reservation for node_key=023c1fc1ef2d9a555693cf0fdc39b66d934dfc07bc0b1ee050fd3f4aeb773b9438, chan_id=db2883c8fa1c6d8401fdead963c9301fbd6e67d287296993d381ac709a294406

I've observed this happen twice, although a third time where I tried to recreate the steps the issue does not occur (unfortunately that was the one time I had profiling turned on). Both instances were debian stretch running the latest btcd/roasbeef and lnd, using go 1.8 and running with neutrino.

P2 backend

All 23 comments

This seems like an issue we should try to get to the bottom of, I cannot tell from the provided logs why that last block would not Could you enable debug logs in case this happens again? :)

I haven't tried any initial syncs until the past day or two. With neutrino sync again I'm seeing synced_to_chain false even after a complete sync. I have debug logs enabled. Here's some of what I"m seeing.

2018-02-06 04:51:21.270 [DBG] BTCN: Sending getheaders (locator 0000000000000478bc4f248e002c67f22490ffe322de52a7bc9bab99d2521231, stop 00000000000060b6521a3beab4677f169c33d3fb31a8dd1873876329e093e3b0) to 159.203.125.125:18333 (outbound)
2018-02-06 04:51:21.289 [DBG] BTCN: Received headers (num 1) from 159.203.125.125:18333 (outbound)
2018-02-06 04:51:21.289 [INF] BTCN: Processed 1 block in the last 1m33.76s (height 1278205, 2018-02-06 04:51:04 +0000 UTC)

1278205 is the most recent block. And yet synced_to_chain shows false. I'm also sitting at 0 peers, which I take it is because lnd thinks it's still syncing. I've tried restarting lnd several times and I'm using the latest from master.

{
    "identity_pubkey": "020c828d3b4390f5fa99d36abb2a8af57ff32ec2ec30cfbc841c487b4f0e41e9e0",
    "alias": "020c828d3b4390f5fa99",
    "num_pending_channels": 0,
    "num_active_channels": 0,
    "num_peers": 0,
    "block_height": 1278205,
    "block_hash": "00000000000060b6521a3beab4677f169c33d3fb31a8dd1873876329e093e3b0",
    "synced_to_chain": false,
    "testnet": true,
    "chains": [
        "bitcoin"
    ],
    "uris": [
        "020c828d3b4390f5fa99d36abb2a8af57ff32ec2ec30cfbc841c487b4f0e41e9e0@35.185.13.214:9735"
    ]
}

I just resynced this node by deleting block_headers.bin ext_filter_headers.bin reg_filter_headers.bin from my data folder. After a full sync synced_to_chain still showed false. After restarting lnd, synced_to_chain still showed false when I first checked it but a few seconds later flipped to true even though the block height remained the same.

I just reproduced this issue with another node, the steps are to start a full sync from scratch with neutrino. Although now restarting lnd isn't making synced_to_chain flip. The lnd output is the following lines repeated seemingly endlessly.

2018-02-06 07:16:44.824 [DBG] BTCN: Received cfilter from 159.203.125.125:18333 (outbound)
2018-02-06 07:16:44.967 [DBG] BTCN: Sending getcfilter to 159.203.125.125:18333 (outbound)

Edit: Now ~20 minutes later synced_to_chain does show true without any intervention on my part.

It happens on my MAC as well. The first run just went well. Then when I restart the nodes with btcd on simnet, all now says "synced_to_chain": false. I CANNOT open the channels now:

[lncli] rpc error: code = Code(102) desc = Synchronizing blockchain

Anyone know how to overcome this issue?

@moneta are you using the _latest_ versions of btcd and lnd? A recent protocol change has been made, so you may be trying to speak to a node that doesn't understand the same protocol.

Thank you roasbeef. Today I updated both the lnd and btcd with the latest update. Then I started both btcd and lnd. Both are up normally. Then when I use the lncli to unlock the wallet, it throws these messages:

lnd log:

lnd --rpclisten=localhost:10001 --listen=localhost:10011 --restlisten=localhost:8001

2018-04-21 15:07:45.976 [INF] LTND: Version 0.4.1-beta commit=435e68a4383d44a31e32d302c253c075f0fb525f
2018-04-21 15:07:45.976 [INF] LTND: Active chain: Bitcoin (network=simmnet)
2018-04-21 15:07:45.977 [INF] CHDB: Checking for schema update: latest_version=0, db_version=0
2018-04-21 15:07:46.011 [INF] RPCS: password RPC server listening on 127.0.0.1:10001
2018-04-21 15:07:46.012 [INF] RPCS: password gRPC proxy started at 127.0.0.1:8001
2018-04-21 15:07:46.012 [INF] LTND: Waiting for wallet encryption password. Use lncli create to create wallet, or lncli unlock to unlock already created wallet.
2018-04-21 15:07:53.420 [INF] LNWL: Opened wallet
2018-04-21 15:07:53.421 [INF] LTND: Primary chain is set to: bitcoin
2018-04-21 15:07:54.459 [INF] LNWL: Opened wallet
2018-04-21 15:07:55.558 [INF] LNWL: The wallet has been unlocked without a time limit
2018-04-21 15:07:55.558 [INF] LTND: LightningWallet opened
2018-04-21 15:07:55.564 [INF] HSWC: Restoring in-memory circuit state from disk
2018-04-21 15:07:55.564 [INF] HSWC: Payment circuits loaded: num_pending=0, num_open=0
2018-04-21 15:07:55.565 [INF] HSWC: Trimming open circuits for chan_id=727:1:0, start_htlc_id=1
2018-04-21 15:07:55.569 [INF] RPCS: RPC server listening on 127.0.0.1:10001
2018-04-21 15:07:55.570 [INF] RPCS: gRPC proxy started at 127.0.0.1:8001
2018-04-21 15:07:55.574 [WRN] LNWL: Unable to synchronize wallet to chain: -1: Block number out of range
2018-04-21 15:07:55.617 [INF] HSWC: Starting HTLC Switch
2018-04-21 15:07:55.617 [INF] NTFN: New block epoch subscription
2018-04-21 15:07:55.618 [INF] NTFN: New block epoch subscription
2018-04-21 15:07:55.619 [INF] UTXN: Processing outputs from missed blocks. Starting with blockHeight=732, to current blockHeight=726
2018-04-21 15:07:55.619 [INF] UTXN: UTXO Nursery is now fully synced
2018-04-21 15:07:55.620 [INF] CNCT: Creating ChannelArbitrators for 1 active channels
2018-04-21 15:07:55.620 [INF] NTFN: New block epoch subscription
2018-04-21 15:07:55.620 [INF] NTFN: New spend subscription: utxo=9867b7eab7f22791d5bfc5e6685386788f5aa750053afc090d2285d82bb46155:0
2018-04-21 15:07:55.621 [ERR] SRVR: unable to start server: -1: Block number out of range

-1: Block number out of range

btcd log:

btcd --txindex --simnet --rpcuser=kek --rpcpass=kek

2018-04-21 15:07:32.577 [INF] BTCD: Version 0.12.0-beta
2018-04-21 15:07:32.578 [INF] BTCD: Loading block database from '/Users/Vietlc/Library/Application Support/Btcd/data/simnet/blocks_ffldb'
2018-04-21 15:07:32.587 [INF] BTCD: Block database loaded
2018-04-21 15:07:32.595 [INF] INDX: Transaction index is enabled
2018-04-21 15:07:32.595 [INF] INDX: cf index is enabled
2018-04-21 15:07:32.597 [INF] CHAN: Loading block index. This might take a while...
2018-04-21 15:07:32.630 [INF] CHAN: Chain state (height 726, hash 48e1400dc56183c8dd2187016787b32018a6fbd779af1c309e4c4085104231fe, totaltx 730, work 1454)
2018-04-21 15:07:32.647 [INF] RPCS: RPC server listening on [::1]:18556
2018-04-21 15:07:32.647 [INF] RPCS: RPC server listening on 127.0.0.1:18556
2018-04-21 15:07:32.649 [INF] AMGR: Loaded 0 addresses from file '/Users/Vietlc/Library/Application Support/Btcd/data/simnet/peers.json'
2018-04-21 15:07:32.649 [INF] CMGR: Server listening on [::]:18555
2018-04-21 15:07:32.649 [INF] CMGR: Server listening on 0.0.0.0:18555
2018-04-21 15:07:54.502 [INF] RPCS: New websocket client [::1]:56175
2018-04-21 15:07:55.613 [INF] RPCS: New websocket client [::1]:56178
2018-04-21 15:07:55.647 [ERR] RPCS: Websocket receive error from [::1]:56175: websocket: close 1006 unexpected EOF
2018-04-21 15:07:55.647 [ERR] RPCS: Websocket receive error from [::1]:56178: websocket: close 1006 unexpected EOF
2018-04-21 15:07:55.647 [INF] RPCS: Disconnected websocket client [::1]:56175
2018-04-21 15:07:55.647 [INF] RPCS: Disconnected websocket client [::1]:56178

It seems like after the first run (all worked well with the first run), the previous data (such as pending channels, connections between users) breaks the daemons's processes

Doesn't look like your btcd is fully synced?

I am getting the same "-1 Block number out of range" when I follow the tutorial, going through the startup of processes a second time. I am not recreating user directories the second time and I am using (I think) the wallet created the first time. The tutorial is https://dev.lightning.community/tutorial/01-lncli/index.html

Doesn't look like your btcd is fully synced?

If I use simnet, is that still potentially the problem?

To expand:

  1. I am using simnet and btcd seems to be synched immediately
  2. I then startup lnd for Alice
  3. When I unlock the wallet using lncli (for Alice) her lnd emits this message.

As mentioned, everything went okay the first time I tried things but after stopping all the processes the above behavior is seen.

@jrmillerwork Also on simnet you will need to mine some blocks before it is considered synced.

I got similar unsync state with missing data from btcd side

somehow there are no cfheader for one block and we receive _warning_

2018-06-05 15:07:49.473 [WRN] PEER: Could not obtain CF hash for 000000002e975383169ddb99c61aa1d159a074cba81cb62d829664f856a30a3d

As we request range of blocks by getcfheaders, by missing one cfheader we exit this function and retrieve nothing by request (usually it is maximum of requested - 2000 headers) so they will not be synced in neutrino
https://github.com/btcsuite/btcd/blob/86fed781132ac890ee03e906e4ecd5d6fa180c64/server.go#L876-L895

schedule processing of cfheader on the next incoming message
https://github.com/lightninglabs/neutrino/blob/ea85047f01c7ba80d11538ede6d54e5eb9b2f91b/blockmanager.go#L1357-L1361

handleProcessCFHeadersMsg writes sequence of hashes to ext_filter_headers.bin or reg_filter_headers.bin, some set of header was skipped before, then we get next ones and enter to this section
https://github.com/lightninglabs/neutrino/blob/ea85047f01c7ba80d11538ede6d54e5eb9b2f91b/blockmanager.go#L1452-L1465
tries to read them from file and get EOF error and exit method, stop writing to cfheaders to file and sending notifications to subscribers (btcwallet for instance)

@vapopov: do you mean this is the issue with btcd? Is there any available fix or workaround for now? Thanks

@moneta could be, I would probably add workaround if we get the same issue again (re-generate cfheaders if we have no available in btcd side)

@sangaman you could try with this changes

I've written btcctl command to traverse trough blocks and return error in case of not valid status
https://github.com/LightningPeach/btcd/tree/validate_block

~/g/s/g/b/btcd โฏ btcctl processheight 999007 20                                                                                                                                            
-32603: block 0000000000047aa9e4626f7411f64c1b05fb58d82f421ba1f8e88bba8750ccbe is known to be invalid, status: statusDataStored|statusValidateFailed

@vapopov unfortunately I'm not running any nodes running with neutrino currently so I haven't witnessed this issue recently. If I have time I'll try syncing one up again and see if I can still reproduce and if your PR helps.

I have similar issue, my node (connected to bitcoind 0.16.1) randomly switch to synced_to_chain=false, even when I see the correct tip in lnd console. Tested on 0.4.2 release as well as current git master.

getinfo
{
"num_pending_channels": 0,
"num_active_channels": 3,
"num_peers": 5,
"block_height": 529366,
"block_hash": "00000000000000000018dfe2792de961e9575b6c8ae85c3cfe192fc2c6f623fc",
"synced_to_chain": false,
"testnet": false,
"chains": [
"bitcoin"
],
"best_header_timestamp": "1530038427",
"version": "0.4.2-beta commit=b8fecfca71507643e2ba3cc32b65c9de87790597"
}

Seems like the state is changed when block arrives, although "block_hash" is updated accordingly.

We're aware of the issues with the bitcoind backend, see this PR for our fix: https://github.com/btcsuite/btcwallet/pull/511

Is your bitcoind node local or remote @slush0?

Local node with single zmq client. Thanks for update, I'll wait until merge.

Was this page helpful?
0 / 5 - 0 ratings