I've had the following pattern happen a lot:
I think my first issue here was caused by opening a channel before lightningd synced to the last blockheight.
Maybe we could prevent lightning-cli from running certain commands if the blockheight isn't recent? Aren't blocks supposed to have a timestamp which can be compared to the system time? Or maybe prevent lightningd from running fundchannel, close or pay if a block has been found in the last 5 seconds?
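As a rough illustration of the timestamp idea, here is a minimal sketch. The function name and the two-hour threshold are hypothetical, not anything lightningd does; block spacing is irregular and block timestamps are only loosely accurate, so any threshold is a guess.

```python
import time
from typing import Optional


def tip_looks_recent(tip_timestamp: int,
                     now: Optional[float] = None,
                     max_age_seconds: int = 2 * 60 * 60) -> bool:
    """Return True if the chain tip's timestamp is close to the system clock.

    tip_timestamp is the Unix time from the header of the latest block we
    have processed. max_age_seconds is an arbitrary (hypothetical) threshold.
    """
    if now is None:
        now = time.time()
    return now - tip_timestamp <= max_age_seconds
```

A caller could refuse or delay commands while this returns False, at the cost of false alarms whenever the network simply has not found a block for a while.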
I agree. There are two cases:
We can easily spot the former, but the latter is harder. It's also not entirely clear what should be disallowed / delayed if we're behind. Usually things "Just Work".
related issues: #1652 #808
bitcoind's not caught up with the network.
Comparing fields in the output of getblockchaininfo, we can reduce the chance of thinking we are caught up when we really aren't: the headers field counts how many block headers bitcoind has, while the blocks field shows how many blocks it has actually processed. So if the two match, bitcoind is either caught up, or we are in the tiny time window in which it has just started and not yet synced the headers, which usually lasts only very briefly.
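The comparison could be sketched roughly as follows, assuming the JSON result of getblockchaininfo has already been parsed into a dict. The function name is hypothetical; only the blocks and headers fields come from bitcoind.

```python
def blocks_match_headers(info: dict) -> bool:
    """Return True when bitcoind has processed every header it knows about.

    info is the parsed result of bitcoind's getblockchaininfo call.
    A True result can still be wrong in the short window right after
    startup, before bitcoind has fetched newer headers from its peers.
    """
    return info["blocks"] >= info["headers"]
```

For example, `blocks_match_headers({"blocks": 90, "headers": 100})` is False, since ten known headers have not been processed yet.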
We might also combine this with #2697, since all of these are querying bitcoind before starting the RPC interface.
@cdecker #1074 also?
bitcoind's not caught up with the network.
Might it not be useful to be able to query Lightning-node peers about the blockheight they believe is the latest?
It's also not entirely clear what should be disallowed / delayed if we're behind. Usually things "Just Work".
Things that require agreement with some Lightning node are the ones that should be delayed:
- sendpay, because of the CLTV computations (the destination will reject payments if it sees its CLTV is less than current blockheight + its final_cltv). We do not need to delay pay, we just need to delay sendpay, but pay might want to also defer its getroute until sendpay is safe (so that it gets the latest routemap information: some of the blocks we have not seen yet may include a close of a channel).
- Incoming HTLCs (both forwards and those that terminate at our node) should be delayed. This is a potential point of attack especially for forwards. Suppose we have a node U that is behind, and two attack nodes A and B. B routes B->U->A, with HTLCs that have locktimes in the past, but still in the future of the node U. Once B<->U and U<->A have HTLCs irrevocably committed, A then update_fulfill_htlcs. U will then attempt to update_fulfill_htlc to B, but B delays while A completes the revoke_and_ack until the HTLC getting fulfilled is irrevocably committed. Then B drops the U<->B channel onchain and uses the timelock branch (which is in the past at this point, but was accepted by U because it was behind the blockchain). We should delay sending an outgoing HTLC from a forwarding attempt until we have caught up to the chain (and fail the incoming HTLC if the outgoing HTLC will have a locktime less than blockheight+1, to give us time to reclaim our outgoing HTLC), and also delay releasing the preimage if we are the payee.
- fundchannel (or maybe just fundchannel_start?), because of height disagreements.
Other commands do not seem to require agreement about the current blockheight. @thestick613 can you describe the problem with close and blockheight disagreements?
sendpay, because of the CLTV computations (the destination will reject payments if it sees its CLTV is less than current blockheight + its final_cltv). We do not need to delay pay, we just need to delay sendpay, but pay might want to also defer its getroute until when sendpay is safe (so that it gets the latest routemap information: some of the blocks we have not seen yet may include a close of a channel).
This could also be "fixed" by bumping the final CLTV up to what we think the current blockheight is, rather than using the blockheight we are currently synced to. So in case of us checking the headers in getblockchaininfo we could still send, despite lagging a bit behind. I can't think of a case where this could harm us, since our peer would just refuse if something happened to our channel, and worst case is that even our estimate is wrong and then the final node will continue to reject due to the CLTV delta being too small.
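A minimal sketch of that idea, with hypothetical names: take the higher of the height we have synced to and the headers count reported by getblockchaininfo, and add the final CLTV delta on top of it.

```python
def final_cltv_expiry(synced_height: int,
                      headers_height: int,
                      final_cltv_delta: int) -> int:
    """Compute the final hop's CLTV expiry from our best guess of the tip.

    synced_height:    height lightningd has actually processed.
    headers_height:   the headers field from getblockchaininfo, used as an
                      estimate of the real chain tip.
    final_cltv_delta: the delta the destination asks for.
    """
    # Never go below what we have already processed ourselves.
    estimated_tip = max(synced_height, headers_height)
    return estimated_tip + final_cltv_delta
```

If the estimate is still too low, the destination simply rejects the payment for having too small a CLTV delta, which is the worst case described above.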
- Incoming HTLCs (both forwards and those that terminate at our node) should be delayed. This is a potential point of attack especially for forwards. Suppose we have a node U that is behind, and two attack nodes A and B. B routes B->U->A, with HTLCs that have locktimes in the past, but still in the future of the node U. Once B<->U and U<->A have HTLCs irrevocably committed, A then update_fulfill_htlcs. U will then attempt to update_fulfill_htlc to B, but B delays while A completes the revoke_and_ack until the HTLC getting fulfilled is irrevocably committed. Then B drops the U<->B channel onchain and uses the timelock branch (which is in the past at this point, but was accepted by U because it was behind the blockchain). We should delay sending an outgoing HTLC from a forwarding attempt until we have caught up to the chain (and fail the incoming HTLC if the outgoing HTLC will have a locktime less than blockheight+1, to give us time to reclaim our outgoing HTLC), and also delay releasing the preimage if we are the payee.
Definitely agree, a simpler attack would basically be that B closed the BU channel, but we haven't seen it yet, so we'll happily forward payments through UA, and lose money (this is also the main issue that Stepan sees with hardware wallets auto-signing forwards).
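The forwarding rule proposed in the quoted comment could look roughly like this. It is only a sketch: the names are hypothetical, and how lightningd would actually decide that it is synced is left open.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "forward"  # safe to send the outgoing HTLC
    DELAY = "delay"      # hold the HTLC until we have caught up
    FAIL = "fail"        # fail the incoming HTLC back to the sender


def forward_decision(synced: bool,
                     blockheight: int,
                     outgoing_cltv_expiry: int) -> Action:
    """Decide what to do with an HTLC we have been asked to forward."""
    if not synced:
        # We cannot trust blockheight yet, so do not commit to anything.
        return Action.DELAY
    if outgoing_cltv_expiry < blockheight + 1:
        # The outgoing locktime leaves no time to reclaim the HTLC.
        return Action.FAIL
    return Action.FORWARD
```

The same `synced` gate would also cover the simpler attack, since a close of the BU channel that we have not seen yet means we are not caught up.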
fundchannel (or maybe just fundchannel_start?), because height disagreements.
Could you elaborate on this? I'm not sure how fundchannel or fundchannel_start would be impacted, other than us deciding at some point that we've seen 100s of blocks since we thought our peer funded, and basically giving up on them, forgetting about the channel.
bitcoind's not caught up with the network.
Comparing the output of getblockchaininfo we can reduce the chance of thinking we are caught up when we really aren't
What about the bool field initialblockdownload in getblockchaininfo? It could be more direct :)
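Something like the following sketch, assuming the getblockchaininfo result is already parsed into a dict. The function name is hypothetical, and treating a missing initialblockdownload field as False is an assumption made here.

```python
def bitcoind_caught_up(info: dict) -> bool:
    """Best-effort check that bitcoind is caught up with the network.

    info is the parsed result of bitcoind's getblockchaininfo call.
    """
    # bitcoind itself says it is still doing its initial sync.
    if info.get("initialblockdownload", False):
        return False
    # Otherwise require every known header to have been processed.
    return info["blocks"] >= info["headers"]
```

Checking both fields keeps the headers/blocks comparison as a fallback once the initial sync flag has cleared.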
Could you elaborate on this? I'm not sure how fundchannel or fundchannel_start would be impacted, other than us deciding at some point that we've seen 100s of blocks since we thought our peer funded, and basically giving up on them, forgetting about the channel.
Precisely that. We immediately forget after the forget depth, and never send funding_locked; this means the channel is not broadcast, so it is not used for forwards; and the other side might take a long time before it decides to make a payment that uses the channel. So both peers might remain connected for a good amount of time without the funding node mentioning the channel that has been forgotten by the fundee. The funding node never learns that the fundee has forgotten the channel, unless they get disconnected and the funder attempts to reestablish the channel.