Prerequisite: https://github.com/ExchangeUnion/xud/issues/573
This issue is about answering the question: What should we do if we xud identifies via #573 that it is not properly connected to LND?
Current opinions:
a) disconnect from all peers
Good: easy
Bad: doesn't differentiate between e.g. LND and Raiden
b) Throw an error through gRPC and revoke all trading pairs with swap protocol LND. Also disable all peerOrders with swap protocol LND. Same with Raiden.
Good: considers different swap clients (LND, Raiden)
Bad: Complicated
We shouldn't disconnect from peers, as losing connectivity to lnd only means that one currency is impacted, so I think the better route is to temporarily disable all affected trading pairs. That would update the handshake with our peers to let them know we aren't supporting that pair (which in turn would make them drop all our orders for that pair) and would fail any rpc calls to place an order (or execute a swap) for that trading pair. We would also want to drop orders for that trading pair locally.
Other opinions? @offerm @erkarl @moshababo ?
xud start when lnd or raiden fails to connectI verified that we actually do connect to peers on startup even without a connection to lnd, however we are advertising the trading pair that we don't have an lnd connection for and that needs to change.
how to denounce a pair / protocol
how to reannounce a pari / protocol
We already have the ability to add and remove trading pairs (see addPair and removePair in OrderBook.ts`) so we just need to call these methods, this part should be simple actually.
Great, anything that still needs to be defined?
Good to go? @sangaman
Dependency: https://github.com/ExchangeUnion/xud/pull/762
Depends on https://github.com/ExchangeUnion/xud/issues/884#issuecomment-482553162
We have logic that more or less covers b) I think. What did you have in mind for a gRPC error?
Right now when we encounter any lnd error (including an unexpected end to our invoice subscription) we will mark that lnd as disconnected and reject any swaps involving that currency as well as stop ourselves from requesting any swaps involving that currency. That logic hasn't been tested very much I don't think.
We're not fully revoking the affected trading pairs in that we don't tell peers to drop all their orders for those pairs, which I think is good because we wouldn't want that to happen if we're able to reconnect to lnd within a few seconds. Maybe we should set a timer after which point if we haven't made a successful call to lnd, we tell peers that we're drop support for it, then whenever it comes back online we tell them to reenable.
Maybe we should set a timer after which point if we haven't made a successful call to lnd, we tell peers that we're drop support for it, then whenever it comes back online we tell them to reenable.
Think this approach seems reasonable.
Great that we have most of it! As far as I understand we still have to handle:
OwnOrders with peers, triggered by an disconnected swap client (lnd/raiden). Agree to only do that after a certain timeout. Local environment issues can cause a couple of seconds of disconnection which shouldn't trigger this. Is 30s a reasonable timeout?I would use the regular OrderInvalidation package to invalidate all open OwnOrders of an affected pair, nothing new to tell peers to disable per swap client. When the swap client is available again, we simply broadcast all orders again. I know that's rather inefficient, but I believe events where swap clients get disconnected are rather exceptions and I wouldn't want to have another "disable all orders for this swap client" be handled by the peer. This will cause new issues when reenabling with oudated orders etc. I'd prefer the local client selecting all order to be invalidated and send the corresponding packet to the peer. We can think of making the OrderInvalidation packet multi-order capable to make this process more efficient.
Added testing this to the TODO's for xud-simulation tests:
https://github.com/ExchangeUnion/xud/blob/simulation-readme/test/simulation/README.md
SubscribeAlerts discussion: https://github.com/ExchangeUnion/xud/issues/864#issuecomment-482296578)Invalidation of affected OwnOrders with peers, triggered by an disconnected swap client (lnd/raiden). Agree to only do that after a certain timeout. Local environment issues can cause a couple of seconds of disconnection which shouldn't trigger this. Is 30s a reasonable timeout?
I was thinking 30s or a minute, either is fine with me. We already have the logic in place to invalidate all own orders for an affected pair, so that's no big deal.
Broadcasting all orders again is a new feature though. This would require us to track invalidated own orders separately within xud (currently we just get rid of them). It also means that we could have a lot of re-validated orders match immediately - would the re-validated orders become the takers in that case? It's certainly doable, but I'm thinking that it might just be better to expect exchanges to resend their orders to xud when a trading pair comes back online.
Broadcasting all orders again is a new feature though. This would require us to track invalidated own orders separately within xud (currently we just get rid of them)
Yes, I believe we need to take the orders out of the order book and store them somewhere else.
It also means that we could have a lot of re-validated orders match immediately - would the re-validated orders become the takers in that case? It's certainly doable, but I'm thinking that it might just be better to expect exchanges to resend their orders to xud when a trading pair comes back online.
Exchanges in no-matching mode track orders on their own. Any other node in matching mode would permanently loose all these orders - not good. Also, how would the exchange re-sending orders to xud and xud in turn immediately sending it to the order book be any different from xud doing that directly? Exchanges would have to implement a separate logic to handle these cases. Let's do it for them!
would the re-validated orders become the takers in that case?
Good point, this has to be avoided. The taker/maker side should never change before and after the swap client disconnect event. As a matter of fact, all own orders residing in the order book are maker orders and that's how the situation should be after the disconnect event.
An idea how to do this:
Swap client LND disconnected. Caching own orders for trading pairs X, Y, Z and temporarily invalidating these on the network until swap client LND is back.OwnOrders out of order book, store them somewhere else in the same order they were in the order book priority queue. Drop PeerOrders for the pair in the same step. This has to be "blocking" and from start of the procedure we don't accept and attempt any swaps for that pair.Successfully cached & invalidated own orders for trading pairs X, Y, Z on the network. Waiting for swap client LND.Swap client LND disconnected error for PlaceOrder on pairs X, Y, Z in the meanwhile. Drop incoming peer order for pairs X, Y, Z.Swap client LND back, reinserting own orders for trading pairs X, Y, Z into order book.Successfully reinserted own orders for trading pairs X, Y, Z into order book.OK, I think that can be done. But the way you described it there could be orders that overlap in price but don't "match" because they are both treated as makers, but I guess that's the desired outcome in this case?
All own orders in the order book are maker orders. So that's what these were before and that's what they should be after swap client disconnect.
there could be orders that overlap in price but don't "match" because they are both treated as makers
don't see how this can happen, then they should have matched before already. Note: as I described it, we don't allow any old or new peer orders or new own orders (place orders) while swap client is disconnected.
but I guess that's the desired outcome in this case?
Anyways, yes that's the desired outcome. Still would be interested in how a price overlap of two own orders which were like this in the order book before can occur.
don't see how this can happen, then they should have matched before already. Note: as I described it, we don't allow any old or new peer orders or new own orders (place orders) while swap client is disconnected.
When we request the orders from peers that we missed while we weren't accepting orders, we could get orders that match the orders we put aside. But they will both be "maker" orders in this case.
But they will both be "maker" orders in this case.
Now I understand what you mean. Yes there can be a price match but there won't be a match/swap triggered, because OwnOrder maker orders are already in the order book before we start accept peer orders again. So all good I guess?
Yup I think so.
Dependent on @erkarl 's SubscribeAlerts call to be available: https://github.com/ExchangeUnion/xud/issues/864#issuecomment-482296578
Ensuring lnd and raiden are up can be done by docker, above dependency SubscribeAlerts moved to post-1.0.0, this one should be one in conjunction with https://github.com/ExchangeUnion/xud/issues/891 in post-1.0.0.