Xud: xucli connect to a peer which banned me shows success (but connect failed)

Created on 6 Dec 2018  路  12Comments  路  Source: ExchangeUnion/xud

vagrant@xud-simnet-1:~$ xucli connect 030b20e28e42b95677587c7b2d4a7f2aeaf886491ef70648191ce57d55012e4b58@192.168.33.11:8885
success

vagrant@xud-simnet-2:~$ xucli connect 037c514ee25ebbfc10726442325bdf3210b6ee80a9b3618aa96788a06b01696535@192.168.33.10:8885
Error: 9 FAILED_PRECONDITION: could not connect to node 037c514ee25ebbfc10726442325bdf3210b6ee80a9b3618aa96788a06b01696535 because it is banned

P2 bug p2p

Most helpful comment

@rsercano it looks to me that if this line will throw, we're not ignoring it but failing Peer.open altogether, and that should propagate to Service.connect.

The case here is probably that the closing of connection happens after Peer.open finished successfully, because it isn't synchronous from simnet-1 perspective. A way to fix this is to terminate the connection in the handshake phase, which isn't what we're doing in the moment.
Right now the handshake process is non-interactive, so it couldn't be done, but I changed it (not merged yet) to be interactive (connection initiator begins, and awaits response), so we can hook into that if we want to apply validations in the inbound peer side, so that they will fail the handshake.

I'm not fully sure it's necessary, because it's a feature which is relying on the peer behavior, and if he decided to close the connection not in the handshake but 0.1 sec after, that doesn't mean we have a bug, only that the synchronous response to connect isn't reliable without further async updates.

So other solution is to make connect a stream method, which will subscribe to future updates on the specific peer, if the initial connection succeeded.

@kilrau

All 12 comments

simnet-2 banned simnet-1, but connection attempt from 1 still shows "success". It should show an error message.

This seems to be relevant with ignoring promise rejection here: https://github.com/ExchangeUnion/xud/blob/57c48de3fbbacc564186e132afe05d4f333c43fc/lib/p2p/Peer.ts#L179

During startup it's okay to ignore rejection but I guess if connect comes from cli we need to throw error.

@moshababo Could you please confirm that ?

@rsercano it looks to me that if this line will throw, we're not ignoring it but failing Peer.open altogether, and that should propagate to Service.connect.

The case here is probably that the closing of connection happens after Peer.open finished successfully, because it isn't synchronous from simnet-1 perspective. A way to fix this is to terminate the connection in the handshake phase, which isn't what we're doing in the moment.
Right now the handshake process is non-interactive, so it couldn't be done, but I changed it (not merged yet) to be interactive (connection initiator begins, and awaits response), so we can hook into that if we want to apply validations in the inbound peer side, so that they will fail the handshake.

I'm not fully sure it's necessary, because it's a feature which is relying on the peer behavior, and if he decided to close the connection not in the handshake but 0.1 sec after, that doesn't mean we have a bug, only that the synchronous response to connect isn't reliable without further async updates.

So other solution is to make connect a stream method, which will subscribe to future updates on the specific peer, if the initial connection succeeded.

@kilrau

Yeah that line doesn't ignore rejection, await will reject on any exceptions/failures.

We return a successful response to the connect rpc call as soon as we get a valid Hello packet from the peer, but if the peer closes the connection right after that (maybe they don't like the Hello packet that we send them) then the connection will be closed immediately after. I believe this would happen, for instance, if we connect to a peer that has banned us.

I'm not sure how making the handshake process interactive would solve that. We could send a ping packet and wait to get a pong before we respond to the rpc call maybe?

Any approach which way to go? @rsercano

@sangaman
Peer.ackSession is called when performing handshake, and that's where we set Peer.nodeState, which has the information we need for knowing whether the peer is valid or not.

At the moment Peer.ackSession is just doing authentication validation and ECDH before sending SessionAck, but we can move all the peer validations into there (currently they are in Peer.open after handshake, and in Pool.handleOpen), so the default behavior would be that if a peer isn't happy about something, he will close the connection before sending SessionAck and handshake will fail.

Yes that makes sense, we should make all checks about the validity of the peer (including whether it's banned) before responding with SessionAck. @rsercano - does that make sense to you?

Do you also need help here? @reliveyy

I fount when peer1 banned peer2 then peer2 tring to connect to peer1 xucli connect will show success because responseObj is empty. But on peer2's logs there is one line

received disconnecting packet from 02ed4e79071f524bf9ceaab4f40e90f4716632dd59e995500b8232c81f93c2c536:{"reason":7}

And the reason 7 is DisconnectReason (enums.ts) Banned. But this response is asynchronous.

xucli connect will show success because responseObj is empty

That sounds like the problem. IMO peer 1 should show Connection status: unknown if response object is empty.

Peer 2 response object should never be empty so that peer 1 can display the connection status accordingly:
Success -> Peer 1: Connection status: success
Banned -> Peer 1: Connection status: banned
`-> Peer 1:Connection status: unknown`
etc.

What do you think @reliveyy @sangaman @moshababo ?

I think the Connect RPC call should not return a response until either the connection & handshake is successful or the connection fails, such that "unknown" is not necessary. This is actually already how it behaves, but we don't make the check if a node is banned until after the handshake is complete and the peer is "opened." That's why connecting to a banned node is treated as a success.

This is actually somewhat necessary because we don't know the peer's pub key for certain until after the handshake. We then have a handful of checks in Pool.handleOpen here including the banned check, and we mark the peer as active after the checks. We should not consider the connection process complete until this active flag is set.

I think the best approach is probably to create a peer.active event from the Pool that includes the pub key of the activated peer. The Service.connect method should then listen for this event for the specified node pub key before returning. It should also listen to the close event, and if it fails throw an error.

I can definitely do this myself now that I've looked through the code, but @reliveyy you can try it first and I can share more feedback later in a PR. Let me know.

I think the best approach is probably to create a peer.active event from the Pool that includes the pub key of the activated peer. The Service.connect method should then listen for this event for the specified node pub key before returning. It should also listen to the close event, and if it fails throw an error.

I can definitely do this myself now that I've looked through the code, but @reliveyy you can try it first and I can share more feedback later in a PR. Let me know.

Thanks for the detailed help @sangaman !!! Let @reliveyy have another shot, he's still getting familiar with the code base and this is not particularly urgent 馃憤

Was this page helpful?
0 / 5 - 0 ratings

Related issues

moshababo picture moshababo  路  6Comments

kilrau picture kilrau  路  6Comments

raladev picture raladev  路  4Comments

kilrau picture kilrau  路  3Comments

kilrau picture kilrau  路  4Comments