Lisk-sdk: SSL breaks Websocket connection

Created on 24 Jun 2018  路  15Comments  路  Source: LiskHQ/lisk-sdk

Expected behavior

With SSL enabled incoming websocket connections should work.

Actual behavior

With SSL enabled incoming Websocket connections fail.

Since no incoming Websocket connections are being established the broadhash consensus can only be established every 30 seconds (when peer.rpc.status is called) as no updateMyself broadcasts can reach the node. See #2083

The problems are the following:

SocketCluster enables the secure flag when https is enabled. Therefore incoming connections need to use wssas protocol. Currently the peer headers don't contain a flag signaling whether the node has secure websocket connections enabled which is why other peers can't know whether they have to connect via ws or wss => The connection fails with secure peers.

Peer connections are established using the IP address and not hostnames. Therefore the certificate validation fails when a secure WebSocket connection is being established. If cert checking should be disabled the rejectUnauthorized flag would have to be set. However this would essentially make the secure connection pointless.

Steps to reproduce

Launch a Lisk Core node with HTTPS enabled in the config.

Proposed solution

Disable HTTPs for WebSocket connections as certificate validation is not really possible without hostnames/domains and implement a better encryption mechanism post-1.0.0.

Completely remove the SSL options from Lisk Core. It just adds unnecessary complexity and is not as performant as using a reverse proxy like nginx to do SSL Termination. I'd rather recommend to add a guide on how to setup the Lisk core in conjunction with nginx/etc.

Which version(s) does this affect? (Environment, OS, etc...)

1.0.0+

elementP2P

Most helpful comment

@jondubois, @abelboldu and I discussed this. The way forward depends a little on when this would get resolved. At the moment, this issue is in the 1.1.0 milestone. In that case, it would make more sense to pull this out completely and use a proxy for ssl termination, since by that point, everyone will be doing that anyway in the interim (my personal choice). If this is to be resolved in 1.0 before mainnet launch, it makes more sense to keep it in and have it working for the api and not do ssl for p2p.

It's worth noting that people are using this feature as it was an issue for a number of people when they did the testnet migration. There are certainly delegates using this on mainnet and perhaps even some exchanges

All 15 comments

Good job filling the issue, it was pretty late yesterday when I finnaly found the root cause and fixed it on my servers. For sure it would be best to solve the problem in the core to keep that nice feature bundled but yes, in the end, nginx ssl offloading do the job perfectly (even better the lisk code)

For example, I can bring the SSL score to A+ with nginx when lisk node allow only to go up to B / B+.

About the scoring (for the record), I'm using this tool: https://www.ssllabs.com/ssltest/

I don't think it is worth having TLS between peers. Only calls to private endpoints (for monitoring, auto-forging switch, lisk-nano connections, etc) are important to stay encrypted.

One last note, I didn't tested self-signed certs. I used let's encrypt certificates for all my testing (certbot, same for the last 2 years)

Maybe we should have two separate SSL/TLS on/off config options for the public HTTP API and for the WebSocket P2P API?

A node might want to serve its public API over https:// but it might still want to use regular ws:// for the P2P layer (to be compatible with the rest of the network).

Supporting both ws:// and wss:// protocols within the same network would add a lot of complexity to the P2P layer.

I agree with @Gr33nDrag0n69 that it's probably not worth it for Lisk to have TLS between peers because all data that passes through the network is public (and signed) anyway.

For 1.0, we should not enable this TLS feature on the node itself; instead, a terminating TLS proxy should be run alongside the Lisk node as recommended above (if HTTPS support is necessary).

Rethinking about it, I can see one disadvantage of using nginx as ssl offloader. It's not that bad but could complicate the implementation a bit. It's the API and forging whitelisting. (including private endpoints) Since using nginx in front will always relay on 127.0.0.1, using nginx "bypass" the whitelist and public settings of API in config.json. The only work around I can think in this case is that who can access port 443 must be manage by firewall. (a bit more complicated then just configure it in config.json)

In simple terms, without control of who can access port 443 of the server (using firewall), nginx will unlock API calls at large even with strick whitelist policy of lisk-node.

If I consider this point:

that it's probably not worth it for Lisk to have TLS between peers because all data that passes through the network is public (and signed) anyway.

I am not convinced with above, either node serving with SSL or not at all. It should not be the case to serve one interface over SSL and other not.

I don't see any complexity either, each handshake should be tried over wss:// and fallback to ws://. Later that flag must be kept in the peers manager for further communication.

If we are recommending to have TLS proxy, then we should stick with this plan and remove SSL support from Lisk itself and recommend one way for community.

@Gr33nDrag0n69 Good point, I guess a potential workaround could be to use the X-Forwarded-For HTTP header from nginx to figure out the IP of the end-client?

@jondubois I guess that would solve the but I never tested it with lisk.
@nazarhussain I agree mid-term, one solution should be pushed for and the other ditched, but short term with new version of lisk, the TLS Proxy would be a great workaround to current issue with bundled TLS support.

I still don't see any advantage to encrypt communication between nodes. Also handshakes handling between nodes would be overhead if network is under stress.

@Gr33nDrag0n69 The forward HTTP Header works if you enable trust_proxy in the config.

I cannot recommend trustProxy, as it allows to use X-Forwarded-* from any source IP. Implementation should be changed to allow limiting that to specific IP(s) only.

@jondubois, @abelboldu and I discussed this. The way forward depends a little on when this would get resolved. At the moment, this issue is in the 1.1.0 milestone. In that case, it would make more sense to pull this out completely and use a proxy for ssl termination, since by that point, everyone will be doing that anyway in the interim (my personal choice). If this is to be resolved in 1.0 before mainnet launch, it makes more sense to keep it in and have it working for the api and not do ssl for p2p.

It's worth noting that people are using this feature as it was an issue for a number of people when they did the testnet migration. There are certainly delegates using this on mainnet and perhaps even some exchanges

Opening the issue for 1.0.0-rc.2

Was this page helpful?
0 / 5 - 0 ratings

Related issues

karek314 picture karek314  路  3Comments

slaweet picture slaweet  路  3Comments

willclarktech picture willclarktech  路  4Comments

Nazgolze picture Nazgolze  路  3Comments

willclarktech picture willclarktech  路  4Comments