One of my main problems with MeshCentral right now is that when you use it behind a reverse proxy, agents are unable to connect to the server unless MeshCentral's webserver certificate is identical to the one you're using in your reverse proxy setup. This is true regardless of whether you use the --notls flag (which shouldn't be using any certificates in the first place) or the --tlsoffload flag (which should have the option to disable it).
The primary downside of this is that you need to manually copy the certificate from your reverse proxy to your MeshCentral host every time it expires, which can be bothersome when you use certificates with a short lifetime like the ones from Let's Encrypt or when the reverse proxy is running on another host. Since MeshCentral currently doesn't support ECDSA certificates, it also limits the type of certificate you can use in your reverse proxy. This is mainly a problem when you use a wildcard certificate instead of individual certificates for each of your self-hosted applications, since it forces you to use an RSA certificate for all of them.
My experience with virtually every other self-hosted web application I've used has been that the connection between the reverse proxy and the application uses HTTP by default and that the use of HTTPS has to be explicitly enabled in the options menu with manually supplied certificates.
I don't think you necessarily need to change MeshCentral to use HTTP by default, but I do think that there should be an option to disable certificate validation in the config file so that people can simplify their certificate management process. Those who want the additional security of an HTTPS connection between the reverse proxy and the application would not have to change their setup this way.
Edit: Minor phrasing changes.
Hi. I am traveling now, but will be back on the job next week. The problem with removing certificate validation is that it leaves the MeshAgent-MeshServer connection vulnerable to man-in-the-middle attacks. So, I really don't want to remove it. However, one thing I can do is have MeshCentral load the certificate from an HTTPS URL. That way, you run MeshCentral and it will get the cert automatically. This assumes that no one tampers with the way the certificate loading is done between MeshCentral and the reverse proxy, but it seems like a low risk. I can also periodically check for a certificate update (every 5 minutes or so, and make that configurable). That way, you set and update your reverse proxy cert and don't worry about MeshCentral, and you still get good security.
Let me know what you think of this idea and if it would work for you.
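For readers following along, the periodic check proposed above could look roughly like the sketch below. This is not MeshCentral code: createChangeDetector, startPolling, fetchHash and onChange are hypothetical names, and the 5 minute default simply mirrors the interval suggested in the comment.

```javascript
// Hypothetical helper: remembers the last certificate hash and reports changes.
function createChangeDetector(initialHash) {
  let lastHash = initialHash;
  return function hasChanged(hash) {
    const changed = hash !== lastHash;
    lastHash = hash;
    return changed;
  };
}

// Hypothetical polling loop. fetchHash is assumed to be any function that
// returns a Promise resolving to the current certificate hash as a string.
function startPolling(initialHash, fetchHash, onChange, intervalMs = 5 * 60 * 1000) {
  const hasChanged = createChangeDetector(initialHash);
  const timer = setInterval(() => {
    fetchHash()
      .then((hash) => { if (hasChanged(hash)) onChange(hash); })
      .catch((err) => console.error('Certificate check failed:', err.message));
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for polling
  return timer;
}
```

Passing the hash loaded at startup as initialHash means the first poll already detects a certificate that was swapped after the server started.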
@Ylianst That sounds great to me!
Made some progress on this. In MeshCentral v0.2.2-o, you can add "certUrl" in the domain section of the configuration file with an HTTPS URL where the public certificate of the web server is. MeshCentral will load it on start and use the hash of that cert to check agent connections. No support for ECDSA certs yet, but I am working on that and still need to test --tlsoffload. Also, it will not poll this right now, so if you change the web certificate, you need to restart MeshCentral.
{
  "settings": {
    "Port": 443
  },
  "domains": {
    "": {
      "title": "MyServer",
      "certUrl": "https://192.168.2.106:443/"
    }
  }
}
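To illustrate what "use the hash of that cert to check agent connections" could mean in practice, here is a minimal Node.js sketch. It assumes the hash is a SHA-384 digest (the server logs later in this thread print SHA384 values) taken over the raw certificate bytes; what MeshCentral actually hashes is not stated here, and certificateHash and agentHashMatches are hypothetical names, not MeshCentral functions.

```javascript
const crypto = require('crypto');

// Hypothetical helper: SHA-384 hex digest over a buffer of certificate bytes.
// Assumption: the real server may hash something else (e.g. only the public key).
function certificateHash(der) {
  return crypto.createHash('sha384').update(der).digest('hex');
}

// Hypothetical check: does the hash reported by an agent match the expected one?
function agentHashMatches(agentHash, expectedHash) {
  return agentHash.toLowerCase() === expectedHash.toLowerCase();
}

// Placeholder bytes stand in for a real DER-encoded certificate.
const sample = Buffer.from('placeholder certificate bytes');
const expected = certificateHash(sample);
console.log(expected.length); // 96 hex characters for SHA-384
console.log(agentHashMatches(expected.toUpperCase(), expected)); // true
```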
@Ylianst To test this new option, I updated MeshCentral, removed the existing valid webserver certificate and waited for MeshCentral to regenerate its own certificate after a restart. Afterwards, I looked through the logs to see what error message the Windows Agent that automatically connects to my server would cause. I restarted MeshCentral a few times to ensure that the results are consistent:
meshcentral | Agent connected with bad web certificate hash (e1d00cec64 != 54e3a199d5), holding connection (172.30.0.1).
meshcentral | Agent connected with bad web certificate hash (e1d00cec64 != 54e3a199d5), holding connection (172.30.0.1).
meshcentral | Agent connected with bad web certificate hash (e1d00cec64 != 54e3a199d5), holding connection (172.30.0.1).
meshcentral | Agent connected with bad web certificate hash (e1d00cec64 != 54e3a199d5), holding connection (172.30.0.1).
This was to be expected since the certificates no longer match up. Next, I edited my config.json file to include the certUrl option with the domain of my site and restarted MeshCentral. The weird thing is that the error message not only kept showing up in the logs, but this time, the hash also changed with each restart:
meshcentral | Agent connected with bad web certificate hash (ceb48787d6 != 54e3a199d5), holding connection (172.30.0.1).
meshcentral | Agent connected with bad web certificate hash (82913b9dc2 != 54e3a199d5), holding connection (172.30.0.1).
meshcentral | Agent connected with bad web certificate hash (6b48c43f72 != 54e3a199d5), holding connection (172.30.0.1).
meshcentral | Agent connected with bad web certificate hash (0a80bf3c13 != 54e3a199d5), holding connection (172.30.0.1).
I'm not sure what to make of this.
The hash changes each time you restart the agent I presume... oh that is super weird. I will look at it tomorrow. Also, if certUrl points to an ECDSA cert, I expect the hash match to fail because the MeshAgent and MeshCentral will compute the hash differently. I will fix that in the next few days.
Yesterday I ran NGINX for the first time in front of MeshCentral and saw exactly the problems you reported. The --tlsoffload does not allow user login, etc. I am going to work on that tomorrow and get a fix for it. I will also add support for the NGINX headers that indicate the protocol and IP address of the connection, etc. So reverse proxies will be properly supported and I will add documentation for it.
@Ylianst No, sorry, I should have been more clear. The hash changes each time I restart the MeshCentral server component. I didn't touch the agent at all. My certUrl currently points to a domain with an RSA certificate so I presume it should be working fine.
I'm happy that you are able to fix the issue with the --tlsoffload flag!
Ha! Thanks for the clarification. That will help a lot figuring out what is going on.
Published MeshCentral v0.2.2-p on NPM with reverse-proxy support and updated the MeshCentral user's guide 1.7 with a new section 15 that shows a full example of how to set up MeshCentral with NGINX. It seems to work correctly on my computer, but I would appreciate feedback on it. I have not seen the problem with loading the web certificate using "CertUrl", but I now display certificate loading information at server start that will help debug that. ECDSA certs are not supported yet.
@Ylianst Awesome, I just gave it a try and I can confirm that it's now possible to log in when using the --tlsoffload flag with MeshCentral behind a reverse proxy. I have subsequently closed issue #29.
I have also enabled the certUrl option in my config.json file again with v0.2.2-p this time and tried out the following notations:
Below you will find the corresponding log results for each of them:
Failed to load web certificate at: my.domain.tld
Failed to load web certificate at: my.domain.tld:443
meshcentral | Loaded non-RSA web certificate at https://my.domain.tld, SHA384: eeacc5e602bdec6ee473744f8e4ef9ea02c4f935226506ebb4d33435b2f85c9b8b9ddbd3aa07cd46dde0c8faf2797c73.
meshcentral | Loaded non-RSA web certificate at https:/my.domain.tld:443, SHA384: c71cd0907a8eaeae82cc3ee5c7ac5525abd608584abcab528d85796ff5ac2ce82ada80af39c2b1f2eb4efee1ef359a33.
Afterwards, I tried out the last two notations repeatedly and saw the same result as before: a changed hash after every restart of the MeshCentral server component.
meshcentral | Loaded non-RSA web certificate at https://my.domain.tld, SHA384: 1bbec5b5411f89d9d3449b99c0556b5567135d2d0d86b2d945fd9814cd8f0f71972d1617250a99892882dd122b71d90e.
meshcentral | Loaded non-RSA web certificate at https://my.domain.tld, SHA384: 2ce8f25d77227091090f724472bd337f7c66540be541618bbf5397f1d507504ab8357566d6bae12df878c467914f0a6d.
meshcentral | Loaded non-RSA web certificate at https://my.domain.tld, SHA384: 3a5d008c915d25f2d0006598cf2a5b48e0a9cd96f81195cc470a36cea757cfc75f515b50e2c8b66511c83485eda6cfa6.
meshcentral | Loaded non-RSA web certificate at https://my.domain.tld:443, SHA384: 2ce8f25d77227091090f724472bd337f7c66540be541618bbf5397f1d507504ab8357566d6bae12df878c467914f0a6d.
meshcentral | Loaded non-RSA web certificate at https://my.domain.tld:443, SHA384: 6b48c43f72f57ca93ca9e9b296766cc6ab60057ab8aeb14e733222122c5bc750b6b990bc491e874e77aaabe743dc81c5.
meshcentral | Loaded non-RSA web certificate at https://my.domain.tld:443, SHA384: c90d46305e6575c1c1b672756990d90a7db2098a4da236fd06b3994f447ac0f9d8304637ffe6907ccff8b561c8536c98.
Based on the fact that there is a limited number of hashes that it seems to cycle through (there is one repetition above), what I assume is happening here is that MeshCentral is fetching certificates from my other subdomains, all of which use ECDSA certificates. I can't imagine how or why it would be doing that, though.
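The repetition mentioned above can be spotted mechanically. The sketch below is only an illustration: it pulls the SHA384 value out of log lines shaped like the ones quoted in this comment and counts how often each one occurs. extractHashes and countHashes are hypothetical helper names, and the three sample lines are copied from the logs above.

```javascript
// Collect the hex value that follows "SHA384: " in each log line.
function extractHashes(lines) {
  const pattern = /SHA384: ([0-9a-f]+)/;
  const hashes = [];
  for (const line of lines) {
    const match = pattern.exec(line);
    if (match) hashes.push(match[1]);
  }
  return hashes;
}

// Count how many times each hash was seen.
function countHashes(hashes) {
  const counts = new Map();
  for (const h of hashes) counts.set(h, (counts.get(h) || 0) + 1);
  return counts;
}

const lines = [
  'meshcentral | Loaded non-RSA web certificate at https://my.domain.tld, SHA384: 1bbec5b5411f89d9d3449b99c0556b5567135d2d0d86b2d945fd9814cd8f0f71972d1617250a99892882dd122b71d90e.',
  'meshcentral | Loaded non-RSA web certificate at https://my.domain.tld, SHA384: 2ce8f25d77227091090f724472bd337f7c66540be541618bbf5397f1d507504ab8357566d6bae12df878c467914f0a6d.',
  'meshcentral | Loaded non-RSA web certificate at https://my.domain.tld:443, SHA384: 2ce8f25d77227091090f724472bd337f7c66540be541618bbf5397f1d507504ab8357566d6bae12df878c467914f0a6d.',
];

const counts = countHashes(extractHashes(lines));
for (const [hash, n] of counts) {
  if (n > 1) console.log(hash.slice(0, 10) + ' seen ' + n + ' times');
}
```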
Edit: Minor formatting change.
I fixed the "CertUrl" field to allow the "my.domain.tld:443" notation in new versions of MeshCentral2. I also just added a YouTube video on how to set up NGINX and MeshCentral: https://www.youtube.com/watch?v=ebDVAsistbk. However, I am not sure I can fix the rotating certificate problem. Can you try going to https://my.domain.tld:443 with a browser and check the certificate that the browser receives? It seems like on each reload, you will get a different one? It seems NGINX is using different certificates for the same name? Not sure what to do since this does not seem to be a MeshCentral problem.
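As an illustration of accepting both the "my.domain.tld:443" and the "https://my.domain.tld:443" notation, a value can be normalized before parsing. This is a sketch under assumptions, not the change that was made in MeshCentral: normalizeCertUrl is a hypothetical name, and it simply assumes HTTPS when no scheme is given.

```javascript
// Hypothetical helper: accept "host", "host:port" or a full https:// URL and
// return the hostname and port to connect to. The port defaults to 443.
function normalizeCertUrl(value) {
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(value);
  const url = new URL(hasScheme ? value : 'https://' + value);
  return {
    hostname: url.hostname,
    // URL.port is an empty string when the port is the scheme default.
    port: url.port ? Number(url.port) : 443,
  };
}

console.log(normalizeCertUrl('my.domain.tld:443'));
console.log(normalizeCertUrl('https://my.domain.tld'));
```

Both calls yield the hostname my.domain.tld and port 443, so all four notations tried earlier in this thread would resolve to the same target.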
@Ylianst While the cert stays the same when I visit the page in my browser, I did some other tests to confirm that MeshCentral is not the cause of this issue. It seems to be the fault of the reverse proxy I'm using, so I'll open a bug report there and close this issue.
After a bit more testing, it turns out that the reason why the fetched certificate changes with each request is because MeshCentral doesn't appear to be using SNI when fetching the certificate. The documentation for Node's TLS module has some information about how to use SNI.
@Ylianst Could you have another look at the issue in light of these findings?
Edit: Added information about SNI in Node.
MeshCentral does use SNI when fetching the certificate. If you put a DNS name in "certUrl", that name will be in the TLS Client Hello as a server name indication. If you put an IP address in "certUrl", SNI will not be added to the Client Hello. I just checked with Wireshark, and it looks like it works perfectly. Not sure what I can do next.
@Ylianst That's pretty strange. I should tell you about the tests I conducted so we can get to the bottom of this issue. When I originally came to the conclusion that MeshCentral isn't the cause of the changing certificates, it was after I had tried to use OpenSSL to fetch the certificate from my MeshCentral domain. The certificate changed each time I used OpenSSL, so I thought the reverse proxy must be at fault.
The command I used was this one:
openssl s_client -connect my.domain.tld:443 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p'
After opening an issue on the GitHub page of the reverse proxy and doing more research on how to use OpenSSL, I found out that the reason why the certificate changed each time I ran OpenSSL was because the reverse proxy is programmed to present the client with a random certificate when the SNI is missing from the request. Once I adjusted the command to include the SNI, OpenSSL managed to fetch the correct certificate every time.
The adjusted command I used was this one:
openssl s_client -servername my.domain.tld -connect my.domain.tld:443 2>&1 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p'
The author of the reverse proxy has since changed the behavior of the reverse proxy to raise a TLS alert instead of presenting the client with a random certificate when the SNI is missing. As a result, the first OpenSSL command now returns no certificate and MeshCentral is now likewise unable to fetch any certificate from my MeshCentral domain. The second OpenSSL command still works flawlessly.
Since the TLS module that MeshCentral is using is built on top of OpenSSL and exhibiting the same behavior as the first OpenSSL command, I concluded that MeshCentral must not be using SNI right now.
A few days ago, I also had a look at the commit which introduced the certificate loading feature to MeshCentral and one line of code stood out to me. I'm not a programmer, so I could be completely on the wrong track here, but when I was reading the documentation for Node's TLS module, I noticed that just like OpenSSL, it uses the "servername" parameter to specify the SNI. In the following line of code, only the hostname is specified while the "servername" parameter is missing.
var tlssocket = obj.tls.connect((u.port ? u.port : 443), u.hostname, { rejectUnauthorized: false }, function () { this.xxcert = this.getPeerCertificate(); this.end(); });
Could this missing parameter possibly be the cause of this issue?
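For comparison, this is roughly what passing the server name explicitly looks like with Node's TLS module, the counterpart of the -servername flag in the second OpenSSL command. It is a sketch only, not the MeshCentral code or its eventual fix: buildConnectOptions and fetchPeerCertificate are hypothetical names, and the IP check reflects the rule that SNI carries host names, not IP addresses.

```javascript
const tls = require('tls');
const net = require('net');

// Hypothetical helper: build tls.connect() options with an explicit SNI name.
function buildConnectOptions(hostname, port) {
  const options = {
    host: hostname,
    port: port || 443,
    rejectUnauthorized: false, // mirrors the quoted line; we only read the cert
  };
  // net.isIP() returns 0 for anything that is not an IPv4/IPv6 literal.
  if (net.isIP(hostname) === 0) {
    options.servername = hostname;
  }
  return options;
}

// Hypothetical helper: connect, read the peer certificate, then close.
function fetchPeerCertificate(hostname, port) {
  return new Promise((resolve, reject) => {
    const socket = tls.connect(buildConnectOptions(hostname, port), () => {
      const cert = socket.getPeerCertificate();
      socket.end();
      resolve(cert);
    });
    socket.on('error', reject);
  });
}

// Usage (needs network access, so it is left commented out):
// fetchPeerCertificate('my.domain.tld', 443).then((cert) => console.log(cert.subject));
```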
Edit: Minor corrections.
You are exactly right. The "u.hostname" is the SNI parameter on that obj.tls.connect() line. You can add the line "console.log('SNI: ' + u.hostname);" before the connect line to make MeshCentral display the SNI it will be using. From my experience, this works well. If an IP address is specified, then no SNI is sent.
@Ylianst I'm all out of ideas on what could be causing this issue then. I guess you could try implementing the same feature in a different way to see if that would change anything.
Ha! Well... you were right. SNI was not correct and I just fixed it in MeshCentral v0.2.3-u. Now, "certUrl" will load the certificate correctly. However, tunnels from the mesh agent will not work because they don't have SNI yet. Working on a fix for that now.
Posted MeshCentral v0.2.3-w that fixes all of the TLS-SNI problems for reverse proxies. Please try it and report back.
@Ylianst I can confirm that MeshCentral can fetch the certificate with no issues now. Well done!