Hi @krayon007 and @Ylianst,
I found an issue that seems to affect the whole spectrum of operating systems. The Console shows the following error (token and key details are replaced with **):
ERROR: Unable to connect relay tunnel to: wss://remote.servername.com:443/meshrelay.ashx?=2&nodeid=node//**, {"message":"TLS Handshake Error"}
This is based on server being on version: 0.7.54
No server error log details
My server shows these Agent Error Counters (they ballooned overnight; last night there were none):
Invalid PKCS Signature 44, Invalid RSA signature 6, Bad signature 46.
Most of the systems that were working last night now show up as a black screen on the server page...
Thanks,
SomeGuru
Hi @Ylianst and @krayon007,
I found out what caused this and the black screen issue on my instance. With some patches, Windows Update (WU) disables key aspects of the system to more or less force the user to do the mandatory reboot. Rebooting resolved both the black screen and the TLS Handshake Error. I came upon this by way of 50 systems sitting at the WU "reboot required" screen, so I used MeshCentral to kick off the restart on those systems and get things back to normal. This is a useful bit of information, because I have reproduced the same behavior on Ubuntu variants of Linux after a "Critical" security update was applied to the system.
How can your users prevent these types of things from happening?
I hope this can help some of your folks seeing this issue...
SomeGuru
Allowing others to see this issue before it mothballs...
SomeGuru
I am not sure if this is related, but starting with v0.7.47 I released performance improvements for MongoDB that would "bulk" Get/Set/Remove operations. Since there have been issues over the last week, out of caution I just pulled back these changes and put the old MongoDB code back in v0.7.55. If you are using MongoDB, update to v0.7.55 and let me know if the server runs better. It could be a specific version of MongoDB, not sure. I will be re-releasing this feature as an opt-in setting in the future, since the speed gains are impressive when the server is under load.
I will note that generally, getting "TLS Handshake Error" is not MeshCentral related. These types of errors occur at a network layer that is below MeshCentral and so is network related. Looking at your issue, I am somewhat tempted to say that you suddenly started running on a bad batch of RAM, bad IO signaling or got hit with a big solar flare. Cryptographic operations are generally really robust, it's super weird that this is happening.
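One way to check whether the handshake failure really sits below MeshCentral is to attempt a plain TLS connection to the server from an affected machine. The following is only a minimal sketch using the Python standard library; `probe_tls` is a hypothetical helper name, and Python's TLS stack may differ from the one the agent uses, so the result is indicative rather than conclusive.

```python
import socket
import ssl


def probe_tls(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Report whether the TCP connect or the TLS handshake to host:port fails."""
    try:
        raw = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        # DNS failure, refused connection or timeout: TLS was never reached.
        return {"stage": "tcp", "ok": False, "error": str(exc)}

    context = ssl.create_default_context()  # verifies the chain and hostname
    try:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return {
                "stage": "tls",
                "ok": True,
                "version": tls.version(),
                "cipher": tls.cipher()[0],
            }
    except (ssl.SSLError, OSError) as exc:
        # The TCP connection worked, but the handshake or verification failed.
        return {"stage": "tls", "ok": False, "error": str(exc)}
    finally:
        raw.close()


if __name__ == "__main__":
    # Placeholder host name taken from the error message in this thread.
    print(probe_tls("remote.servername.com"))
```

If this reports a failure at the `tls` stage from the same network where the agent fails, something between the machine and the server (a proxy, a filter, or the machine's own crypto state) is a more likely suspect than MeshCentral itself.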
Also, I like the idea of having a periodic agent restart maintenance timer. Certainly something we should look at. Of course, feedback on this is appreciated. Not sure how other similar software deals with this.
I've been noticing the same issue with some systems when updating to Windows 10 20H2. Definitely just a small percentage of systems, but I've had two recently (one just last night). The 20H2 update installs and I can still connect remotely, at least until that first reboot. Once the system reboots, I can no longer view the device remotely. The connection is made, but you can't see the desktop. It's just a black screen. Commands sent to the device do get through. This is not a MeshCentral thing, as it also impacts Splashtop and Premium Remote Control (which is what Avast CloudCare uses). The weird thing is that if I send a command to reboot the device into Safe Mode with Networking, I can then successfully connect to the device. But as soon as it reboots to normal mode, I can't see the desktop again, at least not until the user logs in to the desktop once. After that first login I can connect without any issues, even after a reboot.
Very, very strange, and extremely annoying/troublesome. No idea what changes Microsoft has made that are causing it though.
@PathfinderNetworks You are right... It's not MeshCentral. Try removing the primary display driver and installing the generic driver. I have seen this recently with 20H2... I think it's called the black screen of death. RDP has been a way to overcome it because RDP uses its own display driver. Anyway, may be worth checking into.
I'll have to remember that the next time I run across this. That also explains why it works in Safe Mode- as it's using the generic display driver.
Yes Sir!...That is 100% correct.
Oh my! That is wonderful information. You guys rock!
@LPJon and @PathfinderNetworks,
Are you guys seeing anything in the console when this happens? For example, if you open a desktop and it is connected but black, does the console show any messages, and if so, what are they? I am seeing it again this evening on a lot of systems...
I am running the enhancements in the Config.json file for multiplexer and quality... just wondering what you are seeing.
Thanks,
SomeGuru
No console messages. Have noticed that sometimes there is a black screen from the display being off. Pressing a key on the keyboard usually wakes it up. 😃 If that's your case just turn off the "Turn off display" setting in "Power Options" which is in "Control Panel".
@LPJon,
More specifically, the power settings on a lot of my devices have been bumped to performance mode, which keeps the devices from going into sleep states when not in use. I am still seeing a "TLS Handshake issue" message in the console, both on and off HTTP proxy networks. One possible resolution would be adding the IP address to the SAN field when the certificate is created, although that is not the most common way to resolve a transport issue from the agent back to the server side. That is why I asked if any of you had seen an error in the Console tab. On mine, it took about a minute after connecting in desktop mode and getting a black screen for the error to show up in the console, but it does not show as an error in the server error log.
SomeGuru
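To illustrate the SAN idea mentioned above: a client that validates the certificate normally will reject it if the name or IP address it dialed is not listed in the certificate's Subject Alternative Name entries. The sketch below checks a certificate dictionary in the shape returned by Python's `ssl.SSLSocket.getpeercert()`. The helper names `san_covers` and `_dns_matches` are hypothetical, the wildcard handling is simplified, and whether a missing SAN entry is actually the cause of the errors in this thread is not established.

```python
import ipaddress


def _dns_matches(pattern: str, hostname: str) -> bool:
    """Case-insensitive match; a leading '*.' covers exactly one label."""
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if pattern.startswith("*."):
        head, _, rest = hostname.partition(".")
        return bool(head) and rest == pattern[2:]
    return pattern == hostname


def san_covers(cert: dict, target: str) -> bool:
    """Return True if target (host name or IP) is listed in the cert's SAN."""
    try:
        target_ip = ipaddress.ip_address(target)
    except ValueError:
        target_ip = None  # not an IP literal, treat it as a DNS name

    for kind, value in cert.get("subjectAltName", ()):
        if target_ip is not None:
            # IP targets may only match 'IP Address' entries, never DNS ones.
            if kind == "IP Address" and ipaddress.ip_address(value.strip()) == target_ip:
                return True
        elif kind == "DNS" and _dns_matches(value, target):
            return True
    return False


if __name__ == "__main__":
    # Illustrative certificate data, not taken from a real server.
    cert = {"subjectAltName": (("DNS", "remote.servername.com"),)}
    print(san_covers(cert, "remote.servername.com"))
    print(san_covers(cert, "192.0.2.10"))
```

With a certificate like the illustrative one above, a connection made by host name would pass this check while a connection made by IP address would not, which is the situation that adding the IP to the SAN field is meant to address.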
@SomeGuru Are you using a local network scenario, or is this actually going out across the internet? If it's going out across the internet, do you control the local network? If you control the local network, do you have things like DPS or website filtering enabled on your edge firewall on either side of the connection (Server or Client or Both)?
WAN, and no filter. However, I am running two different scenarios to reproduce it, one on a standard network and the other on a proxied network. It has proven challenging to isolate what it is, because the proxy report does not give them results they can use to narrow it down further. I know that they will figure something out.
SomeGuru
So am I to understand that you are using an outbound web (internet) proxy and not a web _SERVER_ proxy? I'm not sure I understand what a "Standard" and "Proxied" _network_ are in the context of your description.