I've noticed this the last couple of times I've restarted my MeshCentral server and it's always extremely troubling.
I just updated my server from 0.3.5-x to 0.3.6-c. I restarted the physical server after that update to let some Windows Updates install (this is a Server 2019 box).
Both the server and the MeshCentral service start up correctly. No issues there. I log in to the MeshCentral site and see that only 4 or 5 out of over 300 devices/agents show as being connected. It takes about 20 minutes (maybe more) before the devices start to reconnect. When they do, they all seem to start connecting at the same time.
Is there some timing in the agent that controls this, or is there something else going on? It's not a huge deal as long as they all reconnect eventually; it's just VERY disconcerting to log in and see that nearly all the devices are showing as being disconnected.
YES! I saw the same thing on one of my servers and it freaked me out a few weeks back. I am glad you saw the same thing. I am going to need to investigate this ASAP. I suspect the re-connection timer has a problem, but it could be something else.
Started some testing on this. Obviously, if the server changes IP address and the DNS TTL is one hour, it will take some time for the DNS entry to update to the new IP address. So, in that case, the wait would be normal. However, I am now testing the case where the server does not change IP address.
In my case the server has a static address (both a static LAN address and a static WAN address where the ports are forwarded through). So no addresses are changing in my case, nor is the hostname changing.
Ok. I suspected it was not an IP address change, but wanted to mention it.
I just did some testing. We intended to cap the connection retry time to 5 minutes at most, but the current agent does not seem to have a limit. So, after a day of retrying, it's now only retrying the connection every hour and increasing the time between retries by a few seconds on each attempt. So, that should get fixed.
Published MeshCentral v0.3.6-n with Bryan's new MeshAgent that has a retry timer fix. MeshAgent will at most take 6 minutes to retry connection to the server, even if the server was down for a long time. This should solve the issue.
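For anyone curious what a capped retry timer looks like, here is a minimal sketch in Python. It is not the MeshAgent code (the agent is not written in Python), and the names `next_retry_delay` and `delays`, the 5-second step, and the starting delay of zero are all assumptions made for illustration. Only the 6-minute ceiling comes from the comment above.

```python
import math


def next_retry_delay(previous_delay, step=5.0, cap=360.0):
    """Return the next wait in seconds.

    The wait grows by a fixed `step` after every failed attempt
    (hypothetical value), but is never allowed to exceed `cap`
    (360 s = 6 minutes).
    """
    return min(previous_delay + step, cap)


def delays(attempts, step=5.0, cap=360.0):
    """List the wait before each of `attempts` consecutive retries."""
    delay = 0.0
    result = []
    for _ in range(attempts):
        delay = next_retry_delay(delay, step, cap)
        result.append(delay)
    return result


# Without a cap the wait keeps growing for as long as the server is down.
uncapped = delays(1000, cap=math.inf)
# With a cap the wait levels off at 6 minutes.
capped = delays(1000, cap=360.0)

print(uncapped[-1])  # 5000.0
print(capped[-1])    # 360.0
```

With the cap in place, an agent that has been failing to connect for a long time still retries within 6 minutes of the server coming back, rather than waiting out a delay that has grown to an hour.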
I'll let you know if I see anything different the next time I have to reboot the server. Thanks!
Closing this one as it's a duplicate of #345