Expected: Anti-xray engine mode 1 or 2 should not affect player connections or ping.
When enabled in either engine mode, some players are unable to connect, frequently time out, or experience extremely high ping. When disabled, there is no ping issue whatsoever.
Unsure of how to reproduce or exact causes. It does not affect all players equally. For the player in question I used for testing, these were our procedures:
A video can be provided if needed. I am not able to record one as of the time of making this report.
Default paper.yml causes no issues: https://pastebin.com/i2vMEkLw
With anti-xray on: https://pastebin.com/bFvaZgUx
spigot.yml: https://pastebin.com/VsfA0KzP
bukkit.yml: https://pastebin.com/nFScUQyW
server.properties: https://pastebin.com/UVqLNZiM
I have also tested without plugins and have not found any other way to fix the issue. After much troubleshooting, I found that anti-xray is to blame. I am wondering if it is perhaps a packet issue? It seems to affect only some players (an admin of mine and I can connect, but a handful of users and mod staff have this issue). Our bungee network has 8 servers, and our Survival server (the only one we've used anti-xray on so far) is the only one with the issue. Switching to any other server through the proxy yields perfect results and good ping.
I have a few questions:
Additional info about Anti-Xray that might be relevant here: When the server sends a chunk packet to the client and Anti-Xray is enabled, the server's network manager queues the chunk packet and all subsequent packets (except a few special packets, see NetworkManager.InnerUtil.canSendImmediate(NetworkManager, Packet<?>)) until the chunk packet obfuscation has finished on the async thread. This is done to ensure that the packets are sent to the client in the correct order, while the server can still continue ticking on the Server thread. In principle this queue can cause higher pings if the thread pool queue fills up with lots of packets or other tasks to be executed and there are not enough threads available, or if the CPU is already fully occupied. I cannot reproduce this issue on my server.
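To make the queueing described above more concrete, here is a minimal standalone sketch of the idea. It is not Paper's actual code: the class `PacketQueueSketch`, the `Packet` type, and the methods `send`, `markReady`, and `sent` are all hypothetical names made up for illustration, and the special packets that bypass the queue are not modeled. It only shows how one unfinished chunk packet at the head of the queue holds back everything queued behind it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PacketQueueSketch {
    /** Hypothetical packet type: chunk packets start out not ready. */
    static final class Packet {
        final String name;
        boolean ready; // only read and written while holding the queue lock

        Packet(String name, boolean ready) {
            this.name = name;
            this.ready = ready;
        }
    }

    private final Queue<Packet> queue = new ArrayDeque<>();
    private final List<String> sent = new ArrayList<>(); // stands in for the network

    /** Queue a packet, then send whatever is allowed to go out. */
    synchronized void send(Packet packet) {
        queue.add(packet);
        flush();
    }

    /** Called from the async thread once obfuscation of a chunk packet is done. */
    synchronized void markReady(Packet packet) {
        packet.ready = true;
        flush();
    }

    /** Send packets in order, stopping at the first one that is not ready yet. */
    private void flush() {
        while (!queue.isEmpty() && queue.peek().ready) {
            sent.add(queue.poll().name);
        }
    }

    synchronized List<String> sent() {
        return new ArrayList<>(sent);
    }

    public static void main(String[] args) throws Exception {
        PacketQueueSketch connection = new PacketQueueSketch();
        ExecutorService obfuscator = Executors.newSingleThreadExecutor();
        CountDownLatch release = new CountDownLatch(1); // keeps the demo deterministic

        Packet chunk = new Packet("chunk", false);
        connection.send(chunk);
        obfuscator.execute(() -> {
            try {
                release.await(); // pretend obfuscation takes a while
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            connection.markReady(chunk);
        });

        // Already ready, but stuck behind the unfinished chunk packet.
        connection.send(new Packet("keep-alive", true));
        System.out.println("before obfuscation: " + connection.sent()); // []

        release.countDown();
        obfuscator.shutdown();
        obfuscator.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("after obfuscation: " + connection.sent()); // [chunk, keep-alive]
    }
}
```

In this sketch the keep-alive packet cannot leave until the chunk packet in front of it is ready, which is why a slow or starved obfuscation thread could in principle show up as higher ping.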
It is independent of the number of players. During testing it was just me, and two staff (one with the issue and one without.) We had the server in maintenance mode to see what was going on.
CPU is an R7 3800X. That server can use up to 6 threads, but usage is usually very low. The CPU also runs the other servers on the network, but they are far smaller, using only 2 threads each, with the rest reserved for the system. Server memory is 12GB.
Server does not lag at all during this. According to spark, as well as the built-in TPS checks, TPS never drops because of this. I can't currently provide a timings report but can when I'm able to get back to testing it out.
The player who helped me test earlier does generally have a weaker connection (Los Angeles to Montreal), but their ping almost always levels out around 120ms. With anti-xray on the server, the ping climbs into the 1000-3000ms range.
I just did a quick test by adding a 5-second delay before chunk packets are flagged as ready by Anti-Xray. This doesn't affect the ping at all. So it seems like the issue is not the queue.
Could you provide a /protocol dump?
That shows which plugins use ProtocolLib and which packets they listen for.
Here is the /protocol dump from the server.
https://pastebin.com/PNBt2pmM
This is unrelated, but EntityTrackerFixer does more harm than good, as Paper already does many of the same or better optimizations.
Thanks for the advice! Related or not, I appreciate it!