This is really simple to replicate, but I'm terrible at explaining things so please bear with me.
If you run server.cr first, then run client.cr, it works perfectly! CPU usage of client.cr is 0-1%.
However, if you run client.cr first, then server.cr, it works, with one catch: the CPU usage for client.cr in top
stays at a constant 100% indefinitely, even after it has made the connection.
You might be wondering what the use case is. Well, I have a Master Server, and if it goes down and comes back up, I want my game instance servers to automatically reconnect and authenticate themselves. It works beautifully, except for this part :P
server.cr:
require "socket"

IP = "0.0.0.0"
PORT = 9300

server = TCPServer.new(IP, PORT)
puts "Server bound on port #{IP}:#{PORT}"

def handle_connection(socket)
  puts "New connection! #{socket}"
  loop do
    payload_size = socket.read_bytes(Int32)
    message = socket.read_string(payload_size)
    puts message
  end
rescue e
  puts e
end

while socket = server.accept?
  spawn handle_connection(socket)
end
client.cr:
require "socket"

def handle_connection(socket)
  begin
    loop do
      payload_size = socket.read_bytes(Int32)
      puts payload_size
      raw_message = socket.read_string(payload_size)
      puts raw_message
    end
  rescue e
    puts "Disconnected from the Master Server"
    puts e
  end
end

def gogo
  begin
    puts "Waiting for the server to become connectable.."
    check = TCPSocket.new("0.0.0.0", 9300)
    check.tcp_nodelay = true
    puts "Successfully connected to the Master Server"
    spawn handle_connection(check)
    loop do
      sleep 1
    end
  rescue e
    puts e
  end
end

gogo()
Try the following code @girng
Server:
require "socket"

def handle_connection(socket)
  loop do
    puts "Connected to the Client"
    socket.puts "message\n"
    puts "Message has been sent"
    sleep 1
  end
rescue e
  puts "Disconnected from the Client"
  puts e
end

begin
  puts "TCPServer"
  server = TCPServer.new("0.0.0.0", 52300)
  while client = server.accept?
    spawn handle_connection(client)
  end
rescue e
  puts e
end
Client:
require "socket"

def handle_connection(socket)
  loop do
    message = socket.read_string(5)
    puts message
  end
rescue e
  puts "Disconnected from the Master Server"
  puts e
end

loop do
  begin
    check = TCPSocket.new("0.0.0.0", 52300)
    # check.tcp_nodelay = true
    puts "Successfully connected to the Master Server"
    spawn handle_connection(check)
  rescue e
    puts e
  end
  sleep 1
end
This code works great on my system without the CPU going to 100%.
Run under an Ubuntu VM on a 1700X.
Tested:
Start Server / Start Client ...
Start Client / Start Server ...
Start Server / Start Client / Stop Client / Start Client ...
Start Client / Start Server / Stop Server / Start Server / Stop Server ...
All work normally here ... CPU usage goes back to 0% after all attempts.
In your code example there are several errors:
Ty @bowyan unfortunately still maxes out at 100% cpu usage.
Btw, your code is giving me a ton of re-connects: https://paste.ee/p/bC4z0
And my code doesn't do that, and your code still maxes out the CPU :/
I don't believe my code example above has any errors (to show this example)
The CPU maxing out during the message sending and receiving is very normal. The server is non-stop sending messages to the client, and the client is processing this as fast as possible. The red line is IO blocking that is going on.
But the issue you mentioned is that even when you disconnect the server, it keeps doing 100%. Please try to disconnect the server with the client active. Normally you will see:
Disconnected from the Master Server
End of file reached
Error connecting to '0.0.0.0:52300': Connection refused
Error connecting to '0.0.0.0:52300': Connection refused
Error connecting to '0.0.0.0:52300': Connection refused
Error connecting to '0.0.0.0:52300': Connection refused
Error connecting to '0.0.0.0:52300': Connection refused
Error connecting to '0.0.0.0:52300': Connection refused
And your CPU usage will drop to a few percent after 1 or 2 seconds in htop.
The CPU maxing out during the message sending and receiving is very normal.
No it isn't lol
No it isn't lol
Check your htop ... green is normal CPU usage. Red is IO blocking. The code is clearly blocking on IO. And this is beside the point.
Your reported issue is that when you disconnect the server, aka kill the server, the CPU usage keeps at maximum usage. Test this out...
No it isn't lol
Check your htop ... green is normal CPU usage. Red is IO blocking. The code is clearly blocking on IO. And this is beside the point.
Your reported issue is that when you disconnect the server, aka kill the server, the CPU usage keeps at maximum usage. Test this out...
I don't mean to be rude but are u trolling? This is a serious issue and what you are saying is not true.
Here is me pinging the server every 1 second, and cpu usage is 0-1%
https://gyazo.com/b2db069e2493bdcfa86666ef28b2e5b0
socket.cr: https://paste.ee/p/f5dGG
server.cr: https://paste.ee/p/sfxCt
However, if I start client.cr first, then server.cr, the client.cr cpu usage goes to 100%. That is not normal.
Except, if you start server.cr first, then client.cr, CPU usage is 0% (normal behavior)
I don't mean to be rude but are u trolling?
Solve it yourself with that attitude.
Solve it yourself with that attitude.
Thanks for ruining my github thread. I reduced my code example so the devs can see the issue and you posted false information about CPU usage, trying to say my code is wrong, and you did not even read my OP
Your reported issue is that when you disconnect the server, aka kill the server, the CPU usage keeps at maximum usage. Test this out...
That's not my issue at all. The issue is if I start client.cr first, then server.cr. It has nothing to do with "disconnecting the server". That's literally what my OP says...
Client:
require "socket"

def handle_connection(socket)
  loop do
    message = socket.read_string(5)
    puts message
  end
rescue e
  puts "Disconnected from the Master Server"
  puts e
  connect_to_server()
end

def connect_to_server
  check = TCPSocket.new("0.0.0.0", 52300)
  # check.tcp_nodelay = true
  puts "Successfully connected to the Master Server"
  handle_connection(check)
rescue e
  puts "TCPSocket disconnected"
  puts e
  sleep 2.second
  puts "TCPSocket reconnection attempt"
  connect_to_server()
end

connect_to_server()
Server:
require "socket"

def handle_connection(socket)
  i = 0
  loop do
    puts "Connected to the Client"
    socket.puts "message\n"
    puts "Message has been sent" + i.to_s
    i = i + 1
    sleep 5
  end
rescue e
  puts "Disconnected from the Client"
  puts e
end

begin
  puts "TCPServer"
  server = TCPServer.new("0.0.0.0", 52300)
  while client = server.accept?
    spawn handle_connection(client)
  end
rescue e
  puts e
end
Here is your problem solved.
Remove "sleep 5" in the server code to flood your client connection with thousands of messages and get 100% IO-blocking CPU usage.
Under normal usage (with a delay on the sending), you will see almost no CPU usage.
This code also efficiently reconnects after dropped connections. It's not going to win a beauty prize, but it does the job.
I hope this closes the topic... /The "Troll" is out...
Of course that's going to increase IO if I remove sleep 5. IO is not the issue because my code examples don't have an IO problem (or this issue). You introduced IO problems with your code..
loop do
payload_size = socket.read_bytes(Int32)
message = socket.read_string(payload_size)
puts message
end
Is perfectly valid Crystal code for client->server communication with TCP and works great w/ 0-1% CPU usage. The issue is, if you start client.cr first and wait a second, then start server.cr.
If you start server.cr first, then client.cr it works fine. I don't know how many times I need to explain this man :/
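For reference, the sending side that pairs with this read loop could look like the minimal sketch below. The helper names `write_frame` and `read_frame` are hypothetical (they are not from the code in this thread), and an in-memory IO stands in for the socket so it runs without a server. It relies on the default byte order on both sides, the same as the `read_bytes(Int32)` call above.

```crystal
# Hypothetical helper (not from the thread): write a 4-byte Int32 length
# followed by the message bytes, using the default byte order.
def write_frame(io : IO, message : String) : Nil
  io.write_bytes(message.bytesize.to_i32)
  io.write(message.to_slice)
end

# Hypothetical helper: the same two reads as the loop above, done once.
def read_frame(io : IO) : String
  payload_size = io.read_bytes(Int32)
  io.read_string(payload_size)
end

# An IO::Memory stands in for the TCP socket in this sketch.
io = IO::Memory.new
write_frame(io, "ping")
write_frame(io, "pong")
io.rewind

puts read_frame(io)
puts read_frame(io)
```

With a real connection, `write_frame(socket, "ping")` on one end would be matched by one pass of the read loop on the other end. If nothing is ever written, the reader simply waits in `read_bytes`.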
sigh .... this is the code you posted.
Server:
loop do
payload_size = socket.read_bytes(Int32)
message = socket.read_string(payload_size)
puts message
end
Client:
loop do
payload_size = socket.read_bytes(Int32)
puts payload_size
raw_message = socket.read_string(payload_size)
puts raw_message
end
Both are trying to read from the connection with nothing being sent. That is also wrong as an example of the issue.
The issue is, if you start client.cr first and wait a second, then start server.cr. If you start server.cr first, then client.cr it works fine.
No it does not! In both orders my code works without any issue. All you need to do is take care that you do not end up in an infinite loop when reconnecting the TCPSocket.
I don't know how many times I need to explain this dude :/
I do not know how many times I need to explain to you: the above code works with connects and disconnects without the issue that you describe.
And no offence, but for somebody asking for help, all I see is constant attitude. You're accusing somebody of being a troll for pointing out that something in the code is IO blocking, which it is (it was the TCPSocket looping, which was an error on my part)!
But this new code fixes that issue and works beautifully. Connect Server, then Client, no issue. Client, then Server, no issue. CPU usage is normal. If I flood the connection with 1000s of requests per second, yes, it will hit large CPU usage, as that is normal when you're sending thousands of requests per second.
But it does not go into large CPU usage when the client or server get disconnected and reconnect ( unless you flood the connection, again, normal ).
Now i am suddenly your "dude". I know you are frustrated but lashing out at people only makes things worse!
No it does not! In both orders my code works without any issue. All you need to do is take care that you do not end up in an infinite loop when reconnecting the TCPSocket.
@bowyan It still maxes out the CPU if you start client.cr first, then server.cr.
Now i am suddenly your "dude". I know you are frustrated but lashing out at people only makes things worse!
I'm not lashing out at anyone. You came in here, claimed my code was wrong, said CPU utilization is supposed to be 100% when communicating messages between client->server (which is not true). And made up code with actual IO issues to try to say my code (that doesn't have IO issues) is wrong. You are lashing out at me passively, which is uncalled for. I'm just here reporting a potential bug.
And no offence, but for somebody asking for help, all I see is constant attitude. You're accusing somebody of being a troll for pointing out that something in the code is IO blocking
You are wrong. My code examples are not IO blocking to warrant 100% CPU usage. They work just fine if you start server.cr first, then client.cr (stays at 0-1%, even when ping/ponging). IO blocking is not relevant to this issue and not what I'm reporting..
Then you have some magical bug, because I can start the client first, wait one second, and see my CPU usage under a VM, after the initial startup spike, drop to 0-2% on all 4 cores within a second and stay there.
Start server with flood: 60-70% on core 1, 20% on core 2 (mostly from the puts messages being pushed to the CLI). Stop server: the client goes into search mode for the server and drops to 0-2% on all 4 cores.
In other words, exactly how it needs to work. If you are running the exact same code I posted above and you have the issue you mention, then the issue is in your setup. Older Crystal? WSL? LLVM issue? ...
You are wrong. My code examples are not IO blocking.
I give up, no reasoning with you ...
/Crystal moderators ... @RX14, ... anybody .... please remove my posts from this topic.
I give up, no reasoning with you ...
See edit. Not IO blocking to warrant 100% CPU usage, that you claimed.
Both are trying to read from the connection with nothing being sent. That is also wrong as an example of the issue.
That doesn't mean it has to go to 100% CPU usage. I was just not sending anything, because I wanted to reduce my code example as much as possible. If you check my previous post, CPU usage is 0-1%, even when sending a ping per second while using that code.
It's not "wrong as an example issue". It's perfect, and shows the issue very clearly.
Please keep it civil, personal accusations have no place in the crystal community.
[...] even when sending a ping per second while using that code.
@girng sending ping per second sounds way less than sending it as fast as possible, just thinking out loud...
Also, I can't reproduce.
crystal client.cr immediately fails if server.cr is not started. This makes sense because there is no retry logic if TCPSocket.new fails.
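For illustration, retry logic around TCPSocket.new could be sketched roughly as below. `connect_with_retry` and its parameters are hypothetical names, not part of the code in the OP, and rescuing `Socket::Error` assumes a recent Crystal release (older versions may have raised a different exception type). It loops instead of recursing, and sleeps between attempts so a missing server does not turn into a busy loop.

```crystal
require "socket"

# Hypothetical helper (not from the OP): keep trying to connect until it
# succeeds, or until max_attempts is reached when a limit is given.
# Returns the connected socket, or nil if every attempt failed.
def connect_with_retry(host : String, port : Int32,
                       delay : Time::Span = 1.second,
                       max_attempts : Int32? = nil) : TCPSocket?
  attempts = 0
  loop do
    attempts += 1
    begin
      return TCPSocket.new(host, port)
    rescue ex : Socket::Error
      # Assumption: connection failures surface as Socket::Error subclasses.
      puts "Attempt #{attempts} failed: #{ex.message}"
      return nil if max_attempts && attempts >= max_attempts
      # Sleeping yields to other fibers, so waiting costs almost no CPU.
      sleep delay
    end
  end
end
```

With the OP's setup, `connect_with_retry("0.0.0.0", 9300)` would wait for the server to come up and then return the socket, which could be handed to `handle_connection` as before. Passing `max_attempts` bounds the wait.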
@girng sending ping per second sounds way less than sending it as fast as possible, just thinking out loud...
I am not sending anything as fast as possible in my examples..
I am about done here. This is getting ridiculous.
I literally have done nothing wrong, but getting mass accusations about stuff that my code doesn't even do. I'm just trying to report a potential bug, that is happening to me
crystal client.cr immediately fails if server.cr is not started. This makes sense because there is no retry logic if TCPSocket.new fails.
Thank you. Okay, now we're getting somewhere. For me, it just hangs then connects instantly when server.cr is started. WSL issue maybe?
@girng There might be an old server process hanging around that you haven't killed. Try rebooting?
@RX14 Okay, I tried it after a reboot, same results: (client.cr on right, server.cr on left)
https://i.gyazo.com/029417fb0019ec695b7a5b30e4e53067.mp4
It hangs, then connects to the server.cr when it starts up, but remains at 100% CPU usage. But, if server.cr is started before client.cr connects, the cpu usage goes down.
I literally have done nothing wrong, but getting mass accusations about stuff that my code doesn't even do. [...]
@girng Chill out, please. As I see it, this ain't about right or wrong, more about the attitude of approaching the problem. No one here _accuses_ you or your code of anything, we all are just trying to find the root cause of the problem. Misunderstandings and different points of view are quite common, no need to get personal about it...
No it isn't lol
I don't believe my code example above has any errors (to show this example)
I don't mean to be rude but are u trolling? This is a serious issue and what you are saying is not true.
Thanks for ruining my github thread. I reduced my code example so the devs can see the issue and you posted false information about CPU usage, trying to say my code is wrong, and you did not even read my OP
I am about done here. This is getting ridiculous.
All of the above quotes seem to me quite personal and inappropriate, especially towards someone who tries to help you - even if there's lack of understanding the problem, from one side or the other.
@girng Chill out, please. As I see it, this ain't about right or wrong, more about the attitude of approaching the problem. No one here _accuses_ you or your code of anything, we all are just trying to find the root cause of the problem. Misunderstandings and different points of view are quite common, no need to get personal about it...
Put yourself in my shoes. I'm not the one going around accusing other people their code is wrong, and making up code to say my code has an IO issue when it's not even related at all. I tried to keep my code reduced and simple for the developers to take a look at. Then you are saying I'm sending data non-stop, which is not true and my code does not even do that. Some of you guys are passively trying to put me down and put fault on me, when it's not my fault at all.
@Sija this post is probably best sent through a different channel. Please keep this thread technical only.
@girng are you sure the code you're running is the exact same as the code in the OP? This is what I get when I run client.cr without server.cr running. It's the same if I use 127.0.0.1 instead of 0.0.0.0
@RX14 Yep, 100%.
https://i.gyazo.com/7aa23bb1c2b0fda6f6fe96be7b4f39bd.mp4
e: The "Error connecting to '0.0.0.0:9300'" is never emitted
@girng well it must be some kind of WSL bug then because that's absolutely not how TCP should behave.
By the way, I feel very disincentivized to make another bug report here again. Just re-reading this issue really makes my heart sink. It's just sad, no reason to treat people like this who are reporting a legitimate and valid issue they are experiencing. I appreciate RX14's help though, at least he understands where I'm coming from instead of creating false accusations and trying to constantly put the OP down.
@girng this still isn't the place or time, but I'd say that you shouldn't take the actions of a single community member as representative of the whole issue tracker.