Connection are not closed correctly by default
What I means is: if you do a "netstat -ano" you will see connection list growing and growing.
Not all old connection get closed.
Tested with StaticFile handler ( HTTP2 or not ) on a website with ~20k+ daily users
With the default config -> Not closed correctly
With .keep_alive(None) -> Not closed correctly
With .keep_alive(server::KeepAlive::Timeout(5)) -> All connections are closed correctly
Initially detected in this ticket #426 but it was not the main issue
I was wrong with that
With .keep_alive(server::KeepAlive::Timeout(5)) -> All connections are closed correctly
I have some connection that are open for more than 1 day
Master should fix this probables
Ok, trying
do you still see opened connections?
Yes. connections are not closed correctly yet
Source code I used to test: https://git.neowutran.ovh:8080/neowutran/Website/commit/6e1a7ec41c4d521d42a8df81b01c469fe77b4a9c
Script I use to check if connection are closed correctly
[root@xxxxxx]# cat ./sockets_uptime.sh
#!/bin/bash
function suptime() {
local addr=${1:?Specify the remote IPv4 address}
local port=${2:?Specify the remote port number}
local min_time=${3:?Specify the min uptime in seconds for socket}
local min_time=${min_time%.*}
# convert the provided address to hex format
local hex_addr=$(python -c "import socket, struct; print(hex(struct.unpack('<L', socket.inet_aton('$addr'))[0])[2:10].upper().zfill(8))")
local hex_port=$(python -c "print(hex($port)[2:].upper().zfill(4))")
# get the PID of the owner process
local pid=$(netstat -ntp 2>/dev/null | awk '$6 == "ESTABLISHED" && $5 == "'$addr:$port'"{sub("/.*", "", $7); print $7}')
#local pid=$(netstat -ntp 2>/dev/null | awk '$5 == "'$addr:$port'"{sub("/.*", "", $7); print $7}')
[ -z "$pid" ] && {
# echo 'Address does not match' 2>&1;
return 1;
}
# get the inode of the socket
local inode=$(awk '$4 == "01" && $3 == "'$hex_addr:$hex_port'" {print $10}' /proc/net/tcp)
[ -z "$inode" ] && { echo 'Cannot lookup the socket' 2>&1; return 1; }
# query the inode status change time
local timestamp=$(find /proc/$pid/fd -lname "socket:\[$inode\]" -printf %T@)
[ -z "$timestamp" ] && { echo 'Cannot fetch the timestamp' 2>&1; return 1; }
local diff=$(bc <<< "$(date +%s.%N) - $timestamp")
# truncate to int
local diff=${diff%.*}
if (( $diff > $min_time )); then
# compute the time difference
LANG=C printf '%s (%.2fs ago)\n' "$(date -d @$timestamp)" $diff
echo $addr:$port
fi
}
min_time=${1:?Specify the min uptime in seconds for socket}
ports=${2:?Specify the ports list, like ':8083|:443'}
#ports=':8083|:443'
echo "min uptime in seconds for sockets: $min_time"
netstat -ano | grep -E $ports | sed -e "s/[[:space:]]\+/\t/g" | cut -f5 | while read line
do
ip=$(echo "$line" | cut -d ':' -f1 )
port=$(echo "$line" | cut -d ':' -f2)
suptime $ip $port $min_time
done
(based on https://superuser.com/questions/565991/how-to-determine-the-socket-connection-up-time-on-linux )
usage ./sockets_uptime.sh 3600 ':443' 2> /dev/null
Show socket connected to port 443 that where created more than 1 hour ago.
After ~2 hours, I have 23 sockets not closed correctly
But there is an interesting thing about them:
15 sockets where created exactly at 18:07:13
2 sockets where created exactly at 18:15:32
6 sockets where created exactly at 18:26:28
The process was started at 18:05:33
Some "burst" of sockets not closed.
The bad thing: I am unable to reproduce this issue without a heavy load ( port 443, the service is mainly used to download big files like this one https://github.com/actix/actix-web/issues/454#issue-348463796 ).
Process is still running, will update the stats in few hours
Now I am at 32 sockets not closed correctly.
15 sockets where created exactly at 18:07:13
2 sockets where created exactly at 18:15:32
6 sockets where created exactly at 18:26:28
4 socket where created exactly at 18:55:45
The others socket where created at different time, but nothing really interesting with them.
Don't have any unclosed socket created between 22h and 06h ( not much traffic in these hours )
Process is still running.
I expect a small traffic spike today at around 17h/18h, so I think a lot of socket won't be closed at around 17h
Now: ~50 sockets not closed correctly
As expected, most of the new sockets not closed correctly have been created around 17/18h
More than 2 sockets unclosed where created exactly at:
So definitly Some "burst" of sockets not closed.
Killing the process
@fafhrd91
@DoumanAsh
Do you have an idea ?
I can easily reproduce this case on my server when it is heavily loaded.
Do you know a part of the code ( / external crate ) handling the sockets that could be sensitive to timestamp / race condition ?
I can add logs / trace where you want if you have an idea.
But quite a complexe issue
I am a bit busy atm, I’ll get back to this bug this weekend
@neowutran I might have fixed this. Try this pull-request https://github.com/actix/actix-web/pull/483
@neowutran i merged pr, could you try master
Updated everything
Tested both with http1 only & http1/htt2
There are still unclosed sockets.
Last logs from the service with http1 only:
[root@xxxxxxxx ~]# ./sockets_uptime.sh 3600 443
min uptime in seconds for sockets: 3600
Mon Aug 27 12:31:08 CEST 2018 (101857.00s ago)
xxx.xxx.xxx.xxx:50289
Mon Aug 27 11:37:33 CEST 2018 (105073.00s ago)
xxx.xxx.xxx.xxx:55272
Mon Aug 27 11:23:44 CEST 2018 (105902.00s ago)
xxx.xxx.xxx.xxx:9257
Mon Aug 27 11:08:56 CEST 2018 (106790.00s ago)
xxx.xxx.xxx.xxx:53753
Mon Aug 27 12:31:08 CEST 2018 (101858.00s ago)
xxx.xxx.xxx.xxx:54859
Mon Aug 27 11:37:32 CEST 2018 (105075.00s ago)
xxx.xxx.xxx.xxx:6722
Mon Aug 27 11:37:30 CEST 2018 (105077.00s ago)
xxx.xxx.xxx.xxx:49868
Mon Aug 27 11:29:07 CEST 2018 (105580.00s ago)
xxx.xxx.xxx.xxx:8918
Mon Aug 27 11:37:30 CEST 2018 (105077.00s ago)
xxx.xxx.xxx.xxx:49867
Mon Aug 27 11:37:33 CEST 2018 (105075.00s ago)
xxx.xxx.xxx.xxx:53231
Mon Aug 27 11:23:43 CEST 2018 (105905.00s ago)
xxx.xxx.xxx.xxx:47600
Mon Aug 27 11:29:07 CEST 2018 (105581.00s ago)
xxx.xxx.xxx.xxx:8919
Mon Aug 27 11:37:33 CEST 2018 (105076.00s ago)
xxx.xxx.xxx.xxx:11313
Mon Aug 27 11:37:30 CEST 2018 (105079.00s ago)
xxx.xxx.xxx.xxx:55125
Mon Aug 27 11:37:30 CEST 2018 (105079.00s ago)
xxx.xxx.xxx.xxx:55894
Mon Aug 27 11:09:01 CEST 2018 (106789.00s ago)
xxx.xxx.xxx.xxx:9254
Mon Aug 27 11:08:56 CEST 2018 (106795.00s ago)
xxx.xxx.xxx.xxx:9424
Mon Aug 27 12:31:08 CEST 2018 (101863.00s ago)
xxx.xxx.xxx.xxx:5936
Mon Aug 27 11:37:30 CEST 2018 (105081.00s ago)
xxx.xxx.xxx.xxx:52727
Mon Aug 27 11:37:30 CEST 2018 (105081.00s ago)
xxx.xxx.xxx.xxx:25074
Mon Aug 27 11:37:30 CEST 2018 (105082.00s ago)
xxx.xxx.xxx.xxx:52718
Mon Aug 27 12:31:08 CEST 2018 (101864.00s ago)
xxx.xxx.xxx.xxx:11314
Mon Aug 27 12:31:08 CEST 2018 (101864.00s ago)
xxx.xxx.xxx.xxx:29972
Mon Aug 27 12:31:08 CEST 2018 (101864.00s ago)
xxx.xxx.xxx.xxx:37362
Mon Aug 27 12:31:08 CEST 2018 (101865.00s ago)
xxx.xxx.xxx.xxx:54365
Mon Aug 27 11:23:43 CEST 2018 (105910.00s ago)
xxx.xxx.xxx.xxx:60080
The service was launched on my port 443, I didn't used any stress tool on it
There is still this "burst" pattern, with many socket created at the exact same time and that didn't get closed
I am unable to reproduce this behavior on localhost. Will see if I can consistently reproduce that on a remote server with a http stress tool
@neowutran is this with Os keep-alive or timeout?
For this test I just used the default settings
The source code for this test is: https://git.neowutran.ovh:8080/neowutran/Website/src/31e38fe5e3dabead906b3b2ff52a47e17a321295/src/main.rs
( & I used the latest version on master https://git.neowutran.ovh:8080/neowutran/Website/src/31e38fe5e3dabead906b3b2ff52a47e17a321295/Cargo.lock#L32 )
@neowutran could you test with keep alive timeout
Ok, will relaunch the service with this modification:
https://git.neowutran.ovh:8080/neowutran/Website/commit/5da6a499bd87d3ca4a91442fe3cfb6e2ebbef9e8#diff-04d0cca2a951158b1cd945c941bf6c630572e52R222
Seem ok ?
Looks good
Yup, same thing
[root@vps193339 ~]# ./sockets_uptime.sh 3600 443 2> /dev/null
min uptime in seconds for sockets: 3600
Tue Aug 28 19:44:57 CEST 2018 (76319.00s ago)
xxx.xxx.xxx.xxx:55605
Tue Aug 28 19:44:57 CEST 2018 (76320.00s ago)
xxx.xxx.xxx.xxx:49684
Tue Aug 28 19:44:57 CEST 2018 (76320.00s ago)
xxx.xxx.xxx.xxx:1379
Same condition:
The service was launched on my port 443, I didn't used any stress tool on it
another thing related: after a stress test, CPU consumption stay at 100% for a very long time
Do you see not closed connections?
Only ~50
@neowutran I have seen an issue with connections not being closed.
I have just setup a test environment (two computers on gigabit LAN) for stress testing. I can cause actix-web to have connections that are not closed (ESTABLISHED state on the server, gone on the client). I haven't seen this problem with localhost stress testing, even with 200,000 concurrent connections.
I have been able to cause it some times with:
wrk -c 4000 -d 10 -t 20 http://server:8080/
I happens more often with higher concurrent connection counts. Using 20 threads seems to help cause the problem, even though the client only has 2 cores + hyper-threading.
The hung connections that I was seeing were caused by dropped RST/FIN packets. The kernel will never closing these connections, until a new RST packet is sent from the client. That might never happen.
@neowutran Do you still have keepalive enabled? That should have closed those connection if they were idle. Maybe use tcpdump to check if the connection is idle for longer then the keepalive timeout.
@fafhrd91 I think the default KeepAlive::Os is not a good default. Connections can hang around for a very long time if the RST/FIN packets are lost.
@fafhrd91 Even with KeepAlive::Timeout(30) some connections are not closed.
@Neopallium when you stop server does actix reports not closed connections?
@fafhrd91 Yes. Each worker thread reported about 350 connections when actix-web was shutdown with Ctrl-C. actix-web = 0.7.6 release build.
The leaked connections doesn't always happen. High concurrent connection count (200,000) seems to be the easiest way to cause it.
I have been testing with:
wrk -c 200000 -d 60 -t 20 -s scripts/ports.lua http://server:8080/
scripts/ports.lua file changes the connecting port (8080 -> 8099) for each thread.
local next_port = nil
function setup(thread)
local port = next_port or wrk.port or 80
next_port = port + 1
local addrs = wrk.lookup(wrk.host, port)
thread.addr = addrs[1]
end
function init(args)
print(("thread addr: %s"):format(wrk.thread.addr))
end
The server is just the hello-world example changed to listen on ports 8080-8099. This make it easier to do high concurrent connection testing with just one client computer.
I will do some more testing tomorrow, to see if I can narrow down the problem.
@fafhrd91 It looks like the leaked connections are closed before sending a HTTP request and the RST/FIN packet is lost. The keep-alive timer isn't created until the connections protocol has been detected (http1/http2).
I think Actix-web should have a timeout for receiving the first request (nginx has a few timeouts like client_header_timeout, client_body_timeout). We can't rely on the OS always detecting a closed connection (unless TCP keep-alive is enabled).
@Neopallium do you think actix process does not get HUP event from kernel?
@Neopallium could test master again, i added slow request timer
@fafhrd91 based on tcpdumps (client and server side), I know that the server is not receiving the RST/FIN packet. So the socket on the server machine stays in ESTABLISHED state. Without some activity from the server side, the socket will just stay in the ESTABLISHED state, until the client tries to connect again with the same source port.
I did some testing with master. No leaked connections now.
@neowutran can you try master to see if it fixes the problem you are seeing.
Ok, trying master
restarted service, consuming 0.1% of server ram for the moment
waiting few hours
It randomly segfault now
Attaching valgrind to it
1 time it segfaulted after 10 seconds,
another time it segfaulter after 2 hours
New crash after few minutes
https://neowutran.ovh:8080/valgrind_segfault.err
No idea how, maybe related to the patch + rustls, but idk
trying to get a real core file
@neowutran Can you recompile in debug mode. It would provide easier to read stack dumps.
After reviewing that valgrind output, it looks like maybe the HttpChannel is being polled again right after it was freed.
Also looks like it was serving a static file. Maybe the keep-alive timeout expired while it was still sending the response? Not sure.
Source code: https://git.neowutran.ovh:8080/neowutran/Website/commit/0500025bc5288198c1caba29e0b9139b62e0319d
Binary:
https://neowutran.ovh:8080/others/debug/neowutran
Stderr of the binary:
https://neowutran.ovh:8080/others/debug/log.err
Core files of the binary:
https://neowutran.ovh:8080/others/debug/
After that, tryied with valgrind on the debug binary:
https://neowutran.ovh:8080/valgrind_debug.err
I'm sorry for interrupting (I know this was recognized as a bug).
I'm using 0.7.6. Running lsof -p <pid> prints a list of ESTABLISHED connections. In code I keep a counter in a simple middleware implementation, it's just 1-2 requests per second.
@anh-ba-khia You can try master. We are currently testing a fix for this issue.
@neowutran could you try again. but crash is really strange. i'll try to reproduce
@neowutran I am not able to recreate the crash. Is it only crashing when exposed to public traffic? or can you cause it to crash with a stress test?
@fafhrd91 @Neopallium
Ya, it segfault after some time ( when exposed to public traffic ), it can be seconds, minutes or hours
If you have some idea,
or if I need to setup some packet capture tools, to record everything & keep like the last minute before the segfault
I rebuilt with revision 03e318f44649aed6c00de19bdb93cc6b0377b1fd, and the program is currently serving request. I have a simple middle implementation to count live requests:
impl<S> Middleware<S> for LiveConnectionCounter {
fn start(&self, _: &HttpRequest<S>) -> actix_web::error::Result<middleware::Started> {
// This is a `static AtomicUsize`
LIVE_CONNECTION_COUNT.fetch_add(1, atomic::Ordering::Relaxed);
Ok(middleware::Started::Done)
}
fn response(&self, _: &HttpRequest<S>, response: HttpResponse) -> actix_web::error::Result<middleware::Response> {
Ok(middleware::Response::Done(response))
}
fn finish(&self, _: &HttpRequest<S>, _: &HttpResponse) -> middleware::Finished {
LIVE_CONNECTION_COUNT.fetch_sub(1, atomic::Ordering::Relaxed);
middleware::Finished::Done
}
}
May I ask if you could verify that it is a correct way to count live connections? If it is, then I think the bug is still there. Because (again, if above is a correct way to count live connections), then while it's just 1-2, but at a same time, lsof is showing over 100 ESTABLISHED TCP connections :-(
@neowutran could you try again, i forgot to push my changes
@anh-ba-khia your code counts requests, you need connections
Different, but still memory corruption / segfault
https://neowutran.ovh:8080/valgrind_debug2.err
Crashed in few seconds with live traffic
Oh, something is really broken. Could you try OpenSSL instead of rustls
same, with openssl it still segfault in few seconds
@anh-ba-khia Your counter is for live requests. Do you have keep_alive enabled? The ESTABLISHED connections you are seeing might still be active or haven't timed out yet.
@neowutran A packet capture might help narrow down what type of request causes the crash.
Maybe a large POST request? I haven't tried stress testing with slow POST requests.
@Neopallium
Worst case scenario, I could just log everything with TCPDump until the segfault, but idk if it will help with the amount of data exchanged on the server. Unless we know what are we looking for
The only "unusual" thing the service is doing is serving large files ( ~40mo )
@neowutran wait, I’ll rewrite how actix manages active connections
@Neopallium Thank you. At first I didn't enable keep-alive. But I discovered thousands of established connections which made the program hang. Then I searched and found this issue. After reading some comments, I thought adding keep-alive would help closing those connections. Then I added a small value of 10 seconds. I'll remove keep-alive, try again and will send you update later.
I don't have crashes like Yukikoo's issue. Maybe it's because my program is very simple, it just serves json requests. (An amateur web service made by me, :-D)
@fafhrd91 node.remove() doesn't update the prev pointer in the next node.
The next node will still be pointing to the deleted channel.
@anh-ba-khia The default keep_alive is now 5 seconds. If you have too many open connections, then either set it to 1 second, or KeepAlive::Disabled
🤦♂️
@neowutran could you try master again
@fafhrd91 I was able to recreate the memory access error. It seems to still happen:
https://gist.github.com/Neopallium/ba133e7430d9398e0c7e3fccc3302000
To recreate the issue I run 3 instances of wrk, then kill the second one. This will close the connections in the middle of the linked list.
I don't understand how the invalid write can still be happening.
Last commit should fix it. Do you still see access error?
might have had a stale Cargo.lock file. rebuilding now.
the problem is still there. tested with commit 76410356
I have updated the gist with the Cargo.toml and src/main.rs that I used for testing.
https://gist.github.com/Neopallium/ba133e7430d9398e0c7e3fccc3302000
to recreate issue:
valgrind ./target/debug/hello-worldwrk -t 1 -c 10 -d 90 http://localhost:8080/@Neopallium thanks, that helped. last commit fixes access error
I don't know if this is related, but I was trying to do the websocket tutorial/examples and implement a chat, and found that my server code wasn't cleaning up clients when they disconnected. Even when running the websocket example, the StreamHandler's finished() method doesn't ever get called. That is, if I add a finished() method to examples/websocket/src/main.rs, similar to the client one, and run:
cd examples/websocket
cargo run --bin server
and also
cd examples/websocket
cargo run --bin client
and then hit CTRL-C on the client, then the server doesn't notice. Whereas if I hit CTRL-C on the server, the client does get a "Server disconnected" message within a few seconds, meaning finished() was called.
Also I notice now that the server is using WebsocketContext, while the client is using a regular Context. (Shouldn't the client also use WebsocketContext?)
This also happens with current actix-web/master / 0.7.8 and actix 0.7.0. I also tried actix master, but then the example doesn't compile anymore.
Should I open a separate issue for this?
edit: As for connections, netstat shows that the client closes its half of the socket for writing / sends a FIN (FIN_WAIT2 state) while the server is keeping its half open (CLOSE_WAIT state).
@sapir let’s use separate issue
ok thanks
@neowutran can you try the latest release 0.7.8.
@Neopallium My server uses 0.7.8. It ran for about 4 hours. I called lsof and grab 284 ESTABLISHED TCP connections. There were about total 28k+ requests (I take from logs). (I'm sorry for interrupting.)
I trust my logs that there are just 1-2 requests/second. So I can afford restarts if necessary. I'll keep my server run for some more hours, and post results here. Normally it can run for several days until it hangs up :-D
@anh-ba-khia What keep_alive setting are you using? If you are using KeepAlive::Disabled, then try again with a 1 second keep_alive.
@fafhrd91 If keep_alive is disabled, then the slow request timeout would also be disabled? I think it would be better to have a separate setting for the slow request timeout. Some people might want to disable keep_alive, but the slow request timeout should still be used.
@Neopallium makes sense
@Neopallium I'm not touching keep-alive using startup, so it uses default 5 seconds. I'll try again with 1 second.
I have some new results:
lsof -p <pid>: 932cat /proc/<pid>/status, FDSize field is 2048.FDSize is _"Number of file descriptor slots currently allocated."_ I got that description from man 5 proc.
I wish I could help more. So if you want any information, please ask me.
@anh-ba-khia can you do a tcpdump for one of the long lived connections? Run lsof -n -P -p <pid> twice about a minute apart to find an old socket. Then run tcpdump -n -i eth0 port <client port> and host <client ip>. This would help to see if the socket is actively making requests to keep from being closed. If the client-side is idle or had already closed the socket, then there shouldn't be any output from tcpdump.
I have been thinking of adding support for dumping the status of all active http connections in actix-web, it would help with debugging hang connections. I already have some code written, just need to clean it up and send a PR.
After 10h:
No segfault for the moment
No socket unclosed for the moment
Most memory leak seems to be fixed ( there are still memory leak ( aka: memory consumption increase over time) , but less than before )
Don't close the ticket yet, the service is still running, I want to let it run for few days before saying that the connection issue is fixed
And only tested with HTTP1, didn't tested HTTP2 yet
But after I will probably open a new ticket about memory with a new valgrind output
@Neopallium Thank you. I followed your instruction and ran that command on 5 different clients. The first lines are exactly the same:
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
For all 5 results, I've waited for about 1 minute and pressed Ctrl-C to stop the command. They all have 0 packets captured and 0 packets dropped by kernel. Some of them have 2, 3 or 0 packets received by filter.
@anh-ba-khia Then those are stuck connections.
The KeepAlive timer is stopped after receiving the request headers and only resumed when the request is finished.
Some possible reason for the stuck connections:
Thanks @Neopallium , that helps a lot. Hopefully I'll resolve my issue soon. I think my issue is not related to this one. And since Yukikoo confirmed that new version has been running well, I guess this issue is solved.
Thanks again for your help :-)
@anh-ba-khia You could adapt your LiveConnectionCounter middleware to detect stuck requests. This might be useful for other people too. I had been thinking about adding the request timer to the actix-web core, but I think a middleware that can be inserted as needed would be better.
sadly, still have some unclosed connections
Wed Sep 19 06:27:45 CEST 2018 (85094.00s ago)
yyy.yyy.yyy.yyy:59976
Wed Sep 19 06:27:47 CEST 2018 (85093.00s ago)
zzz.zzz.zzz.zzz:58338
Wed Sep 19 06:27:45 CEST 2018 (85095.00s ago)
zzz.zzz.zzz.zzz:58244
Wed Sep 19 06:27:45 CEST 2018 (85096.00s ago)
yyy.yyy.yyy.yyy:59975
Wed Sep 19 06:27:45 CEST 2018 (85097.00s ago)
xxx.xxx.xxx.xxx:29109
( only 3 different IP, xxx, zzz and yyy )
Idk if it can be usefull, but I only got unclosed connection on the service used for download (port 443)
For the moment don't have any unclosed connection on the service used for upload (port 8083)
On the memory consumption side:
Both services started at 0.1% of server ram (~10mo)
Now the service on port 443 is at 0.2% of ram (~20mo, ~x2 on memory consumption since the beginning)
The service on port 8083 is at 5.9% of ram (~1.5Go, ~x59 on memory consumption since the beginning)
Calling lsof takes very long time. So I coded a small program to count established TCP connections, along with oldest and newest connection timestamps.
It works via:
/proc/<pid>/net/tcp to find established TCP inodes./proc/<pid>/fd/ which are sockets _and_ have inodes belonged to above list.Currently it finds established TCP inodes in first step. You can change its behaviour if you want. I learned some constants from https://oroboro.com/finding-state-tcpip-sockets-linux/
Usage
It accepts systemd service name, or process ID. For example with my Firefox:
$ socket-counter 6497
"6497" -> 3 sockets
Oldest opened socket: 0 days 08:13:28
Newest opened socket: 0 days 08:13:28
Or via service name:
$ socket-counter some-service.service
I hope the program can help: https://github.com/anh-ba-khia/socket-counter
[root@vps193339 Website]# /root/sockets_uptime.sh 3600 443
min uptime in seconds for sockets: 3600
Thu Sep 20 06:05:42 CEST 2018 (268084.00s ago)
XXX.XXX.XXX.XXX:57134
Wed Sep 19 06:27:47 CEST 2018 (353160.00s ago)
XXX.XXX.XXX.XXX:58338
Wed Sep 19 06:27:45 CEST 2018 (353162.00s ago)
XXX.XXX.XXX.XXX:24112
Thu Sep 20 06:06:12 CEST 2018 (268055.00s ago)
XXX.XXX.XXX.XXX:5532
Wed Sep 19 06:27:45 CEST 2018 (353162.00s ago)
XXX.XXX.XXX.XXX:58244
Wed Sep 19 06:27:45 CEST 2018 (353163.00s ago)
XXX.XXX.XXX.XXX:50719
Wed Sep 19 06:27:45 CEST 2018 (353163.00s ago)
XXX.XXX.XXX.XXX:29109
Wed Sep 19 06:27:45 CEST 2018 (353164.00s ago)
XXX.XXX.XXX.XXX:53451
Thu Sep 20 06:06:00 CEST 2018 (268070.00s ago)
XXX.XXX.XXX.XXX:5533
Thu Sep 20 06:05:42 CEST 2018 (268087.00s ago)
XXX.XXX.XXX.XXX:24113
@neowutran still only leaked connections on port 443 and not 8083?
Also would it be possible to enable access logging on that port to narrow down what kind of requests are happening on the leaked connections?
I am traveling for the next two weeks. So I will not be able to help much with debugging.
I think this is the same problem. We have to start slow request timer before ssl negotiation, which is not the case at the moment
Another memory leak, probably related to this issue
https://neowutran.ovh/upload_leak.log
Live traffic, mostly upload ( files upload / json upload )
I have some updates about connections not being closed correctly.
I made some simple protocol over tcp, via tokio + futures crates. That means I replaced actix-web and rolled my own (simple) protocol. I couldn't test with wrk, so I ran my own tests. There were just about 500 requests/second. I know my implementation is poorly coded. (Last time I ran wrk on my implementation with actix-web, it handled up to 20k requests/second.)
However the thing is, no matter how many tests I ran, there were absolutely no unclosed connections when the tests finished. I ran tests again, again and again, and again... There were no unclosed connections.
Then, I used that for production code. When time goes, they appear and go up (the unclosed connections).
I just can guess that there might be some bugs in tokio or futures crates.
Not sure if this is useful, but I think I have a similar issue. Repro in this repo https://github.com/dnagir/rust-memomry-leak
But as others pointed out - it might be reproducible with a much smaller example.
@dnagir thats for test. i tried to test with system allocator, i dont see leak with stress-alpine. Could you check?
#![feature(alloc_system, allocator_api)]
extern crate alloc_system;
use alloc_system::System;
#[global_allocator]
static A: System = System;
@neowutran could you try master but with system allocator?
@fafhrd91 already what I am using, unless you mean that you need the "allocator_api"
@Neopallium For the socket not correctly closed that I have, nothing have been registered by the actix middleware. So it never reached it.
Maybe I can activate logs on the firewall level to see what kind of tcp packet is causing the issue
@neowutran yes, "allocator_api". master contains extra timer for ssl handshake, similar to slow request timeout in 0.7.8, I merged it last Friday.
@fafhrd91
Update this morning,
sadly, a lot of new leaks ( from 0.1% of server ram to 12% in ~5 hours)
and at least 1 new bug -> files download cannot complete, it goes to 99% then get stuck indefinitly
https://neowutran.ovh:8080/valgrind_debug5.err
trying again with lastest changes on master
@neowutran could you update from master again.
@fafhrd91
It leak way less memory, and the download bug is fixed
However, after few days with live traffic, I have few hundred unclosed connections - 200+ - ( and memory still leak, but slower )
...
Thu Oct 4 17:57:14 CEST 2018 (303206.00s ago)
XXX.XXX.XX.XXX:41003
Thu Oct 4 17:47:21 CEST 2018 (303800.00s ago)
XXX.XXX.XX.XXX:53984
Thu Oct 4 17:47:21 CEST 2018 (303800.00s ago)
XXX.XXX.XX.XXX:59036
Thu Oct 4 17:47:21 CEST 2018 (303800.00s ago)
XXX.XXX.XX.XXX:57494
Thu Oct 4 17:47:21 CEST 2018 (303801.00s ago)
XXX.XXX.XX.XXX:3255
Thu Oct 4 17:47:21 CEST 2018 (303801.00s ago)
XXX.XXX.XX.XXX:63601
Thu Oct 4 17:47:21 CEST 2018 (303801.00s ago)
XXX.XXX.XX.XXX:57461
Thu Oct 4 17:47:21 CEST 2018 (303801.00s ago)
XXX.XXX.XX.XXX:57460
Thu Oct 4 17:47:21 CEST 2018 (303801.00s ago)
XXX.XXX.XX.XXX:51997
Thu Oct 4 17:50:01 CEST 2018 (303641.00s ago)
XXX.XXX.XX.XXX:60789
Thu Oct 4 17:47:21 CEST 2018 (303802.00s ago)
XXX.XXX.XX.XXX:51863
Thu Oct 4 17:47:21 CEST 2018 (303804.00s ago)
XXX.XXX.XX.XXX:1488
Thu Oct 4 17:47:21 CEST 2018 (303804.00s ago)
XXX.XXX.XX.XXX:58210
Thu Oct 4 17:47:21 CEST 2018 (303804.00s ago)
XXX.XXX.XX.XXX:24316
Thu Oct 4 17:47:21 CEST 2018 (303804.00s ago)
XXX.XXX.XX.XXX:53985
Thu Oct 4 17:47:21 CEST 2018 (303805.00s ago)
XXX.XXX.XX.XXX:49394
Thu Oct 4 17:47:21 CEST 2018 (303805.00s ago)
XXX.XXX.XX.XXX:61586
Thu Oct 4 17:47:21 CEST 2018 (303805.00s ago)
XXX.XXX.XX.XXX:1027
Thu Oct 4 17:47:21 CEST 2018 (303805.00s ago)
XXX.XXX.XX.XXX:2530
Thu Oct 4 17:47:21 CEST 2018 (303806.00s ago)
XXX.XXX.XX.XXX:9384
Thu Oct 4 17:47:21 CEST 2018 (303806.00s ago)
XXX.XXX.XX.XXX:5169
Thu Oct 4 17:47:21 CEST 2018 (303806.00s ago)
XXX.XXX.XX.XXX:49586
Thu Oct 4 17:47:21 CEST 2018 (303807.00s ago)
XXX.XXX.XX.XXX:56157
Thu Oct 4 17:47:21 CEST 2018 (303807.00s ago)
XXX.XXX.XX.XXX:50240
Thu Oct 4 17:57:10 CEST 2018 (303218.00s ago)
XXX.XXX.XX.XXX:59708
Thu Oct 4 17:47:21 CEST 2018 (303808.00s ago)
XXX.XXX.XX.XXX:26615
Thu Oct 4 17:47:21 CEST 2018 (303808.00s ago)
XXX.XXX.XX.XXX:50059
October 4, 17h match with a very high number of users connected to the server. The 200+ unclosed connections are at 17h47 +- few minutes
Same pattern: Multiple unclosed connection at the exact same time
Reached ~500 unclosed connection with live traffic in ~5 days.
some log retrieved with valgrind:
https://neowutran.ovh/valgrind_debug7.err
I needed to do a kill -9 on the process, it was completly stuck
~~ no idea how to help testing this issue
try latest master
Tried latest master, same issue
With live traffic I reached around 300 unclosed connections in 2 days, same "burst" pattern ( a lot of unclosed connection where created at the exact same time )
Valgrind output: https://neowutran.ovh/valgrind_debug8.err
Using a very, very basic implementation with the RUST_LOG environment variable set to debug on the latest version of Actix throws this error on every request, even on a simple plaintext route.
actix_web::server::KeepAlive::Timeout(5) does not send TCP FIN in 0.7.13
code
extern crate actix_web;
extern crate env_logger;
extern crate futures;
#[macro_use]
extern crate log;
extern crate failure;
extern crate tokio_timer;
use actix_web::middleware::{Logger};
use futures::prelude::*;
use actix_web::http::{StatusCode, Method};
use actix_web::{HttpRequest, HttpResponse};
use failure::Error;
fn main(){
::env_logger::init();
let srv = ::actix_web::server::new(move ||{
::actix_web::App::new()
.middleware(Logger::default())
.resource("/", |r|{ r.method(Method::GET).with_async(get_index); })
})
.keep_alive(::actix_web::server::KeepAlive::Timeout(5)) // change here
.bind("0.0.0.0:8081")
.unwrap();
srv.run();
}
/// GET /
pub fn get_index(_: HttpRequest) -> impl Future<Item=HttpResponse, Error=Error> {
::tokio_timer::sleep(::std::time::Duration::from_secs(0)).map(|()|{
HttpResponse::Ok()
.header("content-type", "text/plain; charset=utf-8")
.body("🐲")
}).map_err(Into::into)
}
telnet results, tcpdumps and trace logs
actix_web::server::KeepAlive::Timeout(5)
$ telnet localhost 8081
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.1
HTTP/1.1 200 OK
content-length: 4
content-type: text/plain; charset=utf-8
date: Wed, 31 Oct 2018 12:50:47 GMT
🐲GET / HTTP/1.1
Connection closed by foreign host.
TRACE 2018-10-31T12:50:30Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:50:30Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:50:30Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:50:30Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:50:30Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:50:30Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:50:30Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:50:30Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
INFO 2018-10-31T12:50:47Z: actix_web::middleware::logger: 127.0.0.1:47654 "GET / HTTP/1.1" 200 4 "-" "-" 0.002296
TRACE 2018-10-31T12:50:58Z: actix_web::server::h1: Keep-alive timeout, close connection
$ sudo tcpdump -vvv -s 0 -nX -i lo -r lo-actix-keepalive-timeout-5 port 8081
reading from file lo-actix-keepalive-timeout-5, link-type EN10MB (Ethernet)
21:50:46.369544 IP (tos 0x10, ttl 64, id 35464, offset 0, flags [DF], proto TCP (6), length 60)
127.0.0.1.47654 > 127.0.0.1.8081: Flags [S], cksum 0xfe30 (incorrect -> 0x4601), seq 3217823845, win 43690, options [mss 65495,sackOK,TS val 726696436 ecr 0,nop,wscale 7], length 0
0x0000: 4510 003c 8a88 4000 4006 b221 7f00 0001 E..<..@.@..!....
0x0010: 7f00 0001 ba26 1f91 bfcc 1865 0000 0000 .....&.....e....
0x0020: a002 aaaa fe30 0000 0204 ffd7 0402 080a .....0..........
0x0030: 2b50 81f4 0000 0000 0103 0307 +P..........
21:50:46.369576 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
127.0.0.1.8081 > 127.0.0.1.47654: Flags [S.], cksum 0xfe30 (incorrect -> 0xb220), seq 1341560468, ack 3217823846, win 43690, options [mss 65495,sackOK,TS val 726696436 ecr 726696436,nop,wscale 7], length 0
0x0000: 4500 003c 0000 4000 4006 3cba 7f00 0001 E..<..@.@.<.....
0x0010: 7f00 0001 1f91 ba26 4ff6 9694 bfcc 1866 .......&O......f
0x0020: a012 aaaa fe30 0000 0204 ffd7 0402 080a .....0..........
0x0030: 2b50 81f4 2b50 81f4 0103 0307 +P..+P......
21:50:46.369606 IP (tos 0x10, ttl 64, id 35465, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.47654 > 127.0.0.1.8081: Flags [.], cksum 0xfe28 (incorrect -> 0x8465), seq 1, ack 1, win 342, options [nop,nop,TS val 726696436 ecr 726696436], length 0
0x0000: 4510 0034 8a89 4000 4006 b228 7f00 0001 E..4..@.@..(....
0x0010: 7f00 0001 ba26 1f91 bfcc 1866 4ff6 9695 .....&.....fO...
0x0020: 8010 0156 fe28 0000 0101 080a 2b50 81f4 ...V.(......+P..
0x0030: 2b50 81f4 +P..
21:50:47.156359 IP (tos 0x10, ttl 64, id 35466, offset 0, flags [DF], proto TCP (6), length 68)
127.0.0.1.47654 > 127.0.0.1.8081: Flags [P.], cksum 0xfe38 (incorrect -> 0xafa3), seq 1:17, ack 1, win 342, options [nop,nop,TS val 726697223 ecr 726696436], length 16
0x0000: 4510 0044 8a8a 4000 4006 b217 7f00 0001 E..D..@.@.......
0x0010: 7f00 0001 ba26 1f91 bfcc 1866 4ff6 9695 .....&.....fO...
0x0020: 8018 0156 fe38 0000 0101 080a 2b50 8507 ...V.8......+P..
0x0030: 2b50 81f4 4745 5420 2f20 4854 5450 2f31 +P..GET./.HTTP/1
0x0040: 2e31 0d0a .1..
21:50:47.156414 IP (tos 0x0, ttl 64, id 1144, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.8081 > 127.0.0.1.47654: Flags [.], cksum 0xfe28 (incorrect -> 0x7e2f), seq 1, ack 17, win 342, options [nop,nop,TS val 726697223 ecr 726697223], length 0
0x0000: 4500 0034 0478 4000 4006 384a 7f00 0001 E..4.x@[email protected]....
0x0010: 7f00 0001 1f91 ba26 4ff6 9695 bfcc 1876 .......&O......v
0x0020: 8010 0156 fe28 0000 0101 080a 2b50 8507 ...V.(......+P..
0x0030: 2b50 8507 +P..
21:50:47.951633 IP (tos 0x10, ttl 64, id 35467, offset 0, flags [DF], proto TCP (6), length 54)
127.0.0.1.47654 > 127.0.0.1.8081: Flags [P.], cksum 0xfe2a (incorrect -> 0x6e00), seq 17:19, ack 1, win 342, options [nop,nop,TS val 726698018 ecr 726697223], length 2
0x0000: 4510 0036 8a8b 4000 4006 b224 7f00 0001 E..6..@.@..$....
0x0010: 7f00 0001 ba26 1f91 bfcc 1876 4ff6 9695 .....&.....vO...
0x0020: 8018 0156 fe2a 0000 0101 080a 2b50 8822 ...V.*......+P."
0x0030: 2b50 8507 0d0a +P....
21:50:47.951675 IP (tos 0x0, ttl 64, id 1145, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.8081 > 127.0.0.1.47654: Flags [.], cksum 0xfe28 (incorrect -> 0x77f7), seq 1, ack 19, win 342, options [nop,nop,TS val 726698018 ecr 726698018], length 0
0x0000: 4500 0034 0479 4000 4006 3849 7f00 0001 E..4.y@[email protected]....
0x0010: 7f00 0001 1f91 ba26 4ff6 9695 bfcc 1878 .......&O......x
0x0020: 8010 0156 fe28 0000 0101 080a 2b50 8822 ...V.(......+P."
0x0030: 2b50 8822 +P."
21:50:47.954831 IP (tos 0x0, ttl 64, id 1146, offset 0, flags [DF], proto TCP (6), length 172)
127.0.0.1.8081 > 127.0.0.1.47654: Flags [P.], cksum 0xfea0 (incorrect -> 0x8214), seq 1:121, ack 19, win 342, options [nop,nop,TS val 726698021 ecr 726698018], length 120
0x0000: 4500 00ac 047a 4000 4006 37d0 7f00 0001 E....z@[email protected].....
0x0010: 7f00 0001 1f91 ba26 4ff6 9695 bfcc 1878 .......&O......x
0x0020: 8018 0156 fea0 0000 0101 080a 2b50 8825 ...V........+P.%
0x0030: 2b50 8822 4854 5450 2f31 2e31 2032 3030 +P."HTTP/1.1.200
0x0040: 204f 4b0d 0a63 6f6e 7465 6e74 2d6c 656e .OK..content-len
0x0050: 6774 683a 2034 0d0a 636f 6e74 656e 742d gth:.4..content-
0x0060: 7479 7065 3a20 7465 7874 2f70 6c61 696e type:.text/plain
0x0070: 3b20 6368 6172 7365 743d 7574 662d 380d ;.charset=utf-8.
0x0080: 0a64 6174 653a 2057 6564 2c20 3331 204f .date:.Wed,.31.O
0x0090: 6374 2032 3031 3820 3132 3a35 303a 3437 ct.2018.12:50:47
0x00a0: 2047 4d54 0d0a 0d0a f09f 90b2 .GMT........
21:50:47.954858 IP (tos 0x10, ttl 64, id 35468, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.47654 > 127.0.0.1.8081: Flags [.], cksum 0xfe28 (incorrect -> 0x7779), seq 19, ack 121, win 342, options [nop,nop,TS val 726698021 ecr 726698021], length 0
0x0000: 4510 0034 8a8c 4000 4006 b225 7f00 0001 E..4..@.@..%....
0x0010: 7f00 0001 ba26 1f91 bfcc 1878 4ff6 970d .....&.....xO...
0x0020: 8010 0156 fe28 0000 0101 080a 2b50 8825 ...V.(......+P.%
0x0030: 2b50 8825 +P.%
21:50:58.624055 IP (tos 0x10, ttl 64, id 35469, offset 0, flags [DF], proto TCP (6), length 68)
127.0.0.1.47654 > 127.0.0.1.8081: Flags [P.], cksum 0xfe38 (incorrect -> 0x7c1d), seq 19:35, ack 121, win 342, options [nop,nop,TS val 726708690 ecr 726698021], length 16
0x0000: 4510 0044 8a8d 4000 4006 b214 7f00 0001 E..D..@.@.......
0x0010: 7f00 0001 ba26 1f91 bfcc 1878 4ff6 970d .....&.....xO...
0x0020: 8018 0156 fe38 0000 0101 080a 2b50 b1d2 ...V.8......+P..
0x0030: 2b50 8825 4745 5420 2f20 4854 5450 2f31 +P.%GET./.HTTP/1
0x0040: 2e31 0d0a .1..
21:50:58.624692 IP (tos 0x0, ttl 64, id 1147, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.8081 > 127.0.0.1.47654: Flags [R.], cksum 0xfe28 (incorrect -> 0x240a), seq 121, ack 35, win 342, options [nop,nop,TS val 726708691 ecr 726708690], length 0
0x0000: 4500 0034 047b 4000 4006 3847 7f00 0001 E..4.{@[email protected]....
0x0010: 7f00 0001 1f91 ba26 4ff6 970d bfcc 1888 .......&O.......
0x0020: 8014 0156 fe28 0000 0101 080a 2b50 b1d3 ...V.(......+P..
0x0030: 2b50 b1d2 +P..
actix_web::server::KeepAlive::Os
$ telnet localhost 8081
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.1
HTTP/1.1 200 OK
content-length: 4
content-type: text/plain; charset=utf-8
date: Wed, 31 Oct 2018 12:51:38 GMT
🐲Connection closed by foreign host.
TRACE 2018-10-31T12:51:34Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:51:34Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:51:34Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:51:34Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:51:34Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:51:34Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:51:34Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:51:34Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
INFO 2018-10-31T12:51:38Z: actix_web::middleware::logger: 127.0.0.1:47666 "GET / HTTP/1.1" 200 4 "-" "-" 0.002420
TRACE 2018-10-31T12:51:41Z: actix_web::server::h1: Keep-alive timeout, close connection
$ sudo tcpdump -vvv -s 0 -nX -i lo -r lo-actix-keepalive-os port 8081
reading from file lo-actix-keepalive-os, link-type EN10MB (Ethernet)
21:51:36.052822 IP (tos 0x10, ttl 64, id 46588, offset 0, flags [DF], proto TCP (6), length 60)
127.0.0.1.47666 > 127.0.0.1.8081: Flags [S], cksum 0xfe30 (incorrect -> 0x88c7), seq 1555920526, win 43690, options [mss 65495,sackOK,TS val 726746119 ecr 0,nop,wscale 7], length 0
0x0000: 4510 003c b5fc 4000 4006 86ad 7f00 0001 E..<..@.@.......
0x0010: 7f00 0001 ba32 1f91 5cbd 768e 0000 0000 .....2..\.v.....
0x0020: a002 aaaa fe30 0000 0204 ffd7 0402 080a .....0..........
0x0030: 2b51 4407 0000 0000 0103 0307 +QD.........
21:51:36.052854 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
127.0.0.1.8081 > 127.0.0.1.47666: Flags [S.], cksum 0xfe30 (incorrect -> 0x4f6b), seq 978751388, ack 1555920527, win 43690, options [mss 65495,sackOK,TS val 726746119 ecr 726746119,nop,wscale 7], length 0
0x0000: 4500 003c 0000 4000 4006 3cba 7f00 0001 E..<..@.@.<.....
0x0010: 7f00 0001 1f91 ba32 3a56 8f9c 5cbd 768f .......2:V..\.v.
0x0020: a012 aaaa fe30 0000 0204 ffd7 0402 080a .....0..........
0x0030: 2b51 4407 2b51 4407 0103 0307 +QD.+QD.....
21:51:36.052883 IP (tos 0x10, ttl 64, id 46589, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.47666 > 127.0.0.1.8081: Flags [.], cksum 0xfe28 (incorrect -> 0x21b0), seq 1, ack 1, win 342, options [nop,nop,TS val 726746119 ecr 726746119], length 0
0x0000: 4510 0034 b5fd 4000 4006 86b4 7f00 0001 E..4..@.@.......
0x0010: 7f00 0001 ba32 1f91 5cbd 768f 3a56 8f9d .....2..\.v.:V..
0x0020: 8010 0156 fe28 0000 0101 080a 2b51 4407 ...V.(......+QD.
0x0030: 2b51 4407 +QD.
21:51:36.719542 IP (tos 0x10, ttl 64, id 46590, offset 0, flags [DF], proto TCP (6), length 68)
127.0.0.1.47666 > 127.0.0.1.8081: Flags [P.], cksum 0xfe38 (incorrect -> 0x4d66), seq 1:17, ack 1, win 342, options [nop,nop,TS val 726746786 ecr 726746119], length 16
0x0000: 4510 0044 b5fe 4000 4006 86a3 7f00 0001 E..D..@.@.......
0x0010: 7f00 0001 ba32 1f91 5cbd 768f 3a56 8f9d .....2..\.v.:V..
0x0020: 8018 0156 fe38 0000 0101 080a 2b51 46a2 ...V.8......+QF.
0x0030: 2b51 4407 4745 5420 2f20 4854 5450 2f31 +QD.GET./.HTTP/1
0x0040: 2e31 0d0a .1..
21:51:36.719593 IP (tos 0x0, ttl 64, id 19695, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.8081 > 127.0.0.1.47666: Flags [.], cksum 0xfe28 (incorrect -> 0x1c6a), seq 1, ack 17, win 342, options [nop,nop,TS val 726746786 ecr 726746786], length 0
0x0000: 4500 0034 4cef 4000 4006 efd2 7f00 0001 E..4L.@.@.......
0x0010: 7f00 0001 1f91 ba32 3a56 8f9d 5cbd 769f .......2:V..\.v.
0x0020: 8010 0156 fe28 0000 0101 080a 2b51 46a2 ...V.(......+QF.
0x0030: 2b51 46a2 +QF.
21:51:38.218838 IP (tos 0x10, ttl 64, id 46591, offset 0, flags [DF], proto TCP (6), length 54)
127.0.0.1.47666 > 127.0.0.1.8081: Flags [P.], cksum 0xfe2a (incorrect -> 0x097b), seq 17:19, ack 1, win 342, options [nop,nop,TS val 726748285 ecr 726746786], length 2
0x0000: 4510 0036 b5ff 4000 4006 86b0 7f00 0001 E..6..@.@.......
0x0010: 7f00 0001 ba32 1f91 5cbd 769f 3a56 8f9d .....2..\.v.:V..
0x0020: 8018 0156 fe2a 0000 0101 080a 2b51 4c7d ...V.*......+QL}
0x0030: 2b51 46a2 0d0a +QF...
21:51:38.218870 IP (tos 0x0, ttl 64, id 19696, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.8081 > 127.0.0.1.47666: Flags [.], cksum 0xfe28 (incorrect -> 0x10b2), seq 1, ack 19, win 342, options [nop,nop,TS val 726748285 ecr 726748285], length 0
0x0000: 4500 0034 4cf0 4000 4006 efd1 7f00 0001 E..4L.@.@.......
0x0010: 7f00 0001 1f91 ba32 3a56 8f9d 5cbd 76a1 .......2:V..\.v.
0x0020: 8010 0156 fe28 0000 0101 080a 2b51 4c7d ...V.(......+QL}
0x0030: 2b51 4c7d +QL}
21:51:38.222134 IP (tos 0x0, ttl 64, id 19697, offset 0, flags [DF], proto TCP (6), length 172)
127.0.0.1.8081 > 127.0.0.1.47666: Flags [P.], cksum 0xfea0 (incorrect -> 0x1ace), seq 1:121, ack 19, win 342, options [nop,nop,TS val 726748288 ecr 726748285], length 120
0x0000: 4500 00ac 4cf1 4000 4006 ef58 7f00 0001 E...L.@[email protected]....
0x0010: 7f00 0001 1f91 ba32 3a56 8f9d 5cbd 76a1 .......2:V..\.v.
0x0020: 8018 0156 fea0 0000 0101 080a 2b51 4c80 ...V........+QL.
0x0030: 2b51 4c7d 4854 5450 2f31 2e31 2032 3030 +QL}HTTP/1.1.200
0x0040: 204f 4b0d 0a63 6f6e 7465 6e74 2d6c 656e .OK..content-len
0x0050: 6774 683a 2034 0d0a 636f 6e74 656e 742d gth:.4..content-
0x0060: 7479 7065 3a20 7465 7874 2f70 6c61 696e type:.text/plain
0x0070: 3b20 6368 6172 7365 743d 7574 662d 380d ;.charset=utf-8.
0x0080: 0a64 6174 653a 2057 6564 2c20 3331 204f .date:.Wed,.31.O
0x0090: 6374 2032 3031 3820 3132 3a35 313a 3338 ct.2018.12:51:38
0x00a0: 2047 4d54 0d0a 0d0a f09f 90b2 .GMT........
21:51:38.222169 IP (tos 0x10, ttl 64, id 46592, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.47666 > 127.0.0.1.8081: Flags [.], cksum 0xfe28 (incorrect -> 0x1034), seq 19, ack 121, win 342, options [nop,nop,TS val 726748288 ecr 726748288], length 0
0x0000: 4510 0034 b600 4000 4006 86b1 7f00 0001 E..4..@.@.......
0x0010: 7f00 0001 ba32 1f91 5cbd 76a1 3a56 9015 .....2..\.v.:V..
0x0020: 8010 0156 fe28 0000 0101 080a 2b51 4c80 ...V.(......+QL.
0x0030: 2b51 4c80 +QL.
21:51:41.055305 IP (tos 0x0, ttl 64, id 19698, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.8081 > 127.0.0.1.47666: Flags [F.], cksum 0xfe28 (incorrect -> 0x0521), seq 121, ack 19, win 342, options [nop,nop,TS val 726751122 ecr 726748288], length 0
0x0000: 4500 0034 4cf2 4000 4006 efcf 7f00 0001 E..4L.@.@.......
0x0010: 7f00 0001 1f91 ba32 3a56 9015 5cbd 76a1 .......2:V..\.v.
0x0020: 8011 0156 fe28 0000 0101 080a 2b51 5792 ...V.(......+QW.
0x0030: 2b51 4c80 +QL.
21:51:41.055549 IP (tos 0x10, ttl 64, id 46593, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.47666 > 127.0.0.1.8081: Flags [F.], cksum 0xfe28 (incorrect -> 0xfa0d), seq 19, ack 122, win 342, options [nop,nop,TS val 726751122 ecr 726751122], length 0
0x0000: 4510 0034 b601 4000 4006 86b0 7f00 0001 E..4..@.@.......
0x0010: 7f00 0001 ba32 1f91 5cbd 76a1 3a56 9016 .....2..\.v.:V..
0x0020: 8011 0156 fe28 0000 0101 080a 2b51 5792 ...V.(......+QW.
0x0030: 2b51 5792 +QW.
21:51:41.055601 IP (tos 0x0, ttl 64, id 19699, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.8081 > 127.0.0.1.47666: Flags [.], cksum 0xfe28 (incorrect -> 0xfa0d), seq 122, ack 20, win 342, options [nop,nop,TS val 726751122 ecr 726751122], length 0
0x0000: 4500 0034 4cf3 4000 4006 efce 7f00 0001 E..4L.@.@.......
0x0010: 7f00 0001 1f91 ba32 3a56 9016 5cbd 76a2 .......2:V..\.v.
0x0020: 8010 0156 fe28 0000 0101 080a 2b51 5792 ...V.(......+QW.
0x0030: 2b51 5792 +QW.
actix_web::server::KeepAlive::Disabled (without Conection: close)
$ telnet localhost 8081
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.1
HTTP/1.1 200 OK
content-length: 4
content-type: text/plain; charset=utf-8
date: Wed, 31 Oct 2018 12:52:45 GMT
🐲Connection closed by foreign host.
TRACE 2018-10-31T12:52:40Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:52:40Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:52:40Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:52:40Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:52:40Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:52:40Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:52:40Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:52:40Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
INFO 2018-10-31T12:52:45Z: actix_web::middleware::logger: 127.0.0.1:47682 "GET / HTTP/1.1" 200 4 "-" "-" 0.001000
$ sudo tcpdump -vvv -s 0 -nX -i lo -r lo-actix-keepalive-disabled port 8081
reading from file lo-actix-keepalive-disabled, link-type EN10MB (Ethernet)
21:52:43.979630 IP (tos 0x10, ttl 64, id 16549, offset 0, flags [DF], proto TCP (6), length 60)
127.0.0.1.47682 > 127.0.0.1.8081: Flags [S], cksum 0xfe30 (incorrect -> 0xe2ba), seq 713246061, win 43690, options [mss 65495,sackOK,TS val 726814046 ecr 0,nop,wscale 7], length 0
0x0000: 4510 003c 40a5 4000 4006 fc04 7f00 0001 E..<@.@.@.......
0x0010: 7f00 0001 ba42 1f91 2a83 456d 0000 0000 .....B..*.Em....
0x0020: a002 aaaa fe30 0000 0204 ffd7 0402 080a .....0..........
0x0030: 2b52 4d5e 0000 0000 0103 0307 +RM^........
21:52:43.979671 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
127.0.0.1.8081 > 127.0.0.1.47682: Flags [S.], cksum 0xfe30 (incorrect -> 0xf16a), seq 3296441362, ack 713246062, win 43690, options [mss 65495,sackOK,TS val 726814046 ecr 726814046,nop,wscale 7], length 0
0x0000: 4500 003c 0000 4000 4006 3cba 7f00 0001 E..<..@.@.<.....
0x0010: 7f00 0001 1f91 ba42 c47b b412 2a83 456e .......B.{..*.En
0x0020: a012 aaaa fe30 0000 0204 ffd7 0402 080a .....0..........
0x0030: 2b52 4d5e 2b52 4d5e 0103 0307 +RM^+RM^....
21:52:43.979701 IP (tos 0x10, ttl 64, id 16550, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.47682 > 127.0.0.1.8081: Flags [.], cksum 0xfe28 (incorrect -> 0xc3af), seq 1, ack 1, win 342, options [nop,nop,TS val 726814046 ecr 726814046], length 0
0x0000: 4510 0034 40a6 4000 4006 fc0b 7f00 0001 E..4@.@.@.......
0x0010: 7f00 0001 ba42 1f91 2a83 456e c47b b413 .....B..*.En.{..
0x0020: 8010 0156 fe28 0000 0101 080a 2b52 4d5e ...V.(......+RM^
0x0030: 2b52 4d5e +RM^
21:52:44.841175 IP (tos 0x10, ttl 64, id 16551, offset 0, flags [DF], proto TCP (6), length 68)
127.0.0.1.47682 > 127.0.0.1.8081: Flags [P.], cksum 0xfe38 (incorrect -> 0xeea3), seq 1:17, ack 1, win 342, options [nop,nop,TS val 726814907 ecr 726814046], length 16
0x0000: 4510 0044 40a7 4000 4006 fbfa 7f00 0001 E..D@.@.@.......
0x0010: 7f00 0001 ba42 1f91 2a83 456e c47b b413 .....B..*.En.{..
0x0020: 8018 0156 fe38 0000 0101 080a 2b52 50bb ...V.8......+RP.
0x0030: 2b52 4d5e 4745 5420 2f20 4854 5450 2f31 +RM^GET./.HTTP/1
0x0040: 2e31 0d0a .1..
21:52:44.841216 IP (tos 0x0, ttl 64, id 45531, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.8081 > 127.0.0.1.47682: Flags [.], cksum 0xfe28 (incorrect -> 0xbce5), seq 1, ack 17, win 342, options [nop,nop,TS val 726814907 ecr 726814907], length 0
0x0000: 4500 0034 b1db 4000 4006 8ae6 7f00 0001 E..4..@.@.......
0x0010: 7f00 0001 1f91 ba42 c47b b413 2a83 457e .......B.{..*.E~
0x0020: 8010 0156 fe28 0000 0101 080a 2b52 50bb ...V.(......+RP.
0x0030: 2b52 50bb +RP.
21:52:45.537356 IP (tos 0x10, ttl 64, id 16552, offset 0, flags [DF], proto TCP (6), length 54)
127.0.0.1.47682 > 127.0.0.1.8081: Flags [P.], cksum 0xfe2a (incorrect -> 0xad18), seq 17:19, ack 1, win 342, options [nop,nop,TS val 726815604 ecr 726814907], length 2
0x0000: 4510 0036 40a8 4000 4006 fc07 7f00 0001 E..6@.@.@.......
0x0010: 7f00 0001 ba42 1f91 2a83 457e c47b b413 .....B..*.E~.{..
0x0020: 8018 0156 fe2a 0000 0101 080a 2b52 5374 ...V.*......+RSt
0x0030: 2b52 50bb 0d0a +RP...
21:52:45.537396 IP (tos 0x0, ttl 64, id 45532, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.8081 > 127.0.0.1.47682: Flags [.], cksum 0xfe28 (incorrect -> 0xb771), seq 1, ack 19, win 342, options [nop,nop,TS val 726815604 ecr 726815604], length 0
0x0000: 4500 0034 b1dc 4000 4006 8ae5 7f00 0001 E..4..@.@.......
0x0010: 7f00 0001 1f91 ba42 c47b b413 2a83 4580 .......B.{..*.E.
0x0020: 8010 0156 fe28 0000 0101 080a 2b52 5374 ...V.(......+RSt
0x0030: 2b52 5374 +RSt
21:52:45.539239 IP (tos 0x0, ttl 64, id 45533, offset 0, flags [DF], proto TCP (6), length 172)
127.0.0.1.8081 > 127.0.0.1.47682: Flags [P.], cksum 0xfea0 (incorrect -> 0xbf92), seq 1:121, ack 19, win 342, options [nop,nop,TS val 726815605 ecr 726815604], length 120
0x0000: 4500 00ac b1dd 4000 4006 8a6c 7f00 0001 E.....@[email protected]....
0x0010: 7f00 0001 1f91 ba42 c47b b413 2a83 4580 .......B.{..*.E.
0x0020: 8018 0156 fea0 0000 0101 080a 2b52 5375 ...V........+RSu
0x0030: 2b52 5374 4854 5450 2f31 2e31 2032 3030 +RStHTTP/1.1.200
0x0040: 204f 4b0d 0a63 6f6e 7465 6e74 2d6c 656e .OK..content-len
0x0050: 6774 683a 2034 0d0a 636f 6e74 656e 742d gth:.4..content-
0x0060: 7479 7065 3a20 7465 7874 2f70 6c61 696e type:.text/plain
0x0070: 3b20 6368 6172 7365 743d 7574 662d 380d ;.charset=utf-8.
0x0080: 0a64 6174 653a 2057 6564 2c20 3331 204f .date:.Wed,.31.O
0x0090: 6374 2032 3031 3820 3132 3a35 323a 3435 ct.2018.12:52:45
0x00a0: 2047 4d54 0d0a 0d0a f09f 90b2 .GMT........
21:52:45.539267 IP (tos 0x10, ttl 64, id 16553, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.47682 > 127.0.0.1.8081: Flags [.], cksum 0xfe28 (incorrect -> 0xb6f6), seq 19, ack 121, win 342, options [nop,nop,TS val 726815606 ecr 726815605], length 0
0x0000: 4510 0034 40a9 4000 4006 fc08 7f00 0001 E..4@.@.@.......
0x0010: 7f00 0001 ba42 1f91 2a83 4580 c47b b48b .....B..*.E..{..
0x0020: 8010 0156 fe28 0000 0101 080a 2b52 5376 ...V.(......+RSv
0x0030: 2b52 5375 +RSu
21:52:45.539433 IP (tos 0x0, ttl 64, id 45534, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.8081 > 127.0.0.1.47682: Flags [F.], cksum 0xfe28 (incorrect -> 0xb6f4), seq 121, ack 19, win 342, options [nop,nop,TS val 726815606 ecr 726815606], length 0
0x0000: 4500 0034 b1de 4000 4006 8ae3 7f00 0001 E..4..@.@.......
0x0010: 7f00 0001 1f91 ba42 c47b b48b 2a83 4580 .......B.{..*.E.
0x0020: 8011 0156 fe28 0000 0101 080a 2b52 5376 ...V.(......+RSv
0x0030: 2b52 5376 +RSv
21:52:45.539609 IP (tos 0x10, ttl 64, id 16554, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.47682 > 127.0.0.1.8081: Flags [F.], cksum 0xfe28 (incorrect -> 0xb6f3), seq 19, ack 122, win 342, options [nop,nop,TS val 726815606 ecr 726815606], length 0
0x0000: 4510 0034 40aa 4000 4006 fc07 7f00 0001 E..4@.@.@.......
0x0010: 7f00 0001 ba42 1f91 2a83 4580 c47b b48c .....B..*.E..{..
0x0020: 8011 0156 fe28 0000 0101 080a 2b52 5376 ...V.(......+RSv
0x0030: 2b52 5376 +RSv
21:52:45.539652 IP (tos 0x0, ttl 64, id 45535, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.8081 > 127.0.0.1.47682: Flags [.], cksum 0xfe28 (incorrect -> 0xb6f3), seq 122, ack 20, win 342, options [nop,nop,TS val 726815606 ecr 726815606], length 0
0x0000: 4500 0034 b1df 4000 4006 8ae2 7f00 0001 E..4..@.@.......
0x0010: 7f00 0001 1f91 ba42 c47b b48c 2a83 4581 .......B.{..*.E.
0x0020: 8010 0156 fe28 0000 0101 080a 2b52 5376 ...V.(......+RSv
0x0030: 2b52 5376 +RSv
actix_web::server::KeepAlive::Tcp(5) (same as OS)
$ telnet localhost 8081
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.1
HTTP/1.1 200 OK
content-length: 4
content-type: text/plain; charset=utf-8
date: Wed, 31 Oct 2018 12:59:47 GMT
🐲Connection closed by foreign host.
TRACE 2018-10-31T12:59:43Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:59:43Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:59:43Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:59:43Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:59:43Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:59:43Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:59:43Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
TRACE 2018-10-31T12:59:43Z: actix_web::middleware::logger: Access log format: %a "%r" %s %b "%{Referer}i" "%{User-Agent}i" %T
INFO 2018-10-31T12:59:47Z: actix_web::middleware::logger: 127.0.0.1:47778 "GET / HTTP/1.1" 200 4 "-" "-" 0.002062
TRACE 2018-10-31T12:59:50Z: actix_web::server::h1: Keep-alive timeout, close connection
$ sudo tcpdump -vvv -s 0 -nX -i lo -r lo-actix-keepalive-tcp-5 port 8081
reading from file lo-actix-keepalive-tcp-5, link-type EN10MB (Ethernet)
21:59:45.000307 IP (tos 0x10, ttl 64, id 51301, offset 0, flags [DF], proto TCP (6), length 60)
127.0.0.1.47778 > 127.0.0.1.8081: Flags [S], cksum 0xfe30 (incorrect -> 0x1584), seq 1738107018, win 43690, options [mss 65495,sackOK,TS val 727235067 ecr 0,nop,wscale 7], length 0
0x0000: 4510 003c c865 4000 4006 7444 7f00 0001 E..<.e@[email protected]....
0x0010: 7f00 0001 baa2 1f91 6799 688a 0000 0000 ........g.h.....
0x0020: a002 aaaa fe30 0000 0204 ffd7 0402 080a .....0..........
0x0030: 2b58 b9fb 0000 0000 0103 0307 +X..........
21:59:45.000339 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
127.0.0.1.8081 > 127.0.0.1.47778: Flags [S.], cksum 0xfe30 (incorrect -> 0x2ee8), seq 45284996, ack 1738107019, win 43690, options [mss 65495,sackOK,TS val 727235067 ecr 727235067,nop,wscale 7], length 0
0x0000: 4500 003c 0000 4000 4006 3cba 7f00 0001 E..<..@.@.<.....
0x0010: 7f00 0001 1f91 baa2 02b2 fe84 6799 688b ............g.h.
0x0020: a012 aaaa fe30 0000 0204 ffd7 0402 080a .....0..........
0x0030: 2b58 b9fb 2b58 b9fb 0103 0307 +X..+X......
21:59:45.000369 IP (tos 0x10, ttl 64, id 51302, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.47778 > 127.0.0.1.8081: Flags [.], cksum 0xfe28 (incorrect -> 0x012d), seq 1, ack 1, win 342, options [nop,nop,TS val 727235067 ecr 727235067], length 0
0x0000: 4510 0034 c866 4000 4006 744b 7f00 0001 E..4.f@[email protected]....
0x0010: 7f00 0001 baa2 1f91 6799 688b 02b2 fe85 ........g.h.....
0x0020: 8010 0156 fe28 0000 0101 080a 2b58 b9fb ...V.(......+X..
0x0030: 2b58 b9fb +X..
21:59:46.326087 IP (tos 0x10, ttl 64, id 51303, offset 0, flags [DF], proto TCP (6), length 68)
127.0.0.1.47778 > 127.0.0.1.8081: Flags [P.], cksum 0xfe38 (incorrect -> 0x2a51), seq 1:17, ack 1, win 342, options [nop,nop,TS val 727236392 ecr 727235067], length 16
0x0000: 4510 0044 c867 4000 4006 743a 7f00 0001 E..D.g@[email protected]:....
0x0010: 7f00 0001 baa2 1f91 6799 688b 02b2 fe85 ........g.h.....
0x0020: 8018 0156 fe38 0000 0101 080a 2b58 bf28 ...V.8......+X.(
0x0030: 2b58 b9fb 4745 5420 2f20 4854 5450 2f31 +X..GET./.HTTP/1
0x0040: 2e31 0d0a .1..
21:59:46.326125 IP (tos 0x0, ttl 64, id 24170, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.8081 > 127.0.0.1.47778: Flags [.], cksum 0xfe28 (incorrect -> 0xf6c1), seq 1, ack 17, win 342, options [nop,nop,TS val 727236393 ecr 727236392], length 0
0x0000: 4500 0034 5e6a 4000 4006 de57 7f00 0001 E..4^j@[email protected]....
0x0010: 7f00 0001 1f91 baa2 02b2 fe85 6799 689b ............g.h.
0x0020: 8010 0156 fe28 0000 0101 080a 2b58 bf29 ...V.(......+X.)
0x0030: 2b58 bf28 +X.(
21:59:47.128546 IP (tos 0x10, ttl 64, id 51304, offset 0, flags [DF], proto TCP (6), length 54)
127.0.0.1.47778 > 127.0.0.1.8081: Flags [P.], cksum 0xfe2a (incorrect -> 0xe68a), seq 17:19, ack 1, win 342, options [nop,nop,TS val 727237195 ecr 727236393], length 2
0x0000: 4510 0036 c868 4000 4006 7447 7f00 0001 E..6.h@[email protected]....
0x0010: 7f00 0001 baa2 1f91 6799 689b 02b2 fe85 ........g.h.....
0x0020: 8018 0156 fe2a 0000 0101 080a 2b58 c24b ...V.*......+X.K
0x0030: 2b58 bf29 0d0a +X.)..
21:59:47.128591 IP (tos 0x0, ttl 64, id 24171, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.8081 > 127.0.0.1.47778: Flags [.], cksum 0xfe28 (incorrect -> 0xf07a), seq 1, ack 19, win 342, options [nop,nop,TS val 727237195 ecr 727237195], length 0
0x0000: 4500 0034 5e6b 4000 4006 de56 7f00 0001 E..4^k@[email protected]....
0x0010: 7f00 0001 1f91 baa2 02b2 fe85 6799 689d ............g.h.
0x0020: 8010 0156 fe28 0000 0101 080a 2b58 c24b ...V.(......+X.K
0x0030: 2b58 c24b +X.K
21:59:47.131402 IP (tos 0x0, ttl 64, id 24172, offset 0, flags [DF], proto TCP (6), length 172)
127.0.0.1.8081 > 127.0.0.1.47778: Flags [P.], cksum 0xfea0 (incorrect -> 0xf197), seq 1:121, ack 19, win 342, options [nop,nop,TS val 727237198 ecr 727237195], length 120
0x0000: 4500 00ac 5e6c 4000 4006 dddd 7f00 0001 E...^l@.@.......
0x0010: 7f00 0001 1f91 baa2 02b2 fe85 6799 689d ............g.h.
0x0020: 8018 0156 fea0 0000 0101 080a 2b58 c24e ...V........+X.N
0x0030: 2b58 c24b 4854 5450 2f31 2e31 2032 3030 +X.KHTTP/1.1.200
0x0040: 204f 4b0d 0a63 6f6e 7465 6e74 2d6c 656e .OK..content-len
0x0050: 6774 683a 2034 0d0a 636f 6e74 656e 742d gth:.4..content-
0x0060: 7479 7065 3a20 7465 7874 2f70 6c61 696e type:.text/plain
0x0070: 3b20 6368 6172 7365 743d 7574 662d 380d ;.charset=utf-8.
0x0080: 0a64 6174 653a 2057 6564 2c20 3331 204f .date:.Wed,.31.O
0x0090: 6374 2032 3031 3820 3132 3a35 393a 3437 ct.2018.12:59:47
0x00a0: 2047 4d54 0d0a 0d0a f09f 90b2 .GMT........
21:59:47.131449 IP (tos 0x10, ttl 64, id 51305, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.47778 > 127.0.0.1.8081: Flags [.], cksum 0xfe28 (incorrect -> 0xeffc), seq 19, ack 121, win 342, options [nop,nop,TS val 727237198 ecr 727237198], length 0
0x0000: 4510 0034 c869 4000 4006 7448 7f00 0001 E..4.i@[email protected]....
0x0010: 7f00 0001 baa2 1f91 6799 689d 02b2 fefd ........g.h.....
0x0020: 8010 0156 fe28 0000 0101 080a 2b58 c24e ...V.(......+X.N
0x0030: 2b58 c24e +X.N
21:59:50.001987 IP (tos 0x0, ttl 64, id 24173, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.8081 > 127.0.0.1.47778: Flags [F.], cksum 0xfe28 (incorrect -> 0xe4c5), seq 121, ack 19, win 342, options [nop,nop,TS val 727240068 ecr 727237198], length 0
0x0000: 4500 0034 5e6d 4000 4006 de54 7f00 0001 E..4^m@[email protected]....
0x0010: 7f00 0001 1f91 baa2 02b2 fefd 6799 689d ............g.h.
0x0020: 8011 0156 fe28 0000 0101 080a 2b58 cd84 ...V.(......+X..
0x0030: 2b58 c24e +X.N
21:59:50.002168 IP (tos 0x10, ttl 64, id 51306, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.47778 > 127.0.0.1.8081: Flags [F.], cksum 0xfe28 (incorrect -> 0xd98d), seq 19, ack 122, win 342, options [nop,nop,TS val 727240069 ecr 727240068], length 0
0x0000: 4510 0034 c86a 4000 4006 7447 7f00 0001 E..4.j@[email protected]....
0x0010: 7f00 0001 baa2 1f91 6799 689d 02b2 fefe ........g.h.....
0x0020: 8011 0156 fe28 0000 0101 080a 2b58 cd85 ...V.(......+X..
0x0030: 2b58 cd84 +X..
21:59:50.002225 IP (tos 0x0, ttl 64, id 24174, offset 0, flags [DF], proto TCP (6), length 52)
127.0.0.1.8081 > 127.0.0.1.47778: Flags [.], cksum 0xfe28 (incorrect -> 0xd98c), seq 122, ack 20, win 342, options [nop,nop,TS val 727240069 ecr 727240069], length 0
0x0000: 4500 0034 5e6e 4000 4006 de53 7f00 0001 E..4^n@[email protected]....
0x0010: 7f00 0001 1f91 baa2 02b2 fefe 6799 689e ............g.h.
0x0020: 8010 0156 fe28 0000 0101 080a 2b58 cd85 ...V.(......+X..
0x0030: 2b58 cd85 +X..
This code does not always give connection close header.
Why dont you use self.settings.keep_alive_enabled() ?
if msg.keep_alive().unwrap_or_else(|| self.settings.keep_alive_enabled()) {
@legokichi I think in this case it is supposed to be set in either request or response already.
Though I need to re-check code to be sure
UPD: Settings are used here to determine whether we should start keep-alive timer itself https://github.com/actix/actix-web/blob/cfd9a56ff74dd6cb6ad38b6a67b40ed1763d6071/src/server/h1.rs#L93-L123
Though looking at how we set headers I wonder if we're actually doing it wrong as on HTTP1.1 we should set keep-alive to connection when keep-alive is on, and Close otherwise, while for HTTP1.0 it is always by default should be close and we don't even need to set it
@fafhrd91 What do you think?
UPD: though maybe we need to set keep-alive for HTTP1.0, but I think we should set it for HTTP1.1 too
Relating to https://github.com/actix/actix-web/issues/439#issuecomment-434714357 ,
I confirmed that function poll_flush(true) will never be called when ::actix_web::server::KeepAlive::Timeout(5)
Http1Dispatcher::poll will not be polled after 5 seconds.
No timer will fire.
I checked on master branch today.
Thanks! It is strange because we should set shutdown flag on expiration of timer. I'll try to check code there again when I'll have time
@DoumanAsh i'll handle this
Probably this is necessary.
diff --git a/src/server/h1.rs b/src/server/h1.rs
index a2ffc055..3003a4ac 100644
--- a/src/server/h1.rs
+++ b/src/server/h1.rs
@@ -192,6 +192,7 @@ where
#[inline]
pub fn poll(&mut self) -> Poll<(), HttpDispatchError> {
+ ::futures::task::current().notify();
// check connection keep-alive
self.poll_keepalive()?;
notify forces tokio to poll future again, which should only be ever necessary when we return NotReady and no IO is scheduled
@legokichi should be fixed in master
@neowutran could you test master again?
Testing with live traffic,
service is running for 3 days, no unclosed connection for the moment : )
I will let it run for few more days
( however, there are still some memory leak, from 0.1% to 8% in 3 days )
seems good.
Didn't saw any unclosed connection yet, but a lot of memory leak. Crashed once by out of memory, relaunched, and now needed to stop the service as I released a new version of my software, so much more traffic on live
valgrind log when I stopped the service:
https://neowutran.ovh/valgrind_debug9999.err
used the "/shutdown" to close it gracefully
@neowutran Thanks for log, I'll take a look at it during this weekend, sorry for long response
UPD: So far I cannot make sense out of valgrind output
UPD2: Most likely valgrind itself will not be useful here, but it might point where we have some buffer that gets constantly increased over time
@neowutran Just curious, are errors seens through the whole time while you're running your service, or only from time to time?
I wonder why this bug has no attention for so long time (since Nov 24 2018). If server fails due memory leaks - how you run your web sites on it?
Well it is not easy to understand where the problem yet, and considering upcoming refactoring and change with new major version, it might become irrelevant bug
this should be fixed in 0.7.19
@fafhrd91, @DoumanAsh Hi! Has this issue actually been properly solved? I've been using 1.x version of actix-web for months, had to restart my app every now and then (sometimes after minutes, sometimes after days) since there are a lot of ESTABLISHED connections left there hanging, eventually causing too many open files error. I'm using my server with keep-alive disabled, the rest of the settings are the defaults. I have since tried to upgrade to 2.0.0 to see if it solves the problem, but it's the same thing.
The service itself gets around 500-1000 requests per second in production currently.
@orangesoup Original problem was 0.7 and was solved. Crate new issue
Alright, will do, thanks!
Most helpful comment
I wonder why this bug has no attention for so long time (since Nov 24 2018). If server fails due memory leaks - how you run your web sites on it?