Client: ss-libev Android 4.1.8
Server: 3.0.7
Phone: Pixel XL with Android 8.0
Server: A KVM VPS running Ubuntu 16.04.3 LTS x64 with kernel 4.12.6-041206-generic
sudo sysctl net.ipv4.tcp_fastopen=3 at the server side
2017-09-07 06:55:45 INFO: using tcp fast open
2017-09-07 06:55:45 INFO: UDP relay enabled
2017-09-07 06:55:45 INFO: initializing ciphers... salsa20
2017-09-07 06:55:45 INFO: using nameserver: 8.8.8.8
2017-09-07 06:55:45 INFO: using nameserver: 8.8.4.4
2017-09-07 06:55:45 INFO: tcp server listening at 0.0.0.0:nnnn
2017-09-07 06:55:45 INFO: udp server listening at 0.0.0.0:nnnn
The ss-libev Android app should display "Success: xxxms latency".
The app displayed "Internet Unavailable" and "Failed to detect internet connection: SSL handshake timed out".
Bypass LAN & mainland China, GFW List (both tested)
sysctl net.ipv4.tcp_fastopen=3
SALSA20, CHACHA20-IETF-POLY1305 (both tested)
This issue can only be reproduced with the following combination:
net.ipv4.tcp_fastopen is set to 3
I also tested other combinations and they all seem to work, e.g.:
net.ipv4.tcp_fastopen=1
net.ipv4.tcp_fastopen=3
net.ipv4.tcp_fastopen=3
net.ipv4.tcp_fastopen=3
Please also note that the Nexus 6P has TFO off at the phone side, so the value of net.ipv4.tcp_fastopen doesn't matter at all.
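For readers comparing the values 1 and 3 above: net.ipv4.tcp_fastopen is a bitmask in which bit 0x1 enables TFO for outgoing (client) connections and bit 0x2 enables it for listening (server) sockets. A minimal Python sketch of that decoding follows; the function names are made up for illustration and the code is not part of shadowsocks-libev.

```python
# Bit flags of net.ipv4.tcp_fastopen as documented for the Linux kernel.
TFO_CLIENT_ENABLE = 0x1  # send data in the opening SYN (outgoing connections)
TFO_SERVER_ENABLE = 0x2  # accept data in the opening SYN (listening sockets)


def describe_tfo(value: int) -> dict:
    """Decode the client/server bits of a tcp_fastopen sysctl value."""
    return {
        "client": bool(value & TFO_CLIENT_ENABLE),
        "server": bool(value & TFO_SERVER_ENABLE),
    }


def read_tfo(path: str = "/proc/sys/net/ipv4/tcp_fastopen") -> int:
    """Read the current setting; only works on Linux (hypothetical helper)."""
    with open(path) as f:
        return int(f.read().strip())


# Show what each of the common values enables.
for v in (0, 1, 2, 3):
    print(v, describe_tfo(v))
```

So a server set to 1 no longer accepts data in the SYN, while its own outgoing connections can still use TFO.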
I did a quick tcpdump, and it shows that with TFO=3 at the server side, ss-server did send data back to the phone correctly. The phone just never receives those packets.
So it looks like China Mobile's firewall has issue(s) with TFO cookies.
I'm fully aware that this might not be an ss-libev problem at all, but maybe this kind of quirk should be documented somewhere, so that fewer "ss is being detected/blocked" complaints are made?
It's expected that TFO won't work behind some broken NAT devices, e.g. China Mobile's WAN gateway.
I marked this issue as "not a bug" in case anyone else reports the same problem in the future.
Does the latency test on Android go out over UDP through ss-tunnel?
@wongsyrone Nope. It just counts the time of accessing www.google.com.
OK, never mind. Upstream has a UDP checksum fix; feel free to add it if you're interested: https://github.com/ambrop72/badvpn/commit/ffd16e27d0bd58fec068fa9271b33fe559efb5a5
Server:libev 3.1.3
Client:android 4.4.6
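On my Xperia (net.ipv4.tcp_fastopen=1), I could not capture any packets carrying a TFO cookie on either China Unicom 4G or China Telecom broadband.
You may want to read this article: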
http://blog.donatas.net/blog/2017/03/09/tfo/
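I set tcp_fastopen_blackhole_timeout_sec to 0; the client then takes a fairly long time to return data and falls back to a traditional connection.
However, traffic between two servers located abroad did show the cookie in the capture, and the client carried the cookie when establishing the second connection, which shows that ss-libev itself is fine.
My China Telecom 4G connection also has a problem with DNS data being returned under TFO, while Wi-Fi works normally. I had to turn TFO off on the server side.
I tried changing net.ipv4.tcp_fastopen to 1; the test passes, but latency is somewhat higher than with TFO turned off.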
Has anyone tested TCP Fast Open on Alibaba Cloud ECS yet? I got exactly the same problem here:
Client: ss-tunnel (modified: https://github.com/RethinkMax/shadowsocks-libev)
Server: ssserver (also as a sniproxy client: https://github.com/PantherJohn/sniproxydpl)
With fast_open enabled in the config and net.ipv4.tcp_fastopen set to 3 on both sides, the FO cookie is requested but never received by the remote side:
# ssserver
TCPFastOpenActive: 7472
TCPFastOpenActiveFail: 5147
TCPFastOpenBlackhole: 8
Note that the increase in TCPFastOpenActive is caused by outbound connections to Google, YouTube, etc. (even local loopback traffic can do that, because all requests are forwarded by ssserver to sniproxy). On the client side I merely had:
TCPFastOpenCookieReqd: 29
<serveraddr> age 544.728sec rtt 162750us rttvar 44000us cwnd 18 metric_5 1302359 metric_6 176213 fo_mss 1460 fo_syn_drops 2/609.658sec ago fo_cookie 5b34b885cb57aecf
It's no surprise that NAT (probably?) is doing something nasty, dropping inbound packets with unknown TCP options as well as FO cookies.
# SNIPROXY Server deployed on vultr
TCPFastOpenActive: 38
TCPFastOpenPassive: 31
TCPFastOpenCookieReqd: 4
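The TCPFastOpen* numbers quoted above are TcpExt counters, which Linux exposes in /proc/net/netstat as a header line followed by a value line. A minimal Python sketch for pulling them out is shown here; the function names and the sample text are made up for illustration, with the sample reusing figures from this thread.

```python
def parse_tcpext(netstat_text: str) -> dict:
    """Parse the TcpExt header/value line pair of /proc/net/netstat."""
    rows = [line.split() for line in netstat_text.splitlines()
            if line.startswith("TcpExt:")]
    if len(rows) < 2:
        return {}
    names, values = rows[0][1:], rows[1][1:]
    return {name: int(value) for name, value in zip(names, values)}


def tfo_counters(netstat_text: str) -> dict:
    """Keep only the TCPFastOpen* counters."""
    return {k: v for k, v in parse_tcpext(netstat_text).items()
            if k.startswith("TCPFastOpen")}


# Made-up sample in the /proc/net/netstat layout (not a real capture).
SAMPLE = (
    "TcpExt: SyncookiesSent TCPFastOpenActive TCPFastOpenActiveFail TCPFastOpenBlackhole\n"
    "TcpExt: 0 7472 5147 8\n"
    "IpExt: InNoRoutes InTruncatedPkts\n"
    "IpExt: 0 0\n"
)

counters = tfo_counters(SAMPLE)
print(counters)
```

On a real Linux host the same function can be fed the contents of /proc/net/netstat, read before and after a test connection, to see which counters moved.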
Some middleware, such as firewalls and NAT boxes, may cause issues with the new TCP option. Additionally, because Linux continues to set the TFO option to 254, which is the experimental kind, it may be more likely to be dropped. It's even been reported that some middleware boxes, after detecting the TFO option in the initial SYN packet, drop subsequent SYN packets without the TFO option. Also, if a device is behind a Carrier Grade NAT (CGN) with many public IP addresses constantly changing, a cookie may become invalidated often, reducing the effectiveness of TFO. High latency mobile devices, which benefit the most from TFO, are also the most likely to be affected by changing public IP addresses due to CGNs. I currently have no data on this.
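When packets carrying TFO pass through a router, they may be dropped; the policy differs from carrier to carrier.
If a NAT address pool is configured, the second connection may succeed, depending on the aging time of the NAT table.
A change of the client IP, or of the public IP after NAT, both affect the normal use of TFO.
After receiving an incorrect packet, the server can still send a SYN-ACK and fall back to the 3WHS.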
https://www.juniper.net/documentation/en_US/junos/topics/task/configuration/tfo-configuring.html
https://tools.ietf.org/html/draft-cheng-tcpm-fastopen-02#section-5
Network Address Translation (NAT)
The hosts behind NAT sharing same IP address will get the same cookie to the same server. This will not prevent TFO from working. But on some carrier-grade NAT configurations where every new TCP connection from the same physical host uses a different public IP address, TFO does not provide latency benefit. However, there is no performance penalty either as described in Section "Client: Receiving SYN-ACK".
Client: Receiving SYN-ACK
The client SHOULD perform the following steps upon receiving the SYN-ACK:
- Update the cookie cache if the SYN-ACK has a Fast Open Cookie Option.
- Send an ACK packet. Set acknowledgment number to RCV.NXT and include the data after SND.UNA if data is available.
- Advance to the ESTABLISHED state.
Note that there is no latency penalty if the server does not acknowledge the data in the original SYN packet. The client can retransmit it in the first ACK packet in step 2. The data exchange will start after the handshake like a regular TCP connection.
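The NAT passage quoted above can be illustrated with a toy Python model: the server derives the cookie from the client's public source IP, the client caches it per server, and a changed public IP simply means the next attempt falls back to a regular handshake. This is only a sketch under stated assumptions. HMAC-SHA256 stands in for whatever cookie function a real kernel uses, and the key, function names and addresses are hypothetical. It models the behaviour described in the draft, not the middlebox packet drops reported earlier in this thread.

```python
import hashlib
import hmac

SERVER_KEY = b"hypothetical-server-secret"  # illustrative only


def make_cookie(client_ip: str) -> bytes:
    """Server side: derive a cookie from the client's (public) source IP."""
    return hmac.new(SERVER_KEY, client_ip.encode(), hashlib.sha256).digest()[:8]


def server_accepts_syn_data(src_ip: str, cookie) -> bool:
    """True if data in the SYN is accepted; False means fall back to 3WHS."""
    return cookie is not None and hmac.compare_digest(cookie, make_cookie(src_ip))


def connect(cookie_cache: dict, server: str, public_ip: str) -> str:
    """One connection attempt as seen by the client; returns the path taken."""
    cookie = cookie_cache.get(server)
    fast = server_accepts_syn_data(public_ip, cookie)
    # The SYN-ACK carries a cookie; the client updates its cache (first step above).
    cookie_cache[server] = make_cookie(public_ip)
    return "fast open" if fast else "regular handshake"


cache = {}
print(connect(cache, "server", "198.51.100.7"))   # first contact, no cookie yet
print(connect(cache, "server", "198.51.100.7"))   # same public IP as before
print(connect(cache, "server", "198.51.100.99"))  # CGN switched the public IP
```

In this model a client whose public IP changes on every connection never gets the fast path, which matches the "no latency benefit, but no penalty either" wording of the draft.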
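You can either change the mode to 1 on the server side alone, or set it to 0 on the client; this does not affect the performance of the server's outbound connections.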
I'm using LEDE 17.01.6 with "net.ipv4.tcp_fastopen = 3", and so is my Ubuntu 18.04 server.
That is to say, I've already configured TFO on both the server and the router.
However, when I set tcp fast open to true on my router, the log shows:
Sat Oct 6 13:52:19 2018 daemon.info ss-redir[2915]: listening at 0.0.0.0:1234
Sat Oct 6 13:52:19 2018 daemon.info ss-redir[2915]: UDP relay enabled
Sat Oct 6 13:52:19 2018 daemon.info ss-redir[2915]: udp port reuse enabled
Sat Oct 6 13:52:19 2018 daemon.info ss-redir[2915]: TCP relay disabled
Sat Oct 6 13:52:19 2018 daemon.info ss-redir[2915]: running from root user
Sat Oct 6 13:52:22 2018 daemon.err ss-redir[2881]: failed to set TCP_FASTOPEN_CONNECT
By the way, both my router and server run the newest version of shadowsocks-libev.
What's the matter?
Once I downgraded to version 3.1.3-4, everything was fine.
@Nick-Hopps It seems they have removed TCP_FASTOPEN_CONNECT from the Linux kernel headers. On my machine (CentOS 7.4, Linux kernel 4.18) TCP_FASTOPEN_CONNECT is not defined. Recompiling shadowsocks-libev on your LEDE router may help.
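For context, the "failed to set TCP_FASTOPEN_CONNECT" line in the log is what a program would report when the setsockopt call for that option fails at runtime. The option was introduced in Linux 4.11, so one possible explanation (an assumption, not confirmed in this thread) is a router kernel that predates it. The Python sketch below shows the general pattern of trying the option and falling back to a normal connect; it is not the shadowsocks-libev code, the function name is made up, and the fallback constant 30 is the value from the Linux <linux/tcp.h> header.

```python
import socket

# Linux value from <linux/tcp.h>; used when Python's socket module lacks the name.
TCP_FASTOPEN_CONNECT = getattr(socket, "TCP_FASTOPEN_CONNECT", 30)


def enable_tfo_connect(sock: socket.socket) -> bool:
    """Try to enable TFO for connect(); return False if the kernel refuses."""
    try:
        sock.setsockopt(socket.IPPROTO_TCP, TCP_FASTOPEN_CONNECT, 1)
        return True
    except OSError:
        # Older kernels or other platforms: carry on with a normal connect().
        return False


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    tfo_enabled = enable_tfo_connect(s)
    print("TCP_FASTOPEN_CONNECT set:", tfo_enabled)
```

Treating the failure as non-fatal keeps the connection working on kernels that lack the option, at the cost of losing the TFO latency benefit.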
@PantherJohn Thank you, though I don't think my little router can withstand compiling anything haha.
@Nick-Hopps @librehat After all, according to your statement, the release is not backward compatible; this is the issue to be fixed.