I have this server: 185.205.210.197
(Check that it works: http://185.205.210.197/)
And I have this code in my ESP8266:
#include <ESP8266WiFi.h>
#define HOST "185.205.210.197"
#define PORT 80
#define WIFI_SSID "SSID"
#define WIFI_PSW "PASSWORD"
const char* ssid = WIFI_SSID;
const char* password = WIFI_PSW;
void setup() {
  Serial.begin(115200);
  WiFi.begin(ssid, password);
  while (WiFi.status() != WL_CONNECTED) {
    delay(5000);
    Serial.println("Connecting to Wifi...");
  }
  while (!pingServer()) {
    delay(3000); // Retry the connection every 3 seconds
  }
}

void loop() {}

// Returns true if a TCP connection to HOST:PORT could be opened.
bool pingServer() {
  WiFiClient client;
  if (!client.connect(HOST, PORT)) {
    Serial.println("connection failed");
    return false;
  }
  Serial.println("connection success!");
  return true;
}
Expected result: connection success. Actual result: connection failed.
Connecting from my PC works; only the ESP8266 fails.
After capturing the traffic with Wireshark, I suspect there is a problem with the ESP8266's TCP packets. Here are the captures for you to analyze:
PC TCP connection request (which works):
000af5f4e90c9801a7ad45b708004500004057b6400040066a7dc0a82b49b9cdd2c5dc6d1b3932f96fd800000000b002ffffc71e0000020405b4010303050101080a4ab5122c0000000004020000
(captured using basic ethernet)
Screenshot of an example of a TCP packet that is able to get a server's response.
ESP8266 TCP connection request (which receives no response from server):
000019006f080000794c5e9b00000000126c9e098004d4a10008013c00000af5f4e90c2c3ae80f137f000af5f4e90c9002aaaa0300000008004500002c00270000ff0642f9c0a82b70b9cdd2c5c00d1b3900001c77000000006002086022f9000002040218a9636c75
(captured using monitor mode in Wireshark - has radiotap headers)
Screenshot example of an ESP8266 packet.
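For anyone who wants to compare the two SYN packets without opening Wireshark, here is a minimal C++ sketch that pulls the MSS option out of a hex dump. It is only an illustration: the names hexToBytes and synMss are made up for this example, and the IPv4 offsets passed in are assumptions read off these two specific dumps (14 for the Ethernet capture; 57 for the monitor-mode capture, i.e. 25 bytes of radiotap + 24 bytes of 802.11 header + 8 bytes of LLC/SNAP).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Convert a hex string without separators into raw bytes.
std::vector<uint8_t> hexToBytes(const std::string& hex) {
    std::vector<uint8_t> out;
    for (std::size_t i = 0; i + 1 < hex.size(); i += 2) {
        out.push_back(static_cast<uint8_t>(std::stoi(hex.substr(i, 2), nullptr, 16)));
    }
    return out;
}

// Return the MSS option carried by a TCP segment, or -1 if it is absent
// or the frame is malformed. ipOffset is where the IPv4 header starts
// inside the frame (hypothetical parameter, chosen by the caller).
int synMss(const std::vector<uint8_t>& f, std::size_t ipOffset) {
    if (f.size() < ipOffset + 20) return -1;
    std::size_t ihl = (f[ipOffset] & 0x0F) * 4;      // IPv4 header length
    if (f[ipOffset + 9] != 6) return -1;             // protocol must be TCP
    std::size_t tcp = ipOffset + ihl;
    if (f.size() < tcp + 20) return -1;
    std::size_t end = tcp + (f[tcp + 12] >> 4) * 4;  // end of TCP header
    if (end > f.size()) return -1;
    std::size_t i = tcp + 20;                        // options start here
    while (i < end) {
        uint8_t kind = f[i];
        if (kind == 0) break;                        // end of option list
        if (kind == 1) { ++i; continue; }            // NOP padding
        if (i + 1 >= end) break;
        uint8_t len = f[i + 1];
        if (len < 2 || i + len > end) break;
        if (kind == 2 && len == 4) return (f[i + 2] << 8) | f[i + 3];
        i += len;
    }
    return -1;
}
```

Fed the two dumps above, synMss(hexToBytes(pcDump), 14) gives 1460 and synMss(hexToBytes(espDump), 57) gives 536, where pcDump and espDump stand for the hex strings as posted.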
The ESP connects fine to other servers.
There is no firewall blocking the connection between ESP and the server.
I've tried this in different networks and conditions, the error persists.
Other computers can connect to the server, so ESP should be able as well.
Tried with different ESP8266 boards to rule out hardware malfunctioning.
Using the latest Arduino IDE and ESP8266 library.
Check that the test server works at http://185.205.210.197:80/ (just open the link and see that it is running nginx).
Copy and upload the code above to your ESP8266 (NODEMCU or similar).
Open the Arduino IDE console/monitor and confirm that no connection is established to port 80.
Rage in admiration for such a strange bug.
@Pablo2048 suggested changing the Arduino IDE board option lwIP Variant from v2 Lower Memory to v2 Higher Bandwidth. This raises the TCP MSS (http://lwip.wikia.com/wiki/Tuning_TCP) from 536 bytes to 1460 bytes, which this server accepts.
Therefore by increasing the MSS, the server starts responding and a TCP connection is established, whereas with a low MSS the server wouldn't even answer back with an error packet (complete silence).
PS. I learned from a comment in Freenode that telnet also appears not to work with that server, with the same error as ESP8266:
telnet 185.205.201.197 80
Telnet works, ip is mistyped in your command: 201 instead of 210.
Right... Ok, so ESP8266 is the only one who cannot access the server. This must indeed be a library bug.
Has anyone been able to reproduce this with the steps I wrote?
With no information about the core and lwIP version you are using (no, the questionnaire is not there to tease you), there will be little response ... if you are not willing to spend some time, why should others?
and no, this is not sufficient info:
Using the latest Arduino IDE and ESP8266 library.
@5chufti done.
Ok, so please try to:
@Pablo2048 ok, I just did that, and I think we can confirm that he's parsing the hostname properly.
I attach below:
#include <ESP8266WiFi.h>
#define HOST IPAddress(185,205,210,197)
#define PORT 80
#define WIFI_SSID "redecomfios"
#define WIFI_PSW ""
const char* ssid = WIFI_SSID;
const char* password = WIFI_PSW;
void setup() {
  Serial.begin(115200);
  WiFi.begin(ssid, password);
  while (WiFi.status() != WL_CONNECTED) {
    delay(5000);
    Serial.println("Connecting to Wifi...");
  }
  while (!pingServer()) {
    delay(3000); // Retry the connection every 3 seconds
  }
}

void loop() {}

// Returns true if a TCP connection to HOST:PORT could be opened.
bool pingServer() {
  WiFiClient client;
  if (!client.connect(HOST, PORT)) {
    Serial.println("connection failed");
    return false;
  }
  Serial.println("connection success!");
  return true;
}
state: 0 -> 2 (b0)
state: 2 -> 3 (0)
state: 3 -> 5 (10)
add 0
aid 5
connected with redecomfios, channel 11
dhcp client start...
cnt
wifi evt: 0
ip:192.168.43.112,mask:255.255.255.0,gw:192.168.43.1
wifi evt: 3
Connecting to Wifi...
:ref 1
:ctmo
:abort
:ur 1
:del
connection failed
pm open,type:2 0
:ref 1
:ctmo
:abort
:ur 1
:del
connection failed
:ref 1
:ctmo
:abort
:ur 1
:del
connection failed
:ref 1
:ctmo
:abort
:ur 1
:del
connection failed
:ref 1
:ctmo
:abort
:ur 1
:del
connection failed
I was curious, so I tested your server and got the same results. I even tested from my own app (a Forth interpreter with network commands) and it also couldn't connect to your server. Interestingly, I run many nginx servers and I've never encountered this problem before, so my first thought is that some packet mangling may be going on in a router in front of your server. It's not clear whether you tested on the same LAN with local addresses. If you did, that rules out the router as the issue. Does the OS running nginx have any firewall rules enabled?
I'll look at the packet capture.
Hi @Eszartek thanks for testing.
What I have validated so far (99% sure):
It is not nginx's fault. I have tried netcat and my own TCP server written in Java running on the server, and the ESP8266 still does not connect. I put nginx there so you could try out this issue, but nginx is not the cause.
I have tested this using my home router, my work router, and my smartphone's AP (3G), and it happens no matter what. If something is blocking the connection, it is not on my side.
_Are the ESP8266 TCP packets being generated correctly? Everything else is able to connect to the server._
This is creepy as hell. This server is the only server having issues with ESP8266. However, ESP8266 is the only client having issues connecting to this server as well. It is like a romantic relationship that is never going to happen ahah.
According to your Wireshark screenshot, it seems nginx doesn't respond with a SYN ACK. I have no idea why :-( ...
@Pablo2048 yeah, nothing will respond. Every server I put there (nginx, netcat, my custom server) will not respond.
But this only happens with ESP8266, because if you try with your browser (or other tcp client), nginx (and the other servers) responds just fine.
It makes me wonder if ESP8266 packets are properly built.
@Pablo2048 here you can see a TCP packet sent to exactly the same destination as the ESP8266 one (that server, port 80, nginx) that does get a SYN ACK answer: http://prntscr.com/iz9h3a
@igrr @mmiscool This is a weird one, someone with deep network knowledge should take a look at it.
Ok, there is one difference in MSS - can you try LWIP v2 but not the lower memory variant?
@Pablo2048 OMG... it connected.
1384, room 16
tail 8
chksum 0x2d
csum 0x2d
v614f7c32
~ld
state: 0 -> 2 (b0)
state: 2 -> 3 (0)
state: 3 -> 5 (10)
add 0
aid 2
cnt
connected with redecomfios, channel 1
dhcp client start...
wifi evt: 0
ip:192.168.69.109,mask:255.255.255.0,gw:192.168.69.1
wifi evt: 3
Connecting to Wifi...
:ref 1
connection success!
:ur 1
:close
:del
pm open,type:2 0
I feel like crying.
Why was MSS limiting the connection? All I know about it comes from this article: http://lwip.wikia.com/wiki/Tuning_TCP
Well, it was just a blind shot from me ;-) ... It seems nginx needs the MSS to be set to a bigger value (or there is something like a router in the way that needs the Don't Fragment flag to be set, and lwIP lower memory does not set this flag)...
Edit: wait, you wrote that you tested more servers and none of them worked, so it looks like something in the TCP stack configuration on the target machine...
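The Don't Fragment guess can be checked against the two hex dumps posted in the opening comment. Below is a minimal C++ sketch, assuming the same IPv4 offsets read off those dumps (14 for the Ethernet capture, 57 for the radiotap capture); the function names are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Convert a hex string without separators into raw bytes.
std::vector<uint8_t> hexToBytes(const std::string& hex) {
    std::vector<uint8_t> out;
    for (std::size_t i = 0; i + 1 < hex.size(); i += 2) {
        out.push_back(static_cast<uint8_t>(std::stoi(hex.substr(i, 2), nullptr, 16)));
    }
    return out;
}

// True if the IPv4 header starting at ipOffset has the Don't Fragment bit set.
// The flags live in the top 3 bits of byte 6 of the header; DF is 0x40.
bool ipv4DontFragment(const std::vector<uint8_t>& f, std::size_t ipOffset) {
    if (f.size() < ipOffset + 20) return false;  // too short to be IPv4
    return (f[ipOffset + 6] & 0x40) != 0;
}
```

On those dumps the PC SYN has DF set and the ESP8266 SYN has it clear, so the two SYNs differ in both DF and MSS. The dumps alone do not say which of the two the server cares about.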
Can you describe "there" in a little more detail? What is the server OS, and is there a router holding the public IP? It would be interesting to know what equipment to stay away from :)
I doubt that the issue is Nginx, I'd think something else is handling the packet first and making the call to drop it. It could be the OS Nginx is running on.
@Eszartek agree with that...
Perhaps related to https://support.citrix.com/article/CTX214610?
@igrr that definitely fits this scenario. I'm really curious to know whether it is a router or an ISP-level system that's causing the drop.
Hey everyone,
this is not a problem with NGINX
practically, any server I put there gives this same error.
The issue must be with the VPS itself, or some firewall in front of it filtering TCP packets. I have no idea but I will try to find out, since I am not the owner providing that VPS.
All I know is that the VPS is running Ubuntu 16.
Welcome to Ubuntu 16.04.3 LTS (GNU/Linux 4.4.0-87-generic x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
Last login: Sat Mar 31 19:01:16 2018 from 109.48.194.56
$ uname -a
Linux unassigned-hostname 4.4.0-87-generic #110-Ubuntu SMP Tue Jul 18 12:55:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
@Eszartek @igrr strange that in that article you linked, their example at least receives a SYN ACK, in this scenario ESP receives nothing from the server.
So basically, for some reason ESP8266 MSS is 536 bytes by default, but the server requires around 1400 bytes to work properly.
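For context on where 536 and 1460 come from: both are just an IPv4 datagram size minus 40 bytes of headers (20 for IPv4, 20 for TCP, no options). 536 corresponds to the 576-byte datagram that every IPv4 host must be able to accept, and 1460 to the usual 1500-byte Ethernet MTU. A small sketch of that arithmetic (mssForMtu is a name invented here, not an lwIP function):

```cpp
// MSS implied by a given IPv4 MTU, assuming minimal 20-byte IPv4 and
// 20-byte TCP headers (no IP or TCP options counted).
constexpr int kIpv4HeaderBytes = 20;
constexpr int kTcpHeaderBytes = 20;

constexpr int mssForMtu(int mtu) {
    return mtu - kIpv4HeaderBytes - kTcpHeaderBytes;
}

static_assert(mssForMtu(576) == 536, "conservative default MSS");
static_assert(mssForMtu(1500) == 1460, "typical Ethernet MSS");
```

So the lower-memory variant advertises the conservative default, while the higher-bandwidth one advertises what a full Ethernet frame can carry.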
I edited the OP with this workaround. I'll try to learn more about this server.
Does the Ubuntu server have the public IP address 185.205.210.197 and if so, can you list the iptables rules loaded on the server?
@Eszartek in theory yes, that public IP is from that Ubuntu server.
$ iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
LOG tcp -- anywhere anywhere tcp dpt:9193 state NEW LOG level alert prefix "New Connection "
LOG tcp -- anywhere anywhere tcp dpt:5901 state NEW LOG level alert prefix "New Connection "
Chain FORWARD (policy DROP)
target prot opt source destination
DOCKER-ISOLATION all -- anywhere anywhere
DOCKER all -- anywhere anywhere
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
DOCKER all -- anywhere anywhere
ACCEPT all -- anywhere anywhere ctstate RELATED,ESTABLISHED
ACCEPT all -- anywhere anywhere
ACCEPT all -- anywhere anywhere
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Chain DOCKER (2 references)
target prot opt source destination
ACCEPT tcp -- anywhere 172.18.0.2 tcp dpt:postgresql
ACCEPT tcp -- anywhere 172.18.0.3 tcp dpt:6379
ACCEPT tcp -- anywhere 172.18.0.4 tcp dpt:9042
ACCEPT tcp -- anywhere 172.18.0.4 tcp dpt:afs3-fileserver
Chain DOCKER-ISOLATION (1 references)
target prot opt source destination
DROP all -- anywhere anywhere
DROP all -- anywhere anywhere
RETURN all -- anywhere anywhere
$ iptables -S
-P INPUT ACCEPT
-P FORWARD DROP
-P OUTPUT ACCEPT
-N DOCKER
-N DOCKER-ISOLATION
-A INPUT -p tcp -m tcp --dport 9193 -m state --state NEW -j LOG --log-prefix "New Connection " --log-level 1
-A INPUT -p tcp -m tcp --dport 5901 -m state --state NEW -j LOG --log-prefix "New Connection " --log-level 1
-A FORWARD -j DOCKER-ISOLATION
-A FORWARD -o br-ec854a8fb357 -j DOCKER
-A FORWARD -o br-ec854a8fb357 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i br-ec854a8fb357 ! -o br-ec854a8fb357 -j ACCEPT
-A FORWARD -i br-ec854a8fb357 -o br-ec854a8fb357 -j ACCEPT
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A DOCKER -d 172.18.0.2/32 ! -i br-ec854a8fb357 -o br-ec854a8fb357 -p tcp -m tcp --dport 5432 -j ACCEPT
-A DOCKER -d 172.18.0.3/32 ! -i br-ec854a8fb357 -o br-ec854a8fb357 -p tcp -m tcp --dport 6379 -j ACCEPT
-A DOCKER -d 172.18.0.4/32 ! -i br-ec854a8fb357 -o br-ec854a8fb357 -p tcp -m tcp --dport 9042 -j ACCEPT
-A DOCKER -d 172.18.0.4/32 ! -i br-ec854a8fb357 -o br-ec854a8fb357 -p tcp -m tcp --dport 7000 -j ACCEPT
-A DOCKER-ISOLATION -i docker0 -o br-ec854a8fb357 -j DROP
-A DOCKER-ISOLATION -i br-ec854a8fb357 -o docker0 -j DROP
-A DOCKER-ISOLATION -j RETURN
$ ifconfig -a
(...)
eth0 Link encap:Ethernet HWaddr 96:20:1b:82:ba:c4
inet addr:185.205.210.197 Bcast:185.205.210.255 Mask:255.255.255.0
inet6 addr: fe80::9420:1bff:fe82:bac4/64 Scope:Link
inet6 addr: 2a07:5741:0:1160::1/64 Scope:Global
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:98858771 errors:11229226 dropped:0 overruns:0 frame:11229226
TX packets:1564862 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:7330778108 (7.3 GB) TX bytes:180059437 (180.0 MB)
(...)
I was expecting to see something along the lines of:
iptables -t mangle -A PREROUTING -p tcp -m conntrack --ctstate NEW -m tcpmss ! --mss 1420:65535 -j DROP
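To spell out what a rule like that would do to the two SYNs seen in this thread, here is a simplified C++ model of the match. It is not how netfilter is implemented; it only covers SYNs that actually carry an MSS option, and dropsNewConnection is a hypothetical name.

```cpp
// Simplified model of "-m tcpmss ! --mss LO:HI -j DROP" for a new
// connection whose SYN carries an MSS option: the packet is dropped
// when its MSS falls outside the inclusive range [lo, hi].
bool dropsNewConnection(int mss, int lo, int hi) {
    bool inRange = (mss >= lo && mss <= hi);
    return !inRange;  // the "!" in the rule inverts the range match
}
```

With the range 1420:65535 from the rule above, an MSS of 536 would be dropped silently and an MSS of 1460 would pass, which would match the symptoms if such a rule existed somewhere in the path.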
Possibly related: https://github.com/moby/moby/issues/26473 .
Looks like it may be inside the container.
Even if the issue is not caused by the ESP8266 Arduino core, for maximum compatibility the default could be reverted to MSS 1460, leaving MSS 536 as an option.
@igrr @devyte @earlephilhower what do you think ?
I haven't had any issues with MSS at 536, so I'm inclined to suspect something in OP's particular setup as the culprit. Unless this becomes widely reported, I'd rather not raise the default heap usage for everyone else just because of this case.
The default (small MSS) seems to work for a large majority, there is a simple change to allow it to work in this exceptional instance (menu option during build).
I see no need to change anything here, it's actually all working like a champ, no? No bug in sight, just some specific combination at a VPS hoster...
Hey everyone, I have news from the VPS hoster:
Dear Customer,
Hello,
Your conclusion is right. We have custom firewall which filter lower tcp mss flag than 500. If you check RFC 6691 by the calculations we had never expect lower value. Most of packets generated and filled with lower mtu is scanning and flooding scripts. We also check all other tcp header flags like seq/ack number, win flag, valid timestamp. This is related with the tcp protocol, doesn't matter what software you use for connections (curl, nc, wget, etc).
I can advise you to set bigger values on the packets that your OS is generating, otherwise you will have same problems with many other hosts which have similar filters.
What do you people think?
filter lower tcp mss flag than 500
The MSS for the lower-memory lwIP variant is 536.
Whatever else, it's not an issue with the core. Closing.
I'm experiencing nearly the same symptoms:
2.0 low memory to 2.0 high bandwidth. This changed the MSS to 1460 and the window size to 5840. Everything else about the SYN packet seems identical. Symptoms persist.
1.4 high bandwidth. This makes every connection work perfectly. I cannot see any difference between this SYN and the previous one, so I'm at a loss as to why this works.
I have to comment that when using two ESP8266-01s, one set up as the example HTTP server and the other running basic HTTPClient code, they only talk client to server when using lwIP 1.4 high bandwidth.
I have tried v2 higher bandwidth and that didn't work either.
These are two devices sitting next to each other on the bench, connecting to the same wifi service as STA.
The version of Arduino is 1.89.
The version of ESP8266 is 2.6.3
The version of xtensa-lx106-elf-gcc is 2.5.0-4-b40a506
So it seems to go against the grain of this being a straight MSS configuration issue: the behaviour changes with the lwIP version too.
Has anyone found a solution?
My only suggestion is to use LWIP1.4 higher bandwidth version.
With other lwIP settings this only works when the server is making outbound connections (e.g. maintaining an open MQTT connection), and even that is sporadic and seemingly linked to the server refreshing its client connection to an external service.
This is a closed issue. Please open a new one.
@pimby it was suggested to test with the higher bandwidth lwIP variant. Did you try it?
@skybadger I suggest you open a new issue with your data. This issue is closed; the cause was found to be a server not accepting a low MSS. Your issue is different.
Just add this limitation to the ESP documentation. No need for a fix, but documentation is good.