My code is almost identical to WiFiClientSecure. I had been experimenting with it for quite a while when, at some point, the WiFi stopped working. Everything else is still OK. The output is:
Attempting to connect to SSID: dd-wrt
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 2 - STA_START
...[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:270] _eventCallback(): Reason: 201 - AUTH_FAIL
..........
After a lot of debugging and trying to find the cause in the AP, I took another board and tried the same code. The WiFi worked. Then I did some more tests and added my changes again. In the end, this board also stopped connecting to WiFi.
Apart from the AP configuration, I made only one change: after client.stop() I added:
WiFi.disconnect(true)
Is it possible that this command physically breaks my board? Maybe after running 30 minutes after issuing the command? Or does it maybe change the board state in a persistent way (even after powering off) so that I have to issue another command to unlock it? Or is there a tool to do a full memory reset?
My board is a Doit ESP32 Devkit V1
I didn't attach any extra cables to the second board and also took care to avoid electrostatic discharge.
Ok, I was able to factory-reset the board using the following command from page https://community.platformio.org/t/is-my-esp32-dead/1993:
python /home/user/Arduino/hardware/espressif/esp32/tools/esptool.py --chip esp32 --port /dev/ttyUSB0 --baud 921600 --before default_reset --after hard_reset erase_flash
Now, I'm able to connect again using the same code. Will tell you in a few days if the disconnect problem appears again.
Attempting to connect to SSID: dd-wrt
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 2 - STA_START
..[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 4 - STA_CONNECTED
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 7 - STA_GOT_IP
Connected to dd-wrt
192.168.168.108
I don't know that it breaks the board, but I have observed that with the most recent version, calling disconnect can get the board into a state where it will always return AUTH_FAIL. I spent the better part of the weekend trying to find a consistency to the problem and a way to recover but had no luck in the end.
I observed a similar issue,
I am not using the arduino code. But I am calling esp_wifi_stop(). And I think the arduino code also calls esp_wifi_stop() when you call WiFi.disconnect(true).
It works for a while but suddenly it can stop working and all connect attempts return with reason 201.
Also, I am using the same board, a Doit ESP32 Devkit V1.
It happened again, so I changed my code and found a way which works so-so
wifi off:
WiFi.mode(WIFI_OFF);
wifi on:
WiFi.begin(ssid, password);
WiFi.mode(WIFI_STA);
WiFi.reconnect();
delay(1000);
while (WiFi.status() != WL_CONNECTED) {
But the complete reconnect sequence has to be called multiple times; waiting in the while loop is not enough. I then get a connection after about 2 or 3 attempts.
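The "call the whole sequence several times" workaround can be sketched as a bounded retry loop. This is only an illustration: connectWithRetries() and attemptWifiOnce() are hypothetical helper names, not part of the Arduino core, and the timeout and attempt count are assumptions to tune per setup. The board-specific part is guarded by ARDUINO so the retry logic itself can also be compiled on a host.

```cpp
#include <functional>

// Hypothetical helper (not part of the Arduino core): runs one complete
// connect attempt per iteration until one succeeds or attempts run out.
// Returns the 1-based number of the successful attempt, or 0 on failure.
int connectWithRetries(const std::function<bool()>& attemptOnce, int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        if (attemptOnce()) {
            return attempt;
        }
    }
    return 0;
}

#ifdef ARDUINO
#include <WiFi.h>

// One full attempt: the "wifi off" step followed by the "wifi on" block
// from the post, with a bounded wait instead of an endless while loop.
// timeoutMs is an assumed parameter, not taken from the original code.
bool attemptWifiOnce(const char* ssid, const char* password, uint32_t timeoutMs) {
    WiFi.mode(WIFI_OFF);
    WiFi.begin(ssid, password);
    WiFi.mode(WIFI_STA);
    WiFi.reconnect();
    uint32_t start = millis();
    while (WiFi.status() != WL_CONNECTED) {
        if (millis() - start >= timeoutMs) {
            return false;  // give up on this attempt; the caller retries
        }
        delay(100);
    }
    return true;
}
#endif
```

On the board this could be called as connectWithRetries([&]() { return attemptWifiOnce(ssid, password, 10000); }, 5), which gives up after five complete attempts instead of hanging in the status loop.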
The code behind it is below. Maybe I should also use the esp_* implementations directly. There seems to be room for optimization...
bool WiFiSTAClass::reconnect()
{
    if((WiFi.getMode() & WIFI_MODE_STA) != 0) {
        if(esp_wifi_disconnect() == ESP_OK) {
            return esp_wifi_connect() == ESP_OK;
        }
    }
    return false;
}

bool WiFiSTAClass::disconnect(bool wifioff)
{
    bool ret;
    wifi_config_t conf;
    *conf.sta.ssid = 0;
    *conf.sta.password = 0;
    WiFi.getMode();
    esp_wifi_start();
    esp_wifi_set_config(WIFI_IF_STA, &conf);
    ret = esp_wifi_disconnect() == ESP_OK;
    if(wifioff) {
        WiFi.enableSTA(false);
    }
    return ret;
}

bool WiFiGenericClass::mode(wifi_mode_t m)
{
    wifi_mode_t cm = getMode();
    if(cm == WIFI_MODE_MAX){
        return false;
    }
    if(cm == m) {
        return true;
    }
    esp_err_t err;
    err = esp_wifi_set_mode(m);
    if(err){
        log_e("Could not set mode! %u", err);
        return false;
    }
    if(m){
        return espWiFiStart();
    }
    return espWiFiStop();
}
Hi,
so sorry that we cannot reproduce your problem.
Could you provide a test project with your source code and more details about your test step? So that we can reproduce your problem and help debug it.
Thanks.
It took a while to reproduce it again. The program tries to reconnect every 20 seconds; after 1 to 2 hours it stops working, and this state also survives a power cycle.
led explanation:
@daald Any progress on this topic?
I'm also seeing strange behavior without changing my code, and having trouble connecting to WiFi.
As of today, does no one have a method that connects to WiFi with a solid 100% success rate? :/
[edit] I tried your command but had to lower the baud rate to 115200 to make it work; otherwise I got the following error:
A fatal error occurred: Timed out waiting for packet header
Plus using WiFi.mode(WIFI_OFF); and so far so good :)
progress? hard to say. I don't have much time to investigate more. my esp32 connects once a day to the server and so far this worked every day. but i record some statistics. the last days:
3 days, each 100% success
1 day, success after 3 fails
1 day, 100% success
1 day, success after 52 fails
1 day, success after 22 fails
3 days, each 100% success
I can also see that in most of the attempts the chip reaches STA_CONNECTED but then fails to connect to the server (probably no IP, or no data is transferred at all). In this case, the code doesn't call WIFI_OFF but runs the full 5-line connect block from above again.
I will probably not change anything anymore as long as it doesn't get worse.
@daald @lonerzzz @tdesmet @henricazottes This looks like an NVS problem: either the PHY calibration data or the WiFi parameters are corrupted. I tend to believe that the PHY calibration data was corrupted. The latest master or IDF v3.0 has a fix that lets you do PHY calibration every time you boot up. @tdesmet If you do this, you can quickly check whether I am right or wrong. Please see the menu Component config->PHY->Store calibration data in NVS. Also, you can provide the NVS data here.
@jack0c
Do phy calibration and store calibration data in NVS is checked.
Use a partition to store PHY init data is not checked.
Also I am using a custom partition table
# Espressif ESP32 Partition Table
# Name, Type, SubType, Offset, Size
nvs, data, nvs, 0x9000, 0x4000
otadata, data, ota, 0xd000, 0x2000
phy_init, data, phy, 0xf000, 0x1000
factory, 0, 0, 0x10000, 1M
ota_0, 0, ota_0, , 1M
ota_1, 0, ota_1, , 1M
storage, data, spiffs, , 512K,
At the moment I cannot reliably reproduce the issue; it only happens sometimes. Is there a way to check in code whether the PHY calibration data was corrupted and what caused the corruption? Is there a simple way to fix the corruption when it occurs?
@daald could you let me know the implementation of your WiFi event handler? And how/when do you reconnect the WiFi when SYSTEM_EVENT_STA_DISCONNECTED happens?
@liuzfesp I don't have an explicit event handler. WiFiGeneric.cpp is part of the Arduino implementation from https://github.com/espressif/arduino-esp32.git (currently rev 70d0d464 from Dec 19 2017 - obviously I updated once after I reported this bug, after I found another bug report which was marked as solved. unfortunately, I can't find it anymore, first installation time was Nov 25), I enabled logging in the Arduino 1.8.5 IDE under Tools->Core Debug Level->DEBUG. Currently, I'm happy how it works - not perfect but good enough for my use case.
@tdesmet Currently, we do not have a way to check whether the NVS calibration data is OK.
However, you can do calibration every time, without saving the result to NVS.
@jack0c if I uncheck Do phy calibration and store calibration data in NVS will it do calibration every time or do I have to call a function?
@tdesmet It will do calibration every time.
Guess this espressif/arduino-esp32#1001 thread relates also to the same underlying issue.
Can someone please enlighten me as to when and how initial RF calibration data is stored in NVS? I wrote a simple program to print out the data and ran it on some of my ESP32s. If a particular chip had connected to a WiFi network previously then calibration data was present (whether it is correct or not is a different matter). But if it was a new chip then esp_phy_load_cal_data_from_nvs() returned an error. Is the RF calibration data written at the factory? If so why can't I read it?
@tferrin In latest idf, we have two options :
Thanks @jack0c for the information! I have developed code to do option #2 in the Arduino IDE. Because of your Jan 13 post above, I thought that the RF calibration data stored in NVS might be getting corrupted somehow, so my code also stores an MD5 hash of the data so that I can later confirm it is still valid. So far I've not seen any NVS data corruption, but my systems have only been running for a few days and I haven't pushed very many new code downloads to them yet. If I find that NVS data is getting corrupted I will definitely open a new issue.
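The "store a digest next to the data and compare later" idea can be sketched roughly as follows. This is not the code from the attached sketch: it uses a 32-bit FNV-1a hash as a simple stand-in for MD5, and fnv1a32() and blobMatchesDigest() are hypothetical names. Reading the calibration blob and the stored digest out of NVS is left out.

```cpp
#include <cstddef>
#include <cstdint>

// FNV-1a (32-bit), used here only as a lightweight stand-in for the MD5
// digest mentioned above. It detects accidental corruption, not tampering.
uint32_t fnv1a32(const uint8_t* data, size_t len) {
    uint32_t hash = 2166136261u;      // FNV offset basis
    for (size_t i = 0; i < len; ++i) {
        hash ^= data[i];
        hash *= 16777619u;            // FNV prime
    }
    return hash;
}

// Returns true when the blob still matches the digest recorded earlier,
// i.e. when no corruption was detected.
bool blobMatchesDigest(const uint8_t* data, size_t len, uint32_t storedDigest) {
    return fnv1a32(data, len) == storedDigest;
}
```

The digest would be computed once right after a known-good calibration and stored separately; any later mismatch means either the data or the digest has changed since then.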
@daald @lonerzzz @tdesmet @henricazottes @tferrin Any problem?
As far as the RF calibration data stored in NVS getting corrupted goes, I have not observed it. As I noted in my post above, I wrote a program to store the MD5 hash of the calibration data in NVS, and it's still valid on my test systems 3 weeks later. For the time before the MD5 hash was computed and stored, it is impossible to say whether the calibration data had been corrupted. And from re-reading the posts above, it's not clear to me that this issue was ever confirmed to be an NVS data corruption problem.
Reading other threads about WiFi issues, I see things come up pretty frequently about AUTH_FAIL problems and all of these may be related to the lingering issue with reconnect() or some combination of other function calls that put the WiFi subsystem into a bad state.
@tferrin yes, WiFi connection problems are among the most frequently reported issues. We hope to make them easier to debug by adding more debug logging to the WiFi library in IDF v3.1.
I might also have run into this problem. I have two test boards out there that worked fine for about 3 months, running code that scans all access points in range, connects to a specified network and then sends the scan data as an HTTP request, sleeping for a minute between successful transmissions. The panic handler is set to reboot, and there is a watchdog that also reboots on bark.
Power cycling the boards didn't help. The AP scans are coming out mostly empty, in an environment where a scan previously returned 10+ access points.
The code has been compiled with 1e0710f1b24429a316c9c34732aa17bd3f189421 as head. Toolchain version 1.22.0-61
@tferrin could you please share your code which checks the NVS and rebuilds it if corrupted? I have a board which started failing after I uploaded a new version of my working sketch (the same sketch works on another board, so the sketch is fine). I want to check whether NVS is the reason for my board's failure.
Sure; attached below. (The attachment is named RFcal.txt but should be renamed to RFcal.ino so the Arduino IDE recognizes it as a sketch. I could not attach it with the .ino extension because GitHub doesn't recognize that as an attachable file type.)
@daald @lonerzzz @tdesmet @henricazottes we now support WiFi scan/connect debug logs in IDF v3.3. If you encounter WiFi connection issues again, please take the following steps to debug it:
Then we will analyze the WiFi log.
@daald @tdesmet @henricazottes Hi all, could you share any updates on the issue? Thanks.
Sorry, I have no opportunity to test it right now and won't have one in the near future :/
After disabling the BSSID and calibration data storage to NVS in menuconfig, ~60 devices have been quite happy in the field for 6 months, sleeping for 60 seconds and then waking up and sending an HTTP request via WLAN.
The first prototypes failed after about 3 months, which matches the specified 100k write endurance of the flash on my boards quite closely.
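The endurance estimate can be checked with a little arithmetic. This is a rough sketch under simplifying assumptions: every wake cycle rewrites the same flash location a fixed number of times and no wear levelling spreads the writes. The function name daysUntilEndurance() and its parameters are made up for illustration.

```cpp
// Days until a flash location reaches its rated write endurance, assuming
// each wake cycle writes that location writesPerWakeCycle times and that
// writes are not spread over other locations (no wear levelling).
double daysUntilEndurance(double enduranceCycles,
                          double secondsPerWakeCycle,
                          double writesPerWakeCycle) {
    const double secondsPerDay = 86400.0;
    double writesPerDay = (secondsPerDay / secondsPerWakeCycle) * writesPerWakeCycle;
    return enduranceCycles / writesPerDay;
}
```

With 100k cycles, a 60-second wake cycle and one write per wake, this gives 100000 / 1440, or roughly 69 days. A real cycle is longer than the 60-second sleep because it includes the time spent awake, which moves the estimate toward the three months reported.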
Will close this for now, please poke me to reopen if you still have the issue, thanks
I flashed your code and it seems everything is OK, though this infinite AUTH_EXP loop is still present :(
Hi @Iron24fit, could you check it with de798352 or the latest IDF v3.1/3.2/3.3/master (you can choose any one of them)?
@liuzfesp Thanks for the answer. Surprisingly, my problem was solved by changing my device. Maybe it was a hardware failure; other members reporting this issue have also mentioned a few percent of devices failing.
Sorry for my english ^^'.
I just reactivated this project. Unfortunately, the first thing I did was updating the esp32 and arduino sw. Now, with my example from above
I can't reproduce it at all anymore. But in a negative way: It always connects the first time, but never again afterwards, until I restart the board by flashing or pushing the reset button or power cycle.
I tried to restore the old esp32 files but no success, for whatever reason. But in the new code: did something relevant change?
Just to let you know: in my newest experiments, I use the following code in the beginning of the setup() function:
WiFi.persistent(false);
fixWifiPersistencyFlag();
where fixWifiPersistencyFlag() is defined as:
#include "esp_wifi.h" // only for fixWifiPersistencyFlag()
/**
 * Disable persistent mode, see https://github.com/espressif/arduino-esp32/issues/1393
 */
void fixWifiPersistencyFlag() {
    wifi_init_config_t cfg = WIFI_INIT_CONFIG_DEFAULT();
    Serial.printf("cfg.nvs_enable before: %d\n", cfg.nvs_enable);
    // Note: cfg is a local copy, so this assignment only takes effect
    // if cfg is subsequently passed to esp_wifi_init().
    cfg.nvs_enable = 0;
}
I cannot really prove (yet) that it works, but it feels like it does.
For the reconnect problem, I'm now following and commenting https://github.com/espressif/arduino-esp32/issues/653