Tasmota Device does not respond to ARP

Created on 3 Dec 2019 · 91 Comments · Source: arendst/Tasmota

PROBLEM DESCRIPTION

Periodically my Tasmota Devices stop responding to ARP requests. This looks to have been a historical problem that was recently fixed here but I still have the same behaviour on Tasmota 7.1.1. I don't experience this issue on another device running ESPHome - I'm wondering if a change needs to be made to Tasmota to accommodate this fix?

REQUESTED INFORMATION

_Make sure you have performed every step and checked the applicable boxes before submitting your issue. Thank you!_

  • [X] Read the Contributing Guide and Policy and the Code of Conduct
  • [X] Searched the problem in issues
  • [X] Searched the problem in the wiki
  • [X] Searched the problem in the forum
  • [X] Searched the problem in the chat
  • [X] Device used (e.g., Sonoff Basic): _____
  • [X] Tasmota binary firmware version number used: 7.1.1

    • [X] Pre-compiled

    • [ ] Self-compiled

    • [ ] IDE / Compiler used: _____

  • [X] Flashing tools used: OTA via HTTP
  • [X] Provide the output of command: Backlog Template; Module; GPIO:
  13:38:42 MQT: tasmota_92E5D5/stat/RESULT = {"NAME":"Teckin SP23","GPIO":[0,255,56,255,0,134,0,0,131,17,132,21,0],"FLAG":0,"BASE":45}
13:38:42 MQT: tasmota_92E5D5/stat/RESULT = {"Module":{"0":"Teckin SP23"}}
13:38:42 MQT: tasmota_92E5D5/stat/RESULT = {"GPIO1":{"0":"None"},"GPIO3":{"0":"None"}}
  • [x] If using rules, provide the output of this command: Backlog Rule1; Rule2; Rule3:
  Rules output here:
  • [x] Provide the output of this command: Status 0:
  13:39:08 MQT: tasmota_92E5D5/stat/STATUS = {"Status":{"Module":0,"FriendlyName":["Tasmota Outdoor Xmas Lights"],"Topic":"tasmota_92E5D5","ButtonTopic":"0","Power":0,"PowerOnState":3,"LedState":1,"LedMask":"FFFF","SaveData":1,"SaveState":1,"SwitchTopic":"0","SwitchMode":[0,0,0,0,0,0,0,0],"ButtonRetain":0,"SwitchRetain":0,"SensorRetain":0,"PowerRetain":0}}
13:39:08 MQT: tasmota_92E5D5/stat/STATUS1 = {"StatusPRM":{"Baudrate":115200,"GroupTopic":"sonoffs","OtaUrl":"http://thehackbox.org/tasmota/release/tasmota.bin","RestartReason":"Software/System restart","Uptime":"0T00:46:23","StartupUTC":"2019-12-03T12:52:45","Sleep":1,"CfgHolder":4617,"BootCount":25,"SaveCount":135,"SaveAddress":"F9000"}}
13:39:08 MQT: tasmota_92E5D5/stat/STATUS2 = {"StatusFWR":{"Version":"7.1.1(tasmota)","BuildDateTime":"2019-12-01T13:00:09","Boot":4,"Core":"2_6_1","SDK":"2.2.2-dev(38a443e)","Hardware":"ESP8266EX"}}
13:39:08 MQT: tasmota_92E5D5/stat/STATUS3 = {"StatusLOG":{"SerialLog":2,"WebLog":2,"MqttLog":0,"SysLog":0,"LogHost":"","LogPort":514,"SSId":["BETA-2.4Ghz",""],"TelePeriod":300,"Resolution":"558180C0","SetOption":["000A8009","2805C8000100060000005A00000000000000","00000200","00000000"]}}
13:39:08 MQT: tasmota_92E5D5/stat/STATUS4 = {"StatusMEM":{"ProgramSize":562,"Free":440,"Heap":23,"ProgramFlashSize":1024,"FlashSize":1024,"FlashChipId":"144068","FlashMode":3,"Features":["00000809","8FDAE397","043683A0","22B617CD","01001BC0","00007881"],"Drivers":"1,2,3,4,5,6,7,8,9,10,12,16,18,19,20,21,22,24,26,29","Sensors":"1,2,3,4,5,6,7,8,9,10,14,15,17,18,20,22,26,34"}}
13:39:08 MQT: tasmota_92E5D5/stat/STATUS5 = {"StatusNET":{"Hostname":"tasmota_92E5D5-1493","IPAddress":"10.0.130.117","Gateway":"10.0.130.1","Subnetmask":"255.255.255.0","DNSServer":"10.0.130.1","Mac":"BC:DD:C2:92:E5:D5","Webserver":2,"WifiConfig":4}}
13:39:08 MQT: tasmota_92E5D5/stat/STATUS6 = {"StatusMQT":{"MqttHost":"mqtt.lan.marrold.co.uk","MqttPort":1883,"MqttClientMask":"tasmota_%06X","MqttClient":"tasmota_92E5D5","MqttUser":"tasmota","MqttCount":2,"MAX_PACKET_SIZE":1000,"KEEPALIVE":30}}
13:39:08 MQT: tasmota_92E5D5/stat/STATUS7 = {"StatusTIM":{"UTC":"Tue Dec 03 13:39:08 2019","Local":"Tue Dec 03 13:39:08 2019","StartDST":"Sun Mar 31 02:00:00 2019","EndDST":"Sun Oct 27 03:00:00 2019","Timezone":"+00:00","Sunrise":"07:24","Sunset":"15:55"}}
13:39:08 MQT: tasmota_92E5D5/stat/STATUS9 = {"StatusPTH":{"PowerDelta":0,"PowerLow":0,"PowerHigh":0,"VoltageLow":0,"VoltageHigh":0,"CurrentLow":0,"CurrentHigh":0}}
13:39:08 MQT: tasmota_92E5D5/stat/STATUS10 = {"StatusSNS":{"Time":"2019-12-03T13:39:08","ENERGY":{"TotalStartTime":"2019-10-30T21:55:42","Total":0.179,"Yesterday":0.068,"Today":0.034,"Power":0,"ApparentPower":0,"ReactivePower":0,"Factor":0.00,"Voltage":0,"Current":0.000}}}
13:39:08 MQT: tasmota_92E5D5/stat/STATUS11 = {"StatusSTS":{"Time":"2019-12-03T13:39:08","Uptime":"0T00:46:23","UptimeSec":2783,"Heap":23,"SleepMode":"Dynamic","Sleep":1,"LoadAvg":999,"MqttCount":2,"POWER":"OFF","Wifi":{"AP":1,"SSId":"BETA-2.4Ghz","BSSId":"6E:3B:6B:27:B2:F1","Channel":6,"RSSI":74,"LinkCount":2,"Downtime":"0T00:00:11"}}}
  • [ ] Provide the output of the Console log output when you experience your issue; if applicable:
    _(Please use_ weblog 4 _for more debug information)_
  Console output here:

TO REPRODUCE

Connect to Wifi and wait a period of time - the device will stop responding to ARP requests

EXPECTED BEHAVIOUR

The device should _always_ respond to ARP requests

SCREENSHOTS

_If applicable, add screenshots to help explain your problem._

ADDITIONAL CONTEXT

_Add any other context about the problem here._

(Please, remember to close the issue when the problem has been addressed)

Add to Docs troubleshooting workaround

All 91 comments

Please, in the console type sleep 0 and try again for testing.

_(Just for reference, original issue comes from https://github.com/esp8266/Arduino/issues/6873)_

Done, will report back shortly. Thanks

@ascillato I'm afraid Sleep 0 will have the opposite effect, leaving little time to the Wifi stack to respond to ARP requests.

@marrold can you also try Sleep 50?

@s-hadinger He is using "SleepMode":"Dynamic","Sleep":1 according to his status 0.

  • The Sleep 0 test was about turning off the adaptive sleep to see if that improves things. The main loop leaves time for the SDK, and with sleep=0 it is called sooner.

Anyway, you are totally right about also testing the Tasmota Default: sleep 50

@marrold Please, perform both tests:

  • Sleep 0

and

  • Sleep 50

Sleep 0 is currently looking good, I get a response to an ARP request every time - but I will need to leave it over night to be sure.

I had the ARP issue with default settings ( Sleep 50 ) so I don't think this will help, but I will try it once the current configuration has been running for 24 hours

Thanks

As this issue is not common, can you describe your setup in a bit more detail? Router brand, how many Wi-Fi devices are connected to it, etc.

Sure. It's a Mikrotik hAP AC and I replicated with a Mikrotik wAP Access Point too. Unfortunately I don't have another vendor to do detailed testing but I had a similar experience at work with Tasmota and Aruba access points.

There are around 8 devices connected to the 2.4GHz network, and it's bridged to my LAN. The device doing the ARP and the Tasmota device are in the same subnet. The configuration is fairly average.

Although the device stops responding to ARP requests, the interaction with Home Assistant continues to function as expected, presumably because it still has an ARP entry.

Looking at ESPHome's code, seems that it is using sleep 0 by default. So, that correlates with your observation of sleep 0 of Tasmota and your other device with ESPHome.

https://github.com/esphome/esphome/blob/0f406c38ebe0c5c2e00bba412082c8a589101265/esphome/components/wifi/wifi_component_esp8266.cpp#L61-L76 :

bool WiFiComponent::wifi_apply_power_save_() {
  sleep_type_t power_save;
  switch (this->power_save_) {
    case WIFI_POWER_SAVE_LIGHT:
      power_save = LIGHT_SLEEP_T;
      break;
    case WIFI_POWER_SAVE_HIGH:
      power_save = MODEM_SLEEP_T;
      break;
    case WIFI_POWER_SAVE_NONE:
    default:
      power_save = NONE_SLEEP_T;
      break;
  }
  return wifi_set_sleep_type(power_save);
}

In that case, this ARP issue is an Arduino Core issue that shows up only when Wi-Fi sleep is in use. DTIM defaults to 3 on most routers, and that should allow the ESP8266 to sleep without running into this issue.

So,

  • Seems that the SDK is not serving the ARP requests when waking up from WIFI SLEEP?
  • Or, that the router is not respecting the DTIM time for making the request and it is requesting the ARP while the device is sleeping?
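To put rough numbers on the DTIM reasoning: an access point buffers broadcast frames (ARP requests included) for sleeping stations and releases them right after a DTIM beacon, so a sleeping station has one chance per DTIM period to hear them. The sketch below is only arithmetic. It assumes a beacon interval of 100 TU, which is a common default but is not stated anywhere in this thread; one TU is 1024 microseconds.

```python
TU_SECONDS = 1024e-6  # one 802.11 time unit (TU) is 1024 microseconds


def dtim_period_ms(dtim: int, beacon_interval_tu: int = 100) -> float:
    """Time between DTIM beacons in milliseconds.

    beacon_interval_tu=100 is an assumed, typical value, not one
    reported for the access points discussed in this thread.
    """
    return dtim * beacon_interval_tu * TU_SECONDS * 1000


for dtim in (1, 3):
    print(f"DTIM {dtim}: buffered broadcasts released every {dtim_period_ms(dtim):.1f} ms")
```

Under that assumption, DTIM 1 gives a delivery window roughly every 102 ms and DTIM 3 roughly every 307 ms. A station that skips or misses those windows never sees the broadcast at all.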

Please, can you check the Wi-Fi parameters of your router (DTIM, lease time, etc.)?

There must be something extra there that produces this issue. I have a similar setup to yours and have never had an ARP issue with any value of sleep. That is why I suspect DTIM or some other parameter.

From Arduino Core, there is an explanation of DTIM and Sleep at: https://github.com/esp8266/Arduino/blob/2309a1c9cbbcfd1a29a1395ca06ad2af8a0d23d0/libraries/ESP8266WiFi/src/ESP8266WiFiGeneric.cpp#L266-L292

In your STATUS 0 it is shown "LinkCount":2,"Downtime":"0T00:00:11", so your device had wifi disconnections but your RSSI is ok.

  • Do you have fixed wifi channel at 6?
  • Do you have too many strong wifi signals nearby?
  • DTIM value and other wifi parameters?
  • Lease time ?
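The LinkCount and Downtime fields mentioned above can be pulled out of the STATUS11 payload with a few lines of Python. This is a minimal sketch: the payload is an abbreviated copy of the one posted in the issue, and the `<days>T<hh>:<mm>:<ss>` reading of the Downtime string is an assumption based on how the Uptime field looks in the same output.

```python
import json

# Abbreviated copy of the STATUS11 payload from the issue description.
PAYLOAD = (
    '{"StatusSTS":{"Time":"2019-12-03T13:39:08","Uptime":"0T00:46:23",'
    '"UptimeSec":2783,"SleepMode":"Dynamic","Sleep":1,'
    '"Wifi":{"AP":1,"SSId":"BETA-2.4Ghz","Channel":6,"RSSI":74,'
    '"LinkCount":2,"Downtime":"0T00:00:11"}}}'
)


def duration_to_seconds(text: str) -> int:
    """Convert '<days>T<hh>:<mm>:<ss>' (assumed format) to seconds."""
    days, clock = text.split("T")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return int(days) * 86400 + hours * 3600 + minutes * 60 + seconds


def wifi_link_summary(payload: str) -> dict:
    """Pick out the Wi-Fi link fields that hint at disconnections."""
    wifi = json.loads(payload)["StatusSTS"]["Wifi"]
    return {
        "channel": wifi["Channel"],
        "rssi": wifi["RSSI"],
        "link_count": wifi["LinkCount"],
        "downtime_s": duration_to_seconds(wifi["Downtime"]),
    }


print(wifi_link_summary(PAYLOAD))
```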

Unfortunately DTIM is not configurable on my Access Point. Some posts suggest it's set to 1.

  • Do you have fixed wifi channel at 6?
    Yes, it's fixed to Channel 6
  • Do you have too many strong wifi signals nearby?
    It is a little congested. 6 is the clearest channel but I am in a block of flats with lots of SSIDs
  • DTIM value and other wifi parameters?

Export from router:

set [ find default-name=wlan1 ] adaptive-noise-immunity=ap-and-client-mode antenna-gain=3 band=2ghz-g/n bridge-mode=disabled country="united kingdom" disabled=no distance=indoors frequency=2437 \
    frequency-mode=regulatory-domain hw-retries=5 l2mtu=1530 mode=ap-bridge mtu=1530 multicast-helper=disabled name=BETA-2.4Ghz security-profile=BETA-2.4Ghz ssid=BETA-2.4Ghz vlan-id=130 vlan-mode=\
    use-tag wireless-protocol=802.11
  • Lease time ?
    DHCP Lease time is 1 hour

Thanks

Thanks for sharing all that info :+1:

After your test with sleep 0, if that goes fine, let's try the following test to narrow down what is causing this:

In Tasmota console, please type:

setoption60 1
sleep 50

Thanks.

Setoption60 changes the type of sleep:

https://github.com/arendst/Tasmota/blob/e8a135f6c0e1de6e22cccaf6cd52ddc47f478bdb/tasmota/support_wifi.ino#L153-L157

The devices continued to reply to ARP requests this morning so I can confirm sleep 0 is working.

I've now applied:

setoption60 1
sleep 50

I will report back this evening with the results.

Thanks for your assistance!

I'd say it's better than before (Dynamic / 1) but still poor. I performed these tests within a few seconds of each other and the results were significantly different:

sudo nping --arp-type ARP 10.0.130.117 -c 20
...
Max rtt: N/A | Min rtt: N/A | Avg rtt: N/A
Raw packets sent: 20 (840B) | Rcvd: 19 (532B) | Lost: 1 (5.00%)
sudo nping --arp-type ARP 10.0.130.117 -c 20
...
Max rtt: N/A | Min rtt: N/A | Avg rtt: N/A
Raw packets sent: 20 (840B) | Rcvd: 9 (252B) | Lost: 11 (55.00%)

Thanks
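When comparing several runs like the two above, it can help to extract the loss figures programmatically rather than by eye. The following is a small sketch that parses only the "Raw packets sent" summary line in the form shown in this thread; other Nping versions may format the line differently, so treat the pattern as an assumption.

```python
import re

# Matches the summary line as it appears in the Nping output quoted above.
SUMMARY_RE = re.compile(
    r"Raw packets sent:\s*(\d+).*?"
    r"Rcvd:\s*(\d+).*?"
    r"Lost:\s*(\d+)\s*\(([\d.]+)%\)"
)


def parse_nping_summary(line: str) -> dict:
    """Extract sent/received/lost counts from an Nping summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not an Nping summary line")
    sent, rcvd, lost = (int(match.group(i)) for i in (1, 2, 3))
    return {"sent": sent, "rcvd": rcvd, "lost": lost, "loss_pct": float(match.group(4))}


runs = [
    "Raw packets sent: 20 (840B) | Rcvd: 19 (532B) | Lost: 1 (5.00%)",
    "Raw packets sent: 20 (840B) | Rcvd: 9 (252B) | Lost: 11 (55.00%)",
]
for summary in map(parse_nping_summary, runs):
    print(summary)
```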

OK, so it performs well only with sleep 0, and every other option loses some packets, right?

I have bad performance with:

setoption60 0
sleep > 0

and:

setoption60 1
sleep 50

I have not tried Normal Sleep with 0, should I ?

setoption60 1
sleep 0

Yes please, because the type of sleep the ESP8266 uses is controlled by setoption60.

Thanks for all these tests :+1:


This is the output in my case:

$ sudo nping --arp-type ARP 192.168.1.22 -c 20

Starting Nping 0.6.47 ( http://nmap.org/nping ) at 2019-12-04 16:24 -03
SENT (0.1706s) ARP who has 192.168.1.22? Tell 192.168.1.101
RCVD (0.3497s) ARP reply 192.168.1.22 is at DC:4F:22:76:FA:98
SENT (1.1733s) ARP who has 192.168.1.22? Tell 192.168.1.101
RCVD (1.3997s) ARP reply 192.168.1.22 is at DC:4F:22:76:FA:98
SENT (2.1735s) ARP who has 192.168.1.22? Tell 192.168.1.101
RCVD (2.4497s) ARP reply 192.168.1.22 is at DC:4F:22:76:FA:98
...
...
Max rtt: N/A | Min rtt: N/A | Avg rtt: N/A
Raw packets sent: 20 (840B) | Rcvd: 20 (920B) | Lost: 0 (0.00%)
Nping done: 1 IP address pinged in 19.46 seconds

using the defaults: SETOPTION60 0 (Dynamic) and SLEEP 50 on Tasmota 7.1.1 of the precompiled bin.

@andrethomas @arendst @s-hadinger @Jason2866 @mike2nl

I can't reproduce this issue. Any idea why @marrold has this ARP issue if sleep is different than 0 ?

I can reproduce

Starting Nping 0.7.01 ( https://nmap.org/nping ) at 2019-12-04 21:47
SENT (0.8867s) ARP who has 192.168.42.7? Tell 192.168.44.1
SENT (1.8871s) ARP who has 192.168.42.7? Tell 192.168.44.1
SENT (2.8884s) ARP who has 192.168.42.7? Tell 192.168.44.1
SENT (3.8897s) ARP who has 192.168.42.7? Tell 192.168.44.1
SENT (4.8908s) ARP who has 192.168.42.7? Tell 192.168.44.1
SENT (5.8921s) ARP who has 192.168.42.7? Tell 192.168.44.1
SENT (6.8934s) ARP who has 192.168.42.7? Tell 192.168.44.1
SENT (7.8947s) ARP who has 192.168.42.7? Tell 192.168.44.1
SENT (8.8960s) ARP who has 192.168.42.7? Tell 192.168.44.1
SENT (9.8973s) ARP who has 192.168.42.7? Tell 192.168.44.1
SENT (10.8986s) ARP who has 192.168.42.7? Tell 192.168.44.1
RCVD (11.0690s) ARP reply 192.168.42.7 is at 60:01:94:B2:A1:04
SENT (11.9001s) ARP who has 192.168.42.7? Tell 192.168.44.1

Using Mikrotik APs with Capsman - This is something I observed a long time ago already, but I always assumed it was down to the way Mikrotik bridges the interfaces to form the Capsman setup (similar to UniFi).

Either way, it does eventually respond and I've not observed any other side effects such as wifi or mqtt disconnects from the device side, which is why I stayed with the theory of ARP propagation with Mikrotik Capsman and did not explore it any further.

This theory is further supported by the fact that my development/test wireless network (Using a stand-alone Trendnet wifi AP) does not exhibit this behaviour.

Initial tests of setoption60 1 / sleep 0 indicate it's reliably responding to ARP requests.

The response from @andrethomas suggests this could be related to Mikrotik access points. I have tested on 3 different Mikrotik models / chipsets with the same result. I've enabled debug logging but it doesn't appear to reveal any low level information.

My earlier assumption that I had the same issue on Aruba access points was _incorrect_.

If the ARP problem shows itself as not being able to open the web GUI, then I have had this problem for quite a long time. I had the feeling that it improved somewhat with Core 2.6.1. If I can't open the web GUI, pinging the ESP helps: after a few pings it's possible to connect to the web GUI. I use a Fritzbox 7490 and the current dev build with Core 2.6.1.


@kugelkopf123 Yes, I also experience that after some time the webui becomes inaccessible but that it recovers from this by pinging the device which eventually responds (usually after 10-20 seconds). I cannot, however, reproduce this on my Trendnet AP - I only experience this on my Mikrotik AP's.

I have had this since we were still running on 2.4.2 - It may also have been there on 2.3.0, but at that time I only had a few devices so I may just not have noticed it.

It is not uncommon for different behaviour from the different core versions - This has indeed been seen as other unrelated issues between different core versions working differently with different wireless equipment.

Interesting topic nevertheless.

Ok, so, as this ARP issue is related to Mikrotik and its source is neither the Arduino Core nor Tasmota, and as there is a workaround (using SLEEP 0 to respond quickly to ARP requests), we can close this issue.

If someone finds any way in which Tasmota can help on this issue, please, do not hesitate on asking to reopen this issue.

Thanks everyone for helping on this issue and for testing. Very appreciated.

> the source of it is not the Arduino Core

I don't think we've proven definitively that it's not related to Arduino Core - this issue doesn't affect any other WiFi device connected to a Mikrotik AP and appears to be an interoperability issue between the two.

We are open to suggestions on what else we can test.

I've raised a support ticket with Mikrotik. If others experiencing the same issue could do so that would be appreciated. If they come back with anything I will update this issue.

Thanks for your help

Ok, I did a differential test.

If I PPTP/L2TP to the Mikrotik router I cannot reproduce this issue, so to me it's one of two possibilities.

This leads me to conclude that the Mikrotik router knows exactly how to reach the device.

The plot thickens...

@andrethomas I don't quite understand, sorry. You mean you connected remotely to the access point and could reach your Tasmota device? It might be that it has an ARP entry from background DNS / NTP type traffic to the router?

@marrold Yes, a VPN connection from outside the home to the Mikrotik router that connects my home to the internet... albeit not the same one as the one which is maintaining the LAN so it was a routed connection through the one router and then obviously the Mikrotik maintaining the LAN so it seems the ARP on the Mikrotik itself is valid all the time which is why I could not reproduce it.

image

So by connecting this way I get immediate icmp responses whereas if I try it from a linux host on the home lan it times out for the first few pings and then starts getting responses.

So obviously the Mikrotik is not forgetting the ARP but somehow this is lost on the local LAN.

^ I've reopened an issue with core as I can reproduce this with minimal code.

Hi! I have the same issue with Mikrotik hap ac^2 and three tasmota tywe3s devices.
I will try to make one of the devices with static ip, another - with sleep 50 & SetOption60 0 and report the results over night. Also I have updated mtik's firmware to 6.46 to see if it helps also.

@Greefon As discussed in this thread, the workaround is sleep 0. No other option will make the Arduino core answer unsynced ARP requests from your Mikrotik router.

@marrold

Hi,

Did you open a ticket to Mikrotik Support?

As this issue is now very well documented and easily reproduced, maybe they will consider fixing these unsynced ARP requests. Mikrotik devices don't allow changing DTIM, and it seems they don't sync messages to sleeping devices either.

So this issue now can show that DTIM or at least a sync of requests, is needed.

This issue can not be reproduced in TP-LINK routers for example. In those, the DTIM parameter can be changed, and the ESP8266 adapts itself to the new DTIM and this ARP issue never happens.

Yes, I just sent an email.
I installed the latest firmware of mikrotik and also set ARP ping for a night with 10 second delay for all three devices to see in what time they would go offline. I don't know if it was right decision for testing with constant arp pinging but in the morning all the devices were online and worked as expected.
The parameters as I said were sleep 50 & SetOption60 0

@marrold has created a post on the Mikrotik user forum - https://forum.mikrotik.com/viewtopic.php?f=2&t=154613

In the meantime, the Gratuitous ARP workaround @d-a-v has submitted in #6889 is probably the best solution for these misbehaving Access Points.
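For readers unfamiliar with the term: a gratuitous ARP is an unsolicited broadcast in which a host announces its own IP-to-MAC mapping, so neighbours can refresh their caches without the host having to hear a request first. The sketch below only builds the 42-byte Ethernet frame for such an announcement, using the MAC and IP address from the STATUS5 output in this issue. It is an illustration of the frame layout, not the code from the referenced pull request, and it does not send anything.

```python
import socket
import struct


def gratuitous_arp_frame(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying an ARP announcement for (mac, ip)."""
    sender_mac = bytes.fromhex(mac.replace(":", ""))
    sender_ip = socket.inet_aton(ip)
    broadcast = b"\xff" * 6

    # Ethernet header: destination, source, EtherType 0x0806 (ARP).
    ethernet = broadcast + sender_mac + struct.pack("!H", 0x0806)

    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6,            # hardware address length
        4,            # protocol address length
        1,            # opcode 1 = request (a reply, opcode 2, is also used in practice)
        sender_mac,
        sender_ip,
        b"\x00" * 6,  # target hardware address, ignored by receivers
        sender_ip,    # target IP equals sender IP: this is what makes it gratuitous
    )
    return ethernet + arp


frame = gratuitous_arp_frame("BC:DD:C2:92:E5:D5", "10.0.130.117")
print(len(frame), frame.hex())
```

The 42-byte length matches the Nping output elsewhere in the thread, where 20 ARP requests add up to 840 bytes.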

I have opened a ticket with Mikrotik support. They asked for more details so they're investigating at least. I also posted on the Forum, as did someone else - not sure if that's someone monitoring this thread?

Whilst I was investigating to provide more details on the issue opened with the Arduino core I discovered this is also affecting a Huawei and Samsung Mobile phone - it seems anything that sleeps is affected.

I can say with some certainty that this isn't a Tasmota / Arduino Core issue now. I will continue to try to troubleshoot the issue and follow up with Mikrotik but as @ascillato has mentioned I think the Gratuitous ARP is the best work around until Mikrotik fix it.

Out of curiosity, has anyone experienced this issue for a long time? One thing I've not yet done is revert my AP to an older version.

@Greefon are you saying you couldn't replicate the issue? Did you get 100% (or close to) responses to the ARPs overnight?

> Out of curiosity, has anyone experienced this issue for a long time? One thing I've not yet done is revert my AP to an older version.

Yes, as indicated I noticed this behaviour a long time ago already.

The overnight test done by @Greefon would likely have 100% success because the ARP entry is continuously re-cached every time the test runs, so it never has a chance to expire by exceeding its TTL on the machine that is doing the pinging.
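The masking effect described here can be shown with a toy model of the pinging machine's ARP cache. This is a deliberately simplified sketch: it assumes every exchange succeeds and refreshes the entry, and the 60-second TTL is a hypothetical value chosen for illustration. Real neighbour caches (for example on Linux) have more states than this.

```python
def probes_needing_broadcast(probe_times, ttl):
    """Return the probe times at which the cache entry had expired.

    Only these probes force a fresh broadcast "who has" request; all
    others are answered from (and refresh) the cached entry.
    """
    broadcasts = []
    expires_at = None
    for t in probe_times:
        if expires_at is None or t >= expires_at:
            broadcasts.append(t)  # cache miss: a broadcast request is needed
        expires_at = t + ttl      # simplification: every exchange refreshes the entry
    return broadcasts


TTL = 60  # seconds, hypothetical

# Probing every 10 s for an hour: only the very first probe broadcasts.
print(probes_needing_broadcast(range(0, 3600, 10), TTL))

# Probing every 2 hours: every probe has to broadcast again.
print(probes_needing_broadcast([0, 7200, 14400], TTL))
```

In this model a tight probing loop exercises the broadcast path exactly once, which is why a test with long idle gaps is the one that reveals the problem.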

Can you test this tasmota.zip tasmota build? It has PR https://github.com/esp8266/Arduino/pull/6889 in core 2.6.1 as backport included.
EDIT: Link deleted. The test version has an error and does not send any Gratuitous ARPs.
Sorry for this 😓

Thanks for testing !

> > Out of curiosity, has anyone experienced this issue for a long time? One thing I've not yet done is revert my AP to an older version.
>
> Yes, as indicated I noticed this behaviour a long time ago already.
>
> The overnight test done by @Greefon would likely have 100% success because the ARP entry is continuously re-cached every time the test runs, so it never has a chance to expire by exceeding its TTL on the machine that is doing the pinging.

You were right. As soon as I have finished pinging and waited for couple of hours it stopped responding again.

> Can you test this tasmota.zip tasmota build? It has PR esp8266/Arduino#6889 in core 2.6.1 as backport included.

Thanks! I also uploaded this firmware. I'll keep you up to date on how it works.

Program Version | 7.1.2.4(tasmota)
-- | --

I have the same issue and also created a post in the mikrotik forum.

I use the Sonoff T1EU2C as a roller shutter, therefore I can't use your binaries directly.
(at least in the past I had to edit the user_config_override.h)

#ifndef USE_SHUTTER
#define USE_SHUTTER // Add Shutter support (+6k code)
#endif

I have four of the above mentioned devices, let me know if I should support you with testing.
(currently I set the sleep to 0 and hope that this will work until we have a better solution)

Thanks in advance!

BR
Michael

Unfortunately I must say that firmware 7.1.2.4(tasmota) didn't work in my case. Within 2 hours the device stopped responding...

bad news, me too..
sleep 0 does not solve my issue. Four of four devices aren't pingable.
Will return to sleep 50.

@Greefon / @flmma2019 - ICMP ping is not the right tool to test this, you need to generate ARP requests or at least check your ARP table. Even better if you can get a packet capture. If sleep 0 doesn't fix it you probably have a different issue.

@Jason2866 / @d-a-v - I've flashed the provided bin but I don't actually see any Gratuitous ARPs being broadcast?
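On a Linux host, checking the ARP table as suggested above can be done by reading `/proc/net/arp`. The sketch below parses text in that file's format and reports whether each entry is complete (flag bit 0x2) or still unresolved. The sample text is illustrative: the first entry uses the device address from this issue, while the second address and the interface name `eth0` are hypothetical.

```python
# Illustrative sample in the format of Linux's /proc/net/arp.
SAMPLE = """\
IP address       HW type     Flags       HW address            Mask     Device
10.0.130.117     0x1         0x2         bc:dd:c2:92:e5:d5     *        eth0
10.0.130.118     0x1         0x0         00:00:00:00:00:00     *        eth0
"""

ATF_COM = 0x2  # entry is complete, i.e. a MAC address has been resolved


def parse_proc_net_arp(text: str) -> dict:
    """Map each IP address to its MAC, interface and completeness flag."""
    entries = {}
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6:
            continue
        ip, _hw_type, flags, mac, _mask, device = fields[:6]
        entries[ip] = {
            "mac": mac,
            "device": device,
            "complete": bool(int(flags, 16) & ATF_COM),
        }
    return entries


for ip, entry in parse_proc_net_arp(SAMPLE).items():
    print(ip, entry)
```

To run it against a live table, pass the contents of `/proc/net/arp` instead of `SAMPLE`. An entry that stays incomplete while the device is otherwise reachable over MQTT is the symptom described in this issue.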

> @Greefon / @flmma2019 - ICMP ping is not the right tool to test this, you need to generate ARP requests or at least check your ARP table. Even better if you can get a packet capture. If sleep 0 doesn't fix it you probably have a different issue.
>
> @Jason2866 / @d-a-v - I've flashed the provided bin but I don't actually see any Gratuitous ARPs being broadcast?

I used arp-ping, not regular icmp

> I used arp-ping, not regular icmp

Ok. For gratuitous ARPs you may just need to check your ARP table or a packet capture, but as I'm not seeing them broadcast I think there is an issue with 7.1.2.4 or the fix is included but not enabled.

I confirm that sleep 0 works
Maybe leave it be with this setup?

Just for my information. The tasmota command sleep 0, does it set wifi to WiFi.setSleepMode(WIFI_NONE_SLEEP) ?
If so, then please also try to measure the current consumption of the ESP module.
I've seen that the ESP will still enter low(er) power mode with this setting active.
But it may take some time. (and a lot of calls to delay())
As soon as the ESP will enter low(er) power mode, these issues with missing ARP packets will emerge.

How do I measure the consumption? By connecting directly to the ESP pins (which is quite complicated) or in software? Neither way is clear to me as I don't know how to do it, so please explain.

I had connected my ESP board to a power supply which shows the current consumption.

I have a finished motor with chip inside
image

Is it confirmed that https://github.com/esp8266/Arduino/pull/6889 doesn't send gratuitous packets ?

@d-a-v I flashed my device with @Jason2866 's bin file and couldn't see any gratuitous ARP packets whilst running a capture, but that's not to say there's not an issue with the compiled binary.

Well, Mikrotik support has not been a big help yet, but my setup of 3 ESP8266 devices with the setting "sleep 0" has been working fine for three days already.

I was wondering about this ARP setting in the MikroTik config for the wireless interface: (it is there for any interface, but to me makes most sense to set it for this interface)
image

As far as I understand, the "local-proxy-arp" is mainly for using a router as bridge between 2 network segments for the same subnet.
An access point in AP bridge mode, does sound like it is just that.
But I was wondering, if it only acts as an ARP proxy for hosts connected on its WiFi, or will it act as a router for all hosts in the subnet?
If it is the first option, then it would be perfect for our purpose, but if it is the second option, then it would severely limit the bandwidth of my entire network as my MikroTik mAP is only 100 Mbps ethernet and also not equipped with serious fast routing capabilities. (apart from the fact you would then double the traffic on the network)

Does anyone know if this could be a useful setting for this issue?

I believe proxy-arp is only useful if you want the router to accept packets for devices on the same subnet but not on the same physical layer.

Well, the ESP nodes kind of are like that, right?
The nodes are on WiFi and something like the MQTT broker is probably on the LAN.
This would sound like proxy-arp could be useful, as the AP then does send a new ARP request to its other physical layer and hopefully does cache it.
But I am not really sure about the usefulness of local-proxy-arp.
The only examples I could find was for communication on layer 3 between vlans, but it still sounds like communication between computers on the same VLAN would get multiple replies for an ARP request and thus randomly toggle between routing via the router or direct communication.

I'll do some tests over the weekend but I suspect you will find many many conflicting ARP responses. This is often seen in incorrectly configured wlan bridges.

> But I was wondering, if it only acts as an ARP proxy for hosts connected on its WiFi, or will it act as a router for all hosts in the subnet?

So this is what happens when we use local-proxy-arp...

image

The AP's respond with their own MAC because they think the traffic should be routed through them so even the actual MAC of the ESP is different and this causes a potential IP conflict which is not helping the situation.

The normally expected operation should read like

image

On the PC side once it gets the MAC it stores it in the ARP cache on the PC so then it causes the PC to ask directly to that MAC instead of using the broadcast (As expected) which results in an immediate response from the ESP

image

If you remove the ARP cache entry for the IP address from the PC's ARP cache then it sends the request to the broadcast address (As expected) but it seems the ESP only starts responding to the broadcast request after several requests were sent:

image

So my thinking is that for some reason the ESP is not always catching the broadcast request for reasons I cannot explain.

> So my thinking is that for some reason the ESP is not always catching the broadcast request for reasons I cannot explain.

If using SLEEP 0, you have the same issue?

@ascillato No, with sleep 0 it responds to the broadcast request without fail.

With my TP-LINK router, the device responds to ARP requests no matter the sleep value. So, this issue is Mikrotik related. It seems that TP-LINK stores the ARP/IP/DTIM settings so that it queries devices in power saving mode only when it knows they are awake for the beacon.

I hope mikrotik can solve this ARP issue. I don't know what else we can do in order to ask mikrotik support to solve this. Other power saving devices also experience this.

> With my TP-LINK router, the device responds to ARP requests no matter the sleep value. So, this issue is Mikrotik related.

Not only MikroTik, also Fritzbox. Those are the only 2 brands I have here in my network.

Just an entry to flag https://github.com/arendst/Tasmota/issues/1553 as a related issue.

This is my post on the Mikrotik Forum

I have an open support ticket but other than an initial response asking which device was affected, I've not heard anything. I'll update if I hear anything

@ascillato out of curiosity, is your TP-Link running stock or custom firmware?

Thanks

No ARP issue using Stock firmware in TP LINK - TL-WR740N

The advice from the forum and Mikrotik Support is to set multicast-helper to full - I have always assumed this was a workaround rather than a fix, but someone on the forum post above linked to documents from several vendors (Including Aruba, TP-Link) stating that they enable it by default.

I have now configured multicast-helper=full and can confirm I reliably receive ARP replies from the affected Tasmota and Android devices.

I guess the mystery is solved, but it would be good to get the Gratuitous ARP change tested as perhaps not everyone will be able to change this setting or have access to the Access Point Config.

Thanks

> I guess the mystery is solved, but it would be good to get the Gratuitous ARP change tested as perhaps not everyone will be able to change this setting or have access to the Access Point Config.

Yes, I have an Apple AirPort Extreme and a Fritzbox 7490, and neither of them exposes those settings.

> I guess the mystery is solved, but it would be good to get the Gratuitous ARP change tested as perhaps not everyone will be able to change this setting or have access to the Access Point Config.

But this does not solve the issue of not being able to route any (layer 3?) traffic for a while when you change to another AP.
To me that sounds also a bit as an ARP problem, but it may be unrelated to esp8266/Arduino. (not sure yet)

@kugelkopf123 Do you have any Multicast settings?

@marrold Good stuff getting that from Mikrotik - Made the setting on mine and it works as expected now, Thanks :)

@TD-er I think the issue you're referring to when WiFi roaming is a fundamental networking challenge with client / vendor specific work arounds rather than anything Arduino / Mikrotik specific.

@andrethomas Glad it's working. I guess we should document it somewhere?

@TD-er

> But this does not solve the issue of not being able to route any (layer 3?) traffic for a while when you change to another AP.

In my case, I changed the settings on the capsman interface and I've tested roaming as well as turning AP's off to force all devices over the one of the AP's and back to another and it seems to handle it just fine.

I am still a bit perplexed on the naming convention Mikrotik used for this since I know I read over that flag originally when trying to find some ARP related setting that could explain the behaviour and dismissed the multicast-helper setting flag on the Mikrotik wireless configuration to perhaps be related to multicast as in the traditional sense of the term and I do not use multicast for anything.

Having said this, I am still not clear how the router/AP is doing anything different because from the perspective of wireshark it looks exactly the same as I found on my Trendnet test AP which shows that the response is not coming from the router at all but is coming from the actual device as intended.

image

So yes, although this solves it for Mikrotik-based systems, I am not sure that the naming convention is correct, which leads me to another question: what is this setting called on the other APs for which we're seeing reports of this, such as Fritzbox?

Either way, I do not think it relates to proxy-arp.

MikroTik's multicast-helper is converting multicast into unicast.
So what it does is not just broadcast multicast packets once, but addressing them to separate hosts on the wifi network (assuming it is set on the wifi interface in the mikrotik)
So it does cause some overhead, which does take more bandwidth, but it does make sure the packets are received by wireless nodes.

An ARP request is sent as a link-layer broadcast, which an AP delivers over WiFi in the same way as multicast traffic (N.B. its answer is unicast), so enabling this does help with ARP issues.
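To make the multicast-to-unicast idea above concrete, here is a minimal Python sketch of what such a helper conceptually does. It is a toy model only: the `Frame` class, `to_unicast_copies`, and the MAC addresses are made up for illustration and say nothing about how MikroTik actually implements multicast-helper.

```python
from __future__ import annotations

from dataclasses import dataclass, replace

BROADCAST = "ff:ff:ff:ff:ff:ff"


@dataclass(frozen=True)
class Frame:
    dst: str      # destination MAC address
    src: str      # source MAC address
    payload: str  # opaque payload, e.g. "ARP who-has 192.168.1.50"


def is_group_address(mac: str) -> bool:
    """True for broadcast and multicast MACs (lowest bit of the first octet is set)."""
    return int(mac.split(":")[0], 16) & 1 == 1


def to_unicast_copies(frame: Frame, stations: list[str]) -> list[Frame]:
    """Return one unicast copy per associated station for a group-addressed frame.

    Frames that are already unicast pass through unchanged.
    """
    if not is_group_address(frame.dst):
        return [frame]
    return [replace(frame, dst=mac) for mac in stations if mac != frame.src]


# Hypothetical stations associated with the AP.
stations = ["5c:cf:7f:92:e5:d5", "aa:bb:cc:00:00:01"]
arp_request = Frame(BROADCAST, "00:11:22:33:44:55", "ARP who-has 192.168.1.50")
for copy in to_unicast_copies(arp_request, stations):
    print(copy.dst, copy.payload)
```

Each copy is then an ordinary unicast frame, which the AP can buffer and retry per station like any other unicast traffic.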

proxy-arp is something different and it was just something I was wondering about, but as far as I know now, it will cause more harm than good, as we do not route the traffic to another subnet, nor run on separate VLANs.

@marrold

@TD-er I think the issue you're referring to when WiFi roaming is a fundamental networking challenge with client / vendor specific work arounds rather than anything Arduino / Mikrotik specific.

Just to be sure you know.
I am not using Tasmota, but am the developer of ESPEasy, so the way I do things differs from how it is done in Tasmota.
In ESPEasy the network layer does try a few networks, when some appear to be unstable.
This means it will actively disconnect and reconnect to a different AP. This is not roaming, but just like when you switch APs on your mobile.
I use event based wifi, so the response to changes is rather quick.
I assume Tasmota does it differently, so a change to another AP may take quite a bit more time meaning the ARP cache in the switches may already have purged the entry of the node that's reconnecting. (typical ARP TTL is 30 seconds)
That's also just about the same time I see that a node may not be able to send or receive traffic. (sometimes even more, but then there may be more switches in between the APs)

Understand now what you mean by roaming... for me, roaming does not require the client to reconnect to another AP - the handing over of the client device from one AP to another is done by the access point management system on the main Mikrotik router... so in theory, this is completely transparent from a client device's perspective.

Tasmota can behave in a similar way to what you explain and this will still surely cause the same issues stemming from the ARP TTL perspective.

Edit: I do not observe any additional overhead after enabling multicast-helper

Edit: I do not observe any additional overhead after enabling multicast-helper

If you would experience overhead, then it would be noticeable with multiple nodes connected to the same AP (and all interested in this multicast traffic)
The general idea of multicast is to send out the data once and it will be received by many.
When the AP does convert it to unicast, the same (small) packet has to be sent again for any host that wants to receive it and maybe the packet has to be re-sent a number of times if the client does not receive it.
But the number of ARP requests is not really large, and the packets are not big either.
So the overhead is merely theoretical in a domestic setting.
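As a rough back-of-the-envelope check of the overhead argument above, the following Python sketch counts how many frames the AP has to put on the air for one converted packet. The station count and retry figure are assumed numbers for illustration, not measurements from this thread.

```python
def frames_sent(stations: int, avg_retries: int = 0) -> int:
    """Frames the AP transmits to deliver one group-addressed packet as unicast."""
    return stations * (1 + avg_retries)


# One ARP request, 20 associated stations, one retry each on average (assumed).
print(frames_sent(20, avg_retries=1))  # prints 40
```

A plain broadcast would be a single frame, so the multiplier is real, but for small and infrequent ARP packets it stays negligible on a home network.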

It may be different if you stream multicast data other than ARP packets. Such traffic will also be affected by this setting.

A big use case for multicast traffic is IPTV. Without it, IPTV wouldn't work and would flood the network(s).

I do not have the Multicast package installed on the Mikrotik - Hence I don't think its related to conventional multicast... which is why I question the naming convention.

If mikrotik is really converting all multicast to unicast traffic it is a MESS!
Anyway the naming is bad.

If mikrotik is really converting all multicast to unicast traffic it is a MESS!
Anyway the naming is bad.

It is supposed to change it only on the interface for which you set it.
For example on the WLAN interface, should only translate it to unicast.
And as far as I know, you have to "subscribe" (not the correct term, but cannot remember the correct one) or else it would be broadcast to any port on any switch on your network and that's also not what you want.

Not sure if it is also a limitation on this multicast setting, but I've seen that some options in MikroTik firmware are only available on license level N and up (e.g. level 4 for local-proxy-arp)

If mikrotik is really converting all multicast to unicast traffic it is a MESS!

Check the thread linked above. I thought the same but seems the other major vendors are doing it one way or another, TP-Link, Aruba, Ubiquiti to name a few.

Anyway the naming is bad.

Welcome to Mikrotik.

I was curious whether there is such a setting in my Lancom access points. I think I have no problems because there is an option called "ARP treatment", and this option is on by default.
After a search I found this explanation from Lancom for this option:
If a station in the LAN wants to establish a connection to a station in the WLAN that is in energy-saving mode, this often either does not work at all or only with great delays. The reason is that broadcast delivery, e.g. ARP requests to stations located in Powersave cannot be guaranteed by the base station. If you switch on the ARP treatment, the base station answers ARP queries for the stations it has registered with itself and thus more reliably in such cases. Note: As of LCOS version 8.00, this switch activates an analog treatment for IPv6 neighbor solicitations.
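The behaviour Lancom describes, where the base station answers ARP queries for stations registered with it, can be sketched roughly as below. This is a toy Python model with hypothetical names (`ApArpResponder`, `register`, `handle_arp_request`) and example addresses; it is not Lancom's implementation.

```python
from __future__ import annotations


class ApArpResponder:
    """Toy model of an AP that answers ARP on behalf of its associated stations."""

    def __init__(self) -> None:
        self._stations: dict[str, str] = {}  # IP address -> MAC address

    def register(self, ip: str, mac: str) -> None:
        """Record a station when it associates (or when its IP is learned)."""
        self._stations[ip] = mac

    def unregister(self, ip: str) -> None:
        """Forget a station when it disassociates."""
        self._stations.pop(ip, None)

    def handle_arp_request(self, target_ip: str) -> str | None:
        """Return the MAC to answer with, or None if the request must be forwarded."""
        return self._stations.get(target_ip)


ap = ApArpResponder()
ap.register("192.168.1.50", "5c:cf:7f:92:e5:d5")  # example addresses only
print(ap.handle_arp_request("192.168.1.50"))  # answered by the AP itself
print(ap.handle_arp_request("192.168.1.99"))  # unknown station, forward as usual
```

Because the AP replies from its own table, the sleeping station never has to receive the broadcast at all.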

the base station answers ARP queries for the stations it has registered

That would be nice!

Whilst you're active @Jason2866, I couldn't see any Gratuitous ARPs being sent from your bin with fix applied?

@marrold You are right; to be honest, I don't know why. I must be overlooking something simple.
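For anyone checking a capture for these packets, here is a minimal Python sketch of what a gratuitous ARP frame looks like on the wire (request form: broadcast destination, sender IP equal to target IP). The function name and example addresses are hypothetical, and this only builds the bytes; it is not the code used in the Tasmota test build.

```python
import socket
import struct


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request for (mac, ip)."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    ip_bytes = socket.inet_aton(ip)

    # Ethernet header: broadcast destination, our MAC, EtherType 0x0806 (ARP).
    eth = b"\xff" * 6 + mac_bytes + struct.pack("!H", 0x0806)

    # ARP header: Ethernet (1), IPv4 (0x0800), hlen 6, plen 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += mac_bytes + ip_bytes    # sender hardware / protocol address
    arp += b"\x00" * 6 + ip_bytes  # target hardware (unknown) / protocol address

    return eth + arp


frame = build_gratuitous_arp("5c:cf:7f:92:e5:d5", "192.168.1.50")
print(len(frame), frame.hex())
```

In Wireshark such a packet shows up as an ARP request whose sender and target IP addresses are identical, which makes it easy to filter for.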

@kugelkopf123 Do you have any Multicast settings?

No, nothing. In the Fritzbox interface I have the option "optimize for WebTV". I tried that, but it makes no difference.

optimize for WebTV

That sounds like IGMP snooping

And then, of course, there's always this idea:

https://gist.github.com/SupraJames/779475fefb6dfe7af315a68f03fe63dd

@kugelkopf123 Sorry, there is no solution from the Tasmota side. That is the reason why this issue is closed. It is a general problem in a WiFi environment with devices that use power-save functions. To solve this, the WiFi equipment has to take care of it.
If there is no option for this in your Fritzbox, the only way to get it solved is to open a ticket with AVM.

@andrethomas Have you had any problems since enabling multicast-helper ? When it's enabled I seem to have stability issues with _all_ connected clients

@marrold No issues which I have noticed except on one nodemcu v3 board (https://www.banggood.com/NodeMCU-V3-340G-Lua-WIFI-Module-Integration-Of-ESP8266-Extra-Memory-32M-Flash-p-1175347.html) but that is most likely because of the specific board rather than the wifi setting... this particular device does not like 2.6.1 but runs fine on core 2.4.2 - it has thermal design issues I believe.

And then there's the funny aspect of this particular board - It is powered from the OTG port on an RB2011UiAS-2HnD-IN and resides right next to it on the wall in the corner of the dining room so it is officially the closest to any wifi AP that any of the devices are.

Something I still have not checked is how the wifi setting impacts on the power consumption of the ESP8266 - I suspect it is running a bit hotter than usual but just a guess - still need to find the time to perform further analysis.
