Espeasy: [Rules] Parsing of [task#var] syntax changed since 20190928

Created on 11 Oct 2019  路  62Comments  路  Source: letscontrolit/ESPEasy

See https://github.com/letscontrolit/ESPEasy/issues/1625#issuecomment-541114010

Quote from @thomastech

I have rule that publishes the Nextion's IDX data to my Openhab2 broker. When it sends this:

Publish /%sysname%/NEXTION/idx,[NEXTION#idx]

It sends it twice, first with idx NULL data, then ~60mS later is sends it again with valid idx data. Just to be clear, this rule should only publish the data once, not twice.

My OpenHab log shows this:

2019-10-11 08:17:23.396 [WARN ] [b.core.events.EventPublisherDelegate] - given new state is NULL, couldn't post update for 'NextionIDX'
2019-10-11 08:17:23.462 [vent.ItemStateChangedEvent] - NextionIDX changed from 17 to 26

The problem is averted if I do this:

let,10,[NEXTION#idx]
Publish /%sysname%/NEXTION/idx,[VAR#10]

The previous release (ESP_Easy_mega-20190928_test_core_260_sdk222_alpha_ESP8266_4M1M) does not have this problem. The change log on the new release mentions several changes to the rules parser. Seems to me those changes have broken something.

Rules Bug

All 62 comments

@thomastech Can you give a bit more context on the rules where this is being used?
For example the on ... do trigger before your publish line.

Thanks for creating the issue ticket.
This is the rule:

on NEXTION#idx do
  if [NEXTION#idx]>=10 and [NEXTION#idx]<=30
      Publish /%sysname%/NEXTION/idx,[NEXTION#idx]
  endif
endon

Can you also try to write the value to the log using something like this:

logentry,nextionvalue_[NEXTION#idx]_

Just to see if the formatting of the value is a bit strange, I padded the example with a character.

Added the logentry.

Nothing unusual appears in weblog. No sign of the NULL data either, but the naughty NULL is being published.

Here's a snippet of the log:

60270: EVENT: FreeRam#ram=17880.00
61324: EVENT: Clock#Time=Fri,16:32
61871: EVENT: DRYERAX#ax=400.00
61972: EVENT: DRYERAY#ay=172.00
62072: EVENT: DRYERAZ#az=328.00
62186: SYS : 17656.00
62188: EVENT: FreeRam#ram=17656.00
63755: EVENT: NEXTION#idx=26.00
63779: ACT : Publish /ESPEZ_Laundry/NEXTION/idx,26
63788: Command: publish
63792: ACT : logentry,nextionvalue_26_
63800: Command: logentry
63800: logentry,nextionvalue_26_
63872: EVENT: NEXTION#value=0.00
64017: EVENT: DRYERAX#ax=400.00
64116: EVENT: DRYERAY#ay=172.00
64242: EVENT: DRYERAZ#az=328.00
64355: SYS : 17272.00
64357: EVENT: FreeRam#ram=17272.00
65871: EVENT: DRYERAX#ax=400.00
65996: EVENT: DRYERAY#ay=172.00
66093: EVENT: DRYERAZ#az=328.00
66205: SYS : 17536.00
66206: EVENT: FreeRam#ram=17536.00
66550: SW : State 0.00
66550: EVENT: WASHER#ac=0.00
66564: ACT : NEXTION,page0.va_WasherAC.val=0
67858: EVENT: DRYERAX#ax=452.00
67962: EVENT: DRYERAY#ay=236.00
68058: EVENT: DRYERAZ#az=360.00
68172: SYS : 17520.00
68173: EVENT: FreeRam#ram=17520.00
69758: SYS : 1.00
69760: EVENT: RUNTIME#days=0.00
69862: SYS : 53.20
69864: EVENT: SYSLOAD#load=53.20
69961: EVENT: DRYERAX#ax=452.00
70063: EVENT: DRYERAY#ay=236.00
70159: EVENT: DRYERAZ#az=360.00
70296: SYS : 17352.00
70297: EVENT: FreeRam#ram=17352.00
71393: WD : Uptime 1 ConnectFailures 0 FreeMem 17416 WiFiStatus 3
71871: EVENT: DRYERAX#ax=316.00
71973: EVENT: DRYERAY#ay=228.00
72070: EVENT: DRYERAZ#az=384.00
72188: SYS : 17488.00
72190: EVENT: FreeRam#ram=17488.00
73841: ACT : logentry,nextionvalue_16_
73849: Command: logentry
73849: logentry,nextionvalue_16_
73921: EVENT: NEXTION#value=0.00
74063: EVENT: DRYERAX#ax=316.00
74165: EVENT: DRYERAY#ay=228.00
74263: EVENT: DRYERAZ#az=384.00
74375: SYS : 17224.00
74377: EVENT: FreeRam#ram=17224.00
75984: EVENT: DRYERAY#ay=228.00
76080: EVENT: DRYERAZ#az=384.00
76195: SYS : 17488.00
  • Thomas

OK, thanks for testing.
I think something is going on regarding the changes I made in the parsing of numerical strings.
My intention was to make it a bit more tolerant, but I think I may have caused some other issue in transforming it back to a string.

I will now start sleeping on it. (it's almost 2am here)

I recognize it's a late hour for you. Put down the keyboard, turn off the monitor, and get some rest.

  • Thomas

I think this one is related to #2660 and #2681
Can you test if this test build does fix this issue?
See also this PR: https://github.com/letscontrolit/ESPEasy/pull/2667 (test build is from last night and PR has a lot more fixes since then, but nothing of the later commits are related to this issue)

@TD-er: Unfortunately ESP_Easy_mega-20191016-11-PR_2667_test_ESP8266_4M_VCC has the problem. Sorry for delivering bad news.

The last "good" version is ESP_Easy_mega-20190928_test_core_260_sdk222_alpha_ESP8266_4M1M.

  • Thomas

Thanks for the update.
I am now looking at the publish command.
It does not use the parse function to collect the parameters.
So that should be changed, but I wonder if that's the complete fix here.

I got to go now, so no time to make a full test build.
Can you test for yourself using these changes in src/src/Commands/MQTT.h
This is a test build I made to confirm the changes do at least compile.

String Command_MQTT_Publish(struct EventStruct *event, const char* Line)
{
  if (WiFiConnected()) {
    // ToDo TD-er: Not sure about this function, but at least it sends to an existing MQTTclient
    int enabledMqttController = firstEnabledMQTTController();
    if (enabledMqttController >= 0) {
      // Command structure:  Publish,<topic>,<value>
      String topic = parseStringKeepCase(Line, 1);
      String value = parseStringKeepCase(Line, 2);
      if (topic.length() > 0 && value.length() > 0) {
        //@giig1967g: if payload starts with '=' then treat it as a Formula end evaluate accordingly
        if (value.c_str()[0]!='=')
          MQTTpublish(enabledMqttController, topic.c_str(), value.c_str(), Settings.MQTTRetainFlag);
        else
          MQTTpublish(enabledMqttController, topic.c_str(), String(event->Par2).c_str(), Settings.MQTTRetainFlag);
      }
      return return_command_success();
    }
    return F("No MQTT controller enabled");
  }
  return return_not_connected();
}

This on top of this PR: https://github.com/letscontrolit/ESPEasy/pull/2667

@TD-er, I won't have a chance to do a custom build for awhile (buried in work projects). So I'll have to wait for a 4M test build I can upload.

  • Thomas

"Test build" as in the "testing" build? (in need of Nextion plugin for example?)
Or is the build I linked also enough to test it?

I can build one for you in about an hour.

"Test build" as in the "testing" build? (in need of Nextion plugin for example?)

Yup, "Testing" Build. I use Nextion and other "testing" plugins.

Or is the build I linked also enough to test it?

I did try that one. The device list was not happy at all.

  • Thomas

I did try that one. The device list was not happy at all.

And if the device list is not happy, Thomas is also not happy?

And if the device list is not happy, Thomas is also not happy?

It's Friday! Hard not to be happy.

@thomastech

I posted this in another issue (#2660) and there it seemed to be fixed for that issue at least.
Can you try custom_ESP8266_4M1M_issue2660_Nextion.bin this one?

It has at least the Nextion included.

Can you try custom_ESP8266_4M1M_issue2660_Nextion.bin this one?

That one is very promising. I don't see the NULL issue now. But plugin 45 (Gyro MPU 6050) is missing, so not all rules are executing.

  • Thomas

Can you test this build ?
Not all test builds are included, but at least one is.

Looking good; Build ESP_Easy_mega-20191016-19-PR_2667_test_ESP8266_4M_VCC does not cause the NULL issue. Thanks for fixing it.

  • Thomas

Thanks for testing it :)

Now comes the hard work, to find out what (open) issues may have been related to this bug in parsing the parameters in a command.

I'm seeing a increase in hardware resets on both of my NodeMCU's using the Build ESP_Easy_mega-20191016-19-PR_2667_test_ESP8266_4M_VCC.

To see if these resets are caused by the new PR I have reverted back to ESP_Easy_mega-20190928_test_core_260_sdk222_alpha_ESP8266_4M1M. It had been running with better stability, so I'll make a comparison.

  • Thomas

What kind of resets?

I also found an issue with the WiFi (stupid typo of me)

void markGotIP() {
  lastGetIPmoment = millis();
  wifiStatus      != ESPEASY_WIFI_GOT_IP;
  processedGotIP = false;
}

Change != into |=

What kind of resets?

System Info Reset Reason reported hardware reset. I did not check serial log's boot messages .

  • Thomas

24 hour test results using two NodeMCU's:
No reboots with ESP_Easy_mega-20190928_test_core_260_sdk222_alpha_ESP8266_4M1M.
Same period with ESP_Easy_mega-20191016-19-PR_2667_test_ESP8266_4M_VCC had several reboots.

If you think the != typo is responsible then I can test again with a new bin. I don't use VCC function, so no need to add it to the testing compile.

  • Thomas

I found a few other issues, so I don't think it is just the WiFi.

Can you try this test build: ESPEasy_mega-20191028-21-PR_2698.zip
Please let me know if it does work and it would be great if you could do some more thorough testing so I can merge it for tomorrow's build.

Not sure if it does fix your "resets", since I don't have a good idea what is causing them.
Could they be related to this issue: https://github.com/letscontrolit/ESPEasy/issues/2684 ?

@TD-er I installed ESP_Easy_mega-20191028-21-PR_2698_test_core_260_sdk222_alpha_ESP8266_4M1M on two NodeMCU's. I don't see the NULL problem. I will let them run and report back late today.

  • Thomas

Ten hours on the clock, no reboots yet. I'll report again in the morning (PST).

  • Thomas

24 hours, no reboots.

  • Thomas

And what was the stability of the 20190928_test_core_260_sdk222_alpha_ESP8266_4M1M build you referred to?

Did you also see other 'strange' issues with rules handling?

Status update: 36 hours, no reboots.

And what was the stability of the 20190928_test_core_260_sdk222_alpha_ESP8266_4M1M build you referred to?

No reboots during a three week test period.

Did you also see other 'strange' issues with rules handling?

None noticed while running ESP_Easy_mega-20191028-21-PR_2698_test_core_260_sdk222_alpha_ESP8266_4M1M.

  • Thomas

Any updates on this one?
Last night's build has also a few fixes in rules handling, but no need to update if this one of you is on a winning streak of breaking the uptime record :)

Any updates on this one?

Both test units still running, no reboots. 5+ days so far.

Last night's build has also a few fixes in rules handling, but no need to update if this one of you is on a winning streak of breaking the uptime record :)

I'll update after the next reboot. Looks like that may take awhile. :)

  • Thomas

I'll update after the next reboot. Looks like that may take awhile. :)

OK, see you in 42 days 0 hours 21 minutes (or more)
That's the uptime of my weather station (running a lot of tasks out there in the rain)

42 days and counting is impressive. Let's see if I can beat that.

  • Thomas

This is the uptime of that weather station.
laatste-jaar
I do think we made some progress in stability in the last few months :)
The last "reboot" was due to a firmware update.

I made it the official wallpaper for the upcoming GUI release 馃槈

Screenshot_20191105_090106_com android chrome

I do think we made some progress in stability in the last few months :)

Yes! Thank you for taming the reboot demons.

I have several Sonoff TH10 modules that need to be updated with the latest firmware. Some are in very inconvenient locations (attic, foundation crawl space), so updates are troublesome. So choosing the best update bin is important. With that in mind, for best stability on the Sonoffs do you recommend _normal_ESP8266_1M_ or _normal_core_241_ESP8266_1M_?

  • Thomas

I think the best is not even included in the build.
Core 2.6.0 does have a significant number of fixes which all contribute to the gained stability.
I will add 1M core 2.6.0 builds to the selection.

I will add 1M core 2.6.0 builds to the selection.

Thanks, I appreciate it.

  • Thomas

I just made a test build for you, where I changed the "regular" definition in platformio.ini into "core_stage", which is core 2.6.0
So all builds should be with core 2.6.0, unless explicitly stated otherwise.

Please try it first on a well reachable node :)

Thanks for the special build. I installed ESP_Easy_mega-20191104_normal_ESP8266_1M on my attic Sonoff Basic. Sysinfo reports ESP82xx Core 2.6.0-dev stage.

I'll let it run for a day before flashing more Sonoff devices. If anything unusual appears I will let you know.

Edit: I also updated a Sonoff TH10. So two devices are testing the new firmware.

  • Thomas

After 6 days 9 hours one of my NodeMCU devices did a reboot today. So a few minutes ago I updated it from _ESP_Easy_mega-20191028-21-PR_2698_test_core_260_sdk222_alpha_ESP_ to _ESP_Easy_mega-20191104_test_core_260_sdk222_alpha_ESP8266_4M1M_.

So far I haven't won the 42 day lottery. But I'm feeling lucky, so maybe this time I'll hit the jackpot.

  • Thomas

Let's hope so :)
As my smart watch always tries to tells me, it takes 2 days to start a streak :)

Status Update:
My ESP_Easy_mega-20191104_test_core_260_sdk222_alpha_ESP8266_4M1M test device rebooted today. Total run time was about 3 days.

Sysinfo Report:
Boot: Manual reboot (1)
Reset Reason: Software/System restart

  • Thomas

Hmm too bad :(

Can you test the testbuild linked here: https://github.com/letscontrolit/ESPEasy/pull/2728#issuecomment-553169609 ?

No problem. A couple minutes ago I installed ESP_Easy_mega-20191108-36-PR_2728_test_core_260_sdk222_alpha_ESP8266_4M1M on a NodeMCU board (11-12-2019 timestamp).

  • Thomas

Great!
I have the feeling the MQTT part and network connection timeouts are the last hurdles to take for near-infinite uptimes :)

14+ hours so far and no reboot. The little NodeMCU is doing its best to achieve your near-infinite uptime goal.

  • Thomas

40+ hours. Still running.

  • Thomas

Four days, still running.

  • Thomas

6 days 17 hours. Still running.

  • Thomas

Still running, 8 days 14 hours so far. Today I flashed a second NodeMCU to expand the test.

Any chance you have a Sonoff version (PR_2728_normal_core_260_sdk222_alpha_ESP8266_1M) I can test too?

  • Thomas

Well, that PR has already been merged, so maybe you can also test today's latest (which will be tomorrows nightly build)

Edit:
Maybe you can test a core 2.6.1 SDK3 build.

Will the pending build have a 2.6.1 SDK3 build for Normal Core ESP8266 1M?

  • Thomas

Not yet, but if you like one, I may add it.

Great. I can test the latest changes on one of my Sonoffs (normal ESP8266 1M). So please add whichever build (core 2.6.0 sdk222, 2.6.1 sdk3, etc.) you feel is best to test.

  • Thomas

I will mix up a blend of builds for you tomorrow.
Right now, I don't think it is wise to start changing code on a computer where I cannot build and test. (building on this Chromebook takes forever)
Maybe you can see some untested builds in tomorrow's build (around 5 am local time, so in about 4 hours)
The thing is, SDK3 appears to be able to connect with more ease without reboots on WiFi AP with bad RSSI.

So that's why it might be nice to test it too.

Not sure why, but the minimal 1M OTA builds for core 2.6.1 SDK3 are small enough when I compile them in Windows, but they do not appear in the nightly build, due to build size in Travis:

FAILED: too large 618080 > 614384 Bytes. (minimal_core_261_ESP8266_1M_OTA)
FAILED: too large 618080 > 614384 Bytes. (minimal_core_261_ESP8285_1M_OTA)
FAILED: too large 616720 > 614384 Bytes. (minimal_core_261_sdk3_ESP8266_1M_OTA)
FAILED: too large 616720 > 614384 Bytes. (minimal_core_261_sdk3_ESP8285_1M_OTA)

Built on Windows:

Building .pio\build\minimal_core_261_sdk3_ESP8266_1M_OTA\firmware.bin
Advanced Memory Usage is available via "PlatformIO Home > Project Inspect"
RAM:   [=====     ]  50.5% (used 41356 bytes from 81920 bytes)
Flash: [=======   ]  68.6% (used 612884 bytes from 892912 bytes)

That's 4k difference.

I added this one: https://github.com/letscontrolit/ESPEasy/pull/2765
Or do you also need a ESP8285 one?

I have several Sonoffs. Only one is ESP8285. For this test I will use a ESP8266 Sonoff.

  • Thomas

Yesterday I installed ESP_Easy_mega-20191108-36-PR_2728_test_core_260_sdk222_alpha_ESP on a second NodeMCU. It rebooted twice earlier today, then a few minutes ago browser access stopped responding. Required a cold reboot to get it back.

EDIT: I can test the SDK3 bin on it. But a testing version with Plugins 45 and 75 is not provided in the latest PR.

  • Thomas
Was this page helpful?
0 / 5 - 0 ratings