I have code for a home weather station that worked fine on the March dev build (March 10), but since the April releases (April 13 and later) it reboots on a WDT reset when using ucg with the ili9341_18x240x320 display. I have switched the firmware back and forth several times.
The output on the new builds is:
ets Jan 8 2013,rst cause:4, boot mode:(3,6)
wdt reset
load 0x40100000, len 28600, room 16
tail 8
chksum 0x22
load 0x3ffe8000, len 2592, room 0
tail 0
chksum 0x4a
load 0x3ffe8a20, len 8, room 8
tail 0
chksum 0x3b
csum 0x3b
What can this message mean?
Sorry, I can't reproduce the situation with a simple example.
The message means that your Lua code keeps the CPU busy for too long, so the watchdog timer resets the ESP.
But this code works well on previous builds (based on SDK 2.2.1).
SDK 3.0 moved a lot of constant data from RAM to Flash ROM. The upside is that this move freed about 18 KB of RAM, but one consequence is that we have had reports of some runtime performance issues caused by the extra unaligned exception handler overhead within the SDK code. (See espressif/ESP8266_NONOS_SDK#233). You may be impacted by this. See our FAQ for more discussion.
Can the watchdog timer be adjusted to avoid cases like mine?
No. It's an SDK function. You can make your individual Lua tasks shorter by breaking up the processing, or issue the tmr.wdclr() API function. Read the FAQ.
Something is going wrong.
I have a module 'dispDraw.lua' (https://yadi.sk/d/fsK1UjiXstDBjw) in my project.
The WDT fires in two places in my code:
node.compile('dispDraw.lua')
and
local d = require('dispDraw')
I put tmr.wdclr() before each of these statements. Nothing changed.
On previous builds everything works.
Egor, this is more of a #1010 issue and really outside the scope of this list, but I will give you a brief response. You need to read my FAQ. The non-OS SDK is not preemptive. Individual task executions are recommended to be less than 50 mSec to allow other tasks such as the SDK ones which service the WiFi stack to fire. If your tasks are much longer than this then you will start to have WiFi timeouts and failures.
The software watchdog is a safety backstop and fires after a task has been running for about 3 seconds. You can 'feed' the timer using the wdclr() timer call which will stop it firing but this won't fix any consequential network time-outs. You should really break up your processing into smaller compute chunks.
Functions like d.DrawValues() can loop over a lot of data and do a lot of processing per element which could all aggregate up to trip the timer.
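The advice about breaking a long loop into shorter tasks can be sketched roughly as follows. This is a minimal sketch, not code from the project: `run_chunked`, its `chunk` parameter and the demo data are hypothetical names. The scheduler is assumed to be `node.task.post` on the ESP, and the plain queue fallback is there only so the snippet also runs on a desktop Lua.

```lua
-- Hypothetical helper (not part of NodeMCU): process `items` in slices of
-- `chunk` elements and hand control back to the scheduler between slices.
local function run_chunked(items, per_item, chunk, schedule, on_done)
  local i, n = 1, #items
  local function step()
    local last = math.min(i + chunk - 1, n)
    for k = i, last do
      per_item(items[k], k)
    end
    i = last + 1
    if i <= n then
      schedule(step)          -- more work left: queue the next slice
    elseif on_done then
      on_done()               -- everything processed
    end
  end
  step()
end

-- On NodeMCU use node.task.post; elsewhere fall back to a plain queue
-- (an assumption made only so the sketch runs on a desktop Lua).
local queue = {}
local schedule
if node and node.task then
  schedule = function(f) node.task.post(f) end
else
  schedule = function(f) queue[#queue + 1] = f end
end

-- Demo: sum ten values, three per slice.
local values = {}
for v = 1, 10 do values[v] = v end
local sum, finished = 0, false
run_chunked(values, function(v) sum = sum + v end, 3, schedule,
            function() finished = true end)

-- Drain the fallback queue (on the ESP the SDK runs the posted tasks).
local scheduled = 0
while #queue > 0 do
  local f = table.remove(queue, 1)
  scheduled = scheduled + 1
  f()
end
```

Each slice is a separate task, so the SDK gets a chance to service WiFi between slices. The slice size would need tuning on the device so that one slice stays well below the recommended task length.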
The latest release has moved to SDK 3.0 and the graphics libraries have gone through a version upgrade.
@devsaurus Arnim, maybe we (and in this case you) should do some quick benchmarks to get an idea of how much slower SDK 3.0 is on graphics tests. I can do some on general tests. I've always thought that this is one area where coroutining might help simplify coding.
Thanks, Terry.
But I can't even call any GUI routine from my module :)
My project needs free RAM between GUI operations, so I compile all available Lua modules before starting and load them dynamically as I need them (once every 5 seconds or so).
I have code like the one below and can't get past the first line ( require('dispDraw') ): the WDT fires.
And this code runs fine from ESPlorer but not from the main program :(
local d = require('dispDraw')
if d then
d.ClearScreen()
end
d = nil
package.loaded['dispDraw'] = nil
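The load, use and unload pattern shown above could be wrapped in a small helper. This is only a sketch: `with_module` is a hypothetical name, and `demo_disp` is a stand-in registered through `package.preload` so the snippet runs without the real dispDraw module.

```lua
-- Hypothetical helper: load a module, use it once, then remove it from
-- package.loaded so that its code and tables become collectable.
local function with_module(name, fn)
  local ok, mod = pcall(require, name)
  local result
  if ok then
    ok, result = pcall(fn, mod)
  else
    result = mod                 -- the error message from require
  end
  package.loaded[name] = nil     -- forget the module even if fn failed
  return ok, result
end

-- Stand-in for dispDraw so the sketch runs without the real module.
local cleared = 0
package.preload['demo_disp'] = function()
  return { ClearScreen = function() cleared = cleared + 1 end }
end

local ok, err = with_module('demo_disp', function(d) d.ClearScreen() end)
print(ok, err, package.loaded['demo_disp'])
```

Wrapping both calls in pcall means the module is unloaded even when the callback raises an error. A watchdog reset is not a Lua error, so this helper cannot catch the failure described in this thread.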
Use LFS
Thanx!
@TerryE Terry, Egor is also stating that
`node.compile('dispDraw.lua')`
triggers the WDT.
So there is no graphics involved.
Using LFS sure will help but might also hide an underlying performance problem.
Maybe it is connected to string handling or something else.
I feel that some benchmarking might help here.
@HHHartmann, thank you for the reply!
It is exactly as you described.
I have extracted a small chunk from my project that illustrates my problem.
As I wrote above, the first task in my code is to compile all of the Lua files to optimize RAM usage.
This is done by the DoCompile.lua routine, which checks for any uncompiled files. It gets stuck on dispDraw.lua (the WDT fires) on fresh builds based on SDK 3.0, while it works correctly on old builds.
If it matters I can attach the whole project, but I think this chunk will be enough to show my problem.
Thanks in advance!
https://yadi.sk/d/MEC8SGQEmHYEdA
Here is the example. Sorry, I can't attach it from my browser.
OK Egor, I have tried your example on both the latest dev and the 7 Dec master releases. It fails to compile on both, with a WD timeout and E:M 520 not enough memory respectively. In order to get the module small enough to compile on 2.2.1, I cut the array initialisations for colors, info_panels, var_labels, var_limits and the function (now local) var_labels_text into a separate dispDrawSub.lua file where the last statement was
return colors, info_panels, var_labels, var_limits, var_labels_text
and in the main `dispDraw.lua` file, I replaced these by
```Lua
local colors, info_panels, var_labels, var_limits
colors, info_panels, var_labels, var_limits, var_labels_text = dofile 'dispDrawSub.lua'
```
Note that the `var_labels_text` is not a local so that this variable is stored in the environment becoming a module method. This could also be replaced by a compiled `lc` version.
These are all standard ESP memory optimisation techniques with no runtime overhead. In essence what this does is to take the bulk of the array initialisation code (which generates a lot of VM instructions that are only used once at startup) out into a separate initialisation function which is called and then GCed.
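The idea of running the initialisation code once and then discarding it can be illustrated with a minimal sketch. Here a source string stands in for `dispDrawSub.lua` so the snippet is self-contained; on the device the chunk would come from `dofile`. The `var_limits` values are made up for illustration.

```lua
-- Stand-in for the contents of dispDrawSub.lua (shortened; the var_limits
-- values are invented for this sketch).
local sub_source = [[
  local colors = { BLACK = {r=0, g=0, b=0}, WHITE = {r=255, g=255, b=255} }
  local var_limits = { temp = {min=-40, max=60} }
  return colors, var_limits
]]

-- loadstring exists in Lua 5.1 (NodeMCU); load accepts strings from 5.2 on.
local compile = loadstring or load

local init = assert(compile(sub_source))  -- compile the initialisation chunk
local colors, var_limits = init()         -- run it once, keep only the data
init = nil                                -- drop the last reference to the chunk
collectgarbage()                          -- its VM instructions can be reclaimed
```

After this runs, only the returned tables stay in RAM. The instructions that built them are no longer referenced, which is what keeps the main module small.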
OK so now the main module is small enough to compile on SDK 2.2.1 builds. I did my usual trick on both of adding a small LFS image with `_init`, `dummy_strings` and `ftpserver`; this last enables me to provision the ESP by drag and drop from my laptop. Anyway the results are:
```Lua
NodeMCU 2.2.0.0 build unspecified powered by Lua 5.1.4 on SDK 2.2.1(6ab97e9)
> t=tmr.now();node.compile'dispDraw.lua';return tmr.now()-t
496441
> =node.heap()
44168
```
and for SDK 3.0:
```Lua
NodeMCU 2.2.0.0 build unspecified powered by Lua 5.1.4 on SDK 3.0.0(d49923c)
> t=tmr.now();node.compile'dispDraw.lua';return tmr.now()-t
315536
> =node.heap()
58464
>
```
So when the module is arranged to be ESP compatible, the compile takes 316 mSec on an SDK 3.0 build and almost 50% longer at 496 mSec on SDK 2.2.1. (I suspect that the difference is as a result of the extra RAM headroom on the new build so the GC doesn't have to work so hard.)
These results just don't square with your original post.
As a footnote, I've tried doing multiple compiles on each and the times are all over the place. Not sure why but there is no systematic difference between the compile times on the two build versions
PS. It is writing to SPIFFS that takes the time, as the SPIFFS garbage collector is kicking in and this is _slow_. If you do a loadfile() instead of a node.compile() to compile into RAM then the run-times are very consistent: 158 mSec for 2.2.1 and 173 mSec for 3.0 with a std dev of under ½ mSec. So in this case running under 3.0 is about 10% slower, which is acceptable given the extra memory, I feel.
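A repeatable timing of a RAM-only compile could look roughly like the sketch below. It is an illustration under assumptions: `bench` is a hypothetical helper, `os.clock()` is used as a fallback clock so the snippet runs on a desktop Lua, and a short source string stands in for `loadfile('dispDraw.lua')`.

```lua
-- Microsecond clock: tmr.now() on NodeMCU, os.clock() as a desktop fallback.
local now_us
if tmr and tmr.now then
  now_us = tmr.now
else
  now_us = function() return math.floor(os.clock() * 1e6) end
end

-- Hypothetical helper: call fn `runs` times, return min, max and mean in us.
local function bench(fn, runs)
  local total, lo, hi = 0, nil, nil
  for _ = 1, runs do
    local t0 = now_us()
    fn()
    local dt = now_us() - t0
    total = total + dt
    if not lo or dt < lo then lo = dt end
    if not hi or dt > hi then hi = dt end
  end
  return lo, hi, total / runs
end

-- Compile into RAM only, so no SPIFFS write is included in the timing.
-- A source string stands in for loadfile('dispDraw.lua') in this sketch.
local compile = loadstring or load
local src = "local t = {} for i = 1, 100 do t[i] = i * 2 end return t"
local lo, hi, mean = bench(function() assert(compile(src)) end, 20)
print(lo, hi, mean)
```

Reporting the minimum and maximum alongside the mean makes outliers such as a SPIFFS garbage collection pass easy to spot when the same helper is pointed at `node.compile()`.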
I have used 2 builds to try my initial code:
https://yadi.sk/d/XcEEPV7umUz-7w - nodemcu-dev-19-modules-2019-03-10-18-18-49-integer - works fine
https://yadi.sk/d/ZHfDnWNucdWVNA - nodemcu-dev-19-modules-2019-05-03-04-13-28-integer - WDT fires
Thank you for your useful advice. I'll try it a little later and give feedback.
> If you do a loadfile() instead of a node.compile() to compile into RAM
I compile all of the modules once, not every time, and remove .lua files after compile (see doCompile.lua) and then use only .lc files (function DO(s) in init.lua).
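A compile-once pass of the kind described here might be sketched as follows. This is not the actual doCompile.lua: `compile_all` is a hypothetical name, and its three callbacks default to the NodeMCU calls `file.list()`, `node.compile()` and `file.remove()`. The demo passes stand-in callbacks so the snippet runs on a desktop Lua.

```lua
-- Hypothetical sketch of a compile-once pass. The three callbacks default to
-- the NodeMCU calls file.list(), node.compile() and file.remove().
local function compile_all(list, compile, remove)
  list    = list    or file.list
  compile = compile or node.compile
  remove  = remove  or file.remove
  local done, failed = {}, {}
  for name in pairs(list()) do
    -- init.lua is skipped on the assumption that the firmware boots from it.
    if name:match('%.lua$') and name ~= 'init.lua' then
      local ok, err = pcall(compile, name)
      if ok then
        remove(name)               -- keep only the .lc file
        done[#done + 1] = name
      else
        failed[name] = err         -- keep the source for a later retry
      end
    end
  end
  return done, failed
end

-- Desktop demo with stand-in callbacks (no NodeMCU modules required).
local fake_fs = { ['init.lua'] = 100, ['dispDraw.lua'] = 9000, ['a.lc'] = 50 }
local removed = {}
local done, failed = compile_all(
  function() return fake_fs end,
  function(name) end,
  function(name) removed[#removed + 1] = name end)
```

The pcall only catches Lua errors such as running out of memory. A watchdog reset during a compile restarts the chip and is not visible to this code.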
Update: anyway, I'll try to split the big files as you advised and give feedback later.
@TerryE, I have results, but they are still not good for me...
I have rewritten dispDraw and split it into 2 files (see attachment).
I have added some debug output to these 2 files.
I can now compile my project my way, which is good.
But if I try to do
local d = require('dispDraw')
I get this output:
dispDraw begins1
dispDrawSub begins1ets Jan 8 2013,rst cause:4, boot mode:(3,6)
wdt reset
load 0x40100000, len 28600, room 16
tail 8
chksum 0x22
load 0x3ffe8000, len 2592, room 0
tail 0
chksum 0x4a
load 0x3ffe8a20, len 8, room 8
tail 0
chksum 0x3b
csum 0x3b
So I can see that
local colors = {
BLACK = {r=0, g=0, b=0}
,WHITE = {r=255, g=255, b=255}
,RED = {r=255, g=0, b=0}
,RED1 = {r=240, g=0, b=0}
,GREEN = {r=0, g=255, b=0}
,DARKGREEN = {r=0, g=220, b=0}
,BLUE = {r=0, g=0, b=255}
,YELLOW = {r=255, g=255, b=0}
,YELLOW1 = {r=240, g=240, b=0}
,DARKGREY = {r=50, g=50, b=50}
,CYAN = {r=100, g=200, b=255}
,PANEL_BACK = {r=136, g=222, b=255}
,LABEL1 = {r=0, g=0, b=0}
,LABEL2 = {r=255, g=255, b=255}
}
can't be executed.
Help!
The loadfile comment was only to explain the variability in times (because of SPIFFS write delays) that I discussed above.
On your specific code, I like your general coding style, though I do think that using setfenv() to make M the environment can be problematic. In particular, you _must_ cache all base Lua functions in locals as you lose access to them; see PiL 15.4. I personally prefer just using explicit assignment such as function M.PrintValues(data) and using strict.lua to pick up accidentally created globals.
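The explicit-assignment style mentioned above might look like the sketch below. It is an illustration only: `make_module` stands in for the body of a module file, and `FormatValues` is a hypothetical function, not one from dispDraw.

```lua
-- Sketch of a module that uses explicit M.* assignment instead of setfenv().
-- make_module() stands in for the body of a file such as dispDraw.lua.
local function make_module()
  -- Cache the base functions the module needs in locals.
  local tostring, pairs, ipairs = tostring, pairs, ipairs
  local concat, sort = table.concat, table.sort

  local M = {}

  -- Format a table of readings as "key=value" pairs sorted by key.
  function M.FormatValues(data)
    local keys = {}
    for k in pairs(data) do keys[#keys + 1] = k end
    sort(keys)
    local out = {}
    for i, k in ipairs(keys) do
      out[i] = k .. '=' .. tostring(data[k])
    end
    return concat(out, ' ')
  end

  return M
end

local d = make_module()
local line = d.FormatValues({ temp = 21, hum = 40 })
```

Because the environment is never replaced, the base library stays reachable. The local caches are then only an optimisation rather than a requirement.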
One side effect of dofile() is that it uses the current thread's environment for the executed file rather than the calling function's, and hence global variables such as print are defined. Also note that your dispDrawSub.lua declaration of var_labels_text() should be a local function here. Not having it local here creates a copy in _G.
My problem in working with your code is that your issue is basically "my program doesn't work" rather than a Minimal, Complete, and Verifiable example as we ask for. (Please read this link.) It is hard to sort out the individual failures in our code from your own bugs. My #2751 is MCV.
Thank you for the advice!
Egor, you obviously have other code loaded during your build because you use DO() instead of dofile(), etc. At the moment I suggest using host-side luac.cross and LFS as a workaround. I want to focus on solving #2751 and this might also help you. Remember that the best way for us to prioritise your issue is to distill it down to an MCV test case. :smile:
Egor, I've just tried doing a node.compile('dispDraw.lua') and for the sub module with the SDK upgrade that I mentioned in #2751. I will also close this when we agree the PR for this.