Arduino: OTA update fails with XMC Flash chip

Created on 4 May 2020 · 18 Comments · Source: esp8266/Arduino

OTA update fails (tested with gzipped Tasmota firmware) when an XMC flash chip is on the ESP board.
Module Model: ESP-12F Vendor: DOITING
Flash Chip Id 0x164020
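
For readers who want to check their own board, here is a minimal sketch of how a chip ID like the one above can be split into its parts. It assumes the ID packs the three JEDEC bytes as capacity, memory type, vendor (vendor in the low byte), and that 0x20 is the vendor byte for XMC; both are assumptions to verify against spi_vendors.h in your core. The function name `decode_flash_id` is made up for this example.

```python
# Assumption: chip_id = (capacity_code << 16) | (memory_type << 8) | vendor
XMC_VENDOR_ID = 0x20  # assumed vendor byte for XMC; verify against your core


def decode_flash_id(chip_id: int) -> dict:
    """Split a 24-bit flash chip ID into vendor, type and size."""
    vendor = chip_id & 0xFF
    memory_type = (chip_id >> 8) & 0xFF
    capacity_code = (chip_id >> 16) & 0xFF
    return {
        "vendor": vendor,
        "type": memory_type,
        # The capacity byte is commonly log2 of the size in bytes.
        "size_bytes": 1 << capacity_code,
        "is_xmc": vendor == XMC_VENDOR_ID,
    }


info = decode_flash_id(0x164020)
print({k: hex(v) if isinstance(v, int) and not isinstance(v, bool) else v
       for k, v in info.items()})
```

Under these assumptions the ID 0x164020 decodes to vendor 0x20 and a 4 MiB part, which agrees with the 4MB flash size listed in the settings below.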

To get it working again, the board has to be reflashed over serial.

 ets Jan  8 2013,rst cause:2, boot mode:(3,6)

load 0x4010f000, len 3656, room 16

tail 8

chksum 0x0c

csum 0x0c

v9c56ed1f

Basic Infos

  • [x] This issue complies with the issue POLICY doc.
  • [x] I have read the documentation at readthedocs and the issue is not addressed there.
  • [x] I have tested that the issue is present in current master branch (aka latest git).
  • [x] I have searched the issue tracker for a similar issue.
  • [x] If there is a stack dump, I have decoded it.
  • [x] I have filled out all fields below.

Platform

  • Hardware: [ESP-12F]
  • Core Version: [2.7.0]
  • Development Env: [Platformio]
  • Operating System: [Windows|Ubuntu]

Settings in IDE

  • Module: [Generic ESP8266]
  • Flash Mode: [DOUT]
  • Flash Size: [4MB]
  • lwip Variant: [LWIP2_HIGHER_BANDWIDTH_LOW_FLASH]
  • Reset Method: [nodemcu]
  • Flash Frequency: [40Mhz]
  • CPU Frequency: [80Mhz]
  • Upload Using: [OTA]
  • Upload Speed: [115200] (serial upload only)

Problem Description

Flashing via serial does work and device works as expected.
Trying an OTA update results in this:

Debug Messages

 ets Jan  8 2013,rst cause:2, boot mode:(3,7)

load 0x00000000, len 201326592, room 16
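
A quick sanity check on that line, as a rough sketch: parse the address and length out of a boot ROM "load" line and compare the length against the flash size. The helper name `parse_load_line` is made up for this example, and the 4MB limit is taken from the settings listed above.

```python
import re

FLASH_SIZE_BYTES = 4 * 1024 * 1024  # 4MB, from the settings in this report


def parse_load_line(line: str):
    """Return (address, length) from a boot ROM 'load' line, or None."""
    match = re.search(r"load (0x[0-9a-fA-F]+), len (\d+)", line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2))


good = parse_load_line("load 0x4010f000, len 3656, room 16")
bad = parse_load_line("load 0x00000000, len 201326592, room 16")

for addr, length in (good, bad):
    plausible = length <= FLASH_SIZE_BYTES
    print(f"addr={addr:#010x} len={length:#x} plausible={plausible}")
```

The failing line asks for 201326592 bytes (0x0C000000), far more than a 4MB chip holds, so the boot ROM appears to be reading something other than a valid image header.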


core bug

All 18 comments

Did some tests: #6725 causes the issue. Using an eboot.elf version from before this commit solves the problem.

Thanks @Jason2866 for the bisect!
@ChocolateFrogsNuts Does it ring a bell ?

Same problem for me on OTA only

Start OK

21:38:58.882 -> ets Jan 8 2013,rst cause:2, boot mode:(3,6)
21:38:58.882 ->
21:38:58.882 -> load 0x4010f000, len 3656, room 16
21:38:58.882 -> tail 8
21:38:58.882 -> chksum 0x0c
21:38:58.882 -> csum 0x0c
21:38:58.882 -> v9c56ed1f
21:38:58.882 -> ~ld

Doesn't start after update

21:33:21.235 -> ets Jan 8 2013,rst cause:2, boot mode:(3,6)
21:33:21.235 ->
21:33:21.235 -> load 0x4010f000, len 3656, room 16
21:33:21.235 -> tail 8
21:33:21.235 -> chksum 0x0c
21:33:21.235 -> csum 0x0c
21:33:21.235 -> v9c56ed1f
21:33:21.235 -> @cp:0
21:33:25.407 -> ld
21:33:25.407 -> e:
21:33:25.407 -> ets Jan 8 2013,rst cause:3, boot mode:(3,6)
21:33:25.407 ->
21:33:25.407 -> ets_main.c n l o{⸮⸮⸮g⸮s⸮⸮o ⸮⸮쒌⸮⸮⸮⸮d⸮⸮o⸮2

On 2.7.0 I can upgrade firmware only over the serial COM port.

On version 2.6.3, all is ok, serial and OTA

Hmm, ok, testing of OTA updates was, shall we say, "severely limited".
Easiest way to verify if it's the XMC bootloader code is to comment out the #define XMC_SUPPORT at line 193 of eboot.c and see what that does.

There are two factors at play here:
The implementation of spi_flash_get_id() in eboot may not work (i.e. I'm pretty sure it's dodgy), so the XMC chip may go undetected. That should only be a factor if power is cycled when an update is not complete, or if you do an OTA upgrade from a version of the core that has no XMC support at all, because the XMC chip won't be set to full output drive.
Failing to detect the XMC chip should make no difference to how it operated on the previous version, so I think it's something else happening.

OR it is working and the code that tries to slow down the access is messing up.

Either way, trying a build with XMC support disabled in the bootloader will give some insight.

To whoever will test, the previous means:

  • modify eboot.c as explained
  • rebuild eboot.elf
  • rebuild your test sketch

Using an eboot.elf from before the XMC PR works for my module with XMC flash (OTA with gzip and uncompressed too).
I just replaced the eboot.elf file. So IMHO the PR should be reverted as a hotfix.
For building Tasmota we are using just this change: https://github.com/arendst/Tasmota/pull/8342

@ChocolateFrogsNuts The OTA fails only if the XMC eboot.elf is used.
This is not the case:

"if you do an OTA upgrade from a version of the core that has no XMC support at all, because the XMC chip won't be set to full output drive."

I am not using Linux, so I can't use the provided Makefile to build eboot.elf...
If you provide a modified version i will test.

All of my D1 Mini boards have XMC flash chips, if you need another test bunny. I'm not on Linux, either. :smile:

I know I can OTA to the IP address directly, but I haven't been able to OTA via the GUI for some time; Bonjour Browser doesn't see it, nor does Service Browser on the Android phone.

I'm pretty sure I updated to 2.7.0 dev on Saturday.

edit: yep, I'm even with master, and it dies quietly after the OTA completes when going from the command line direct to the IP address with espota.py. It uploads, but the code doesn't run after the reset.

Tested. XMC Hotfix solves issue. OTA works again

PR #7277 didn't fix it for me, even after doing "Erase all flash contents" with a fresh compile via serial upload with the new binary. I can still upload via serial, but OTA to the IP address hangs quietly on boot afterwards, and all I get is

ets Jan 8 2013,rst cause:2, boot mode:(3,0)
load 0x00000000, len 201326592, room 16

I'm purely guessing that doing git fetch upstream and git rebase upstream/master should have pulled the replacement eboot.elf file from #7277 after it was merged. It has a time stamp from 30 minutes ago, so it ought to be the right file. Here's the MD5 I got on the file from the merged PR:
d8708dea12bce1eb9ddf058020699fae *./eboot.elf
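
For anyone on Windows without an md5sum tool handy, a minimal sketch for checking a local eboot.elf against the hash quoted above. The helper names `md5_of_file` and `matches_expected` and the file path in the usage note are made up for this example.

```python
import hashlib

# MD5 quoted in the comment above for the eboot.elf from the merged PR.
EXPECTED_MD5 = "d8708dea12bce1eb9ddf058020699fae"


def md5_of_file(path: str) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_MD5) -> bool:
    """True if the file's MD5 equals the expected hex digest."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_expected("eboot.elf")` from the directory that holds the file (the path is an assumption about your checkout) tells you whether you actually picked up the same binary.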

If I have to delete my local files and fork and start from scratch, I'm OK with that. ;-) Right now it doesn't appear like it'll help. I copied an older eboot.elf from my desktop PC with a date of 4/18/2020 on top of the one on my laptop and it uploads and works via OTA now.

edit: Never mind. After deleting everything and re-forking I was getting the older file from a different PR once I switched to it.

Giving this issue some CPU cycles now... I can replicate it, so I hope to be working on a fix over the next few days.

Ok, so it appears that examining the flash speed registers to get appropriate values to use in eboot by printing them from within the sketch at various flash speed settings was the wrong approach - I needed to have eboot print them.
If I had done this initially, I would have realised that eboot runs at 20 MHz flash speed every time: the registers never change until some time later during SDK init.
There are also some differences in the register values since last time I was looking at this... so that is probably why things are crashing.

Anyway, the fact that eboot already runs at 20 MHz flash access, no matter what speed is selected once the SDK/core starts, means there is no need to have XMC support in eboot at all! The XMC chips only need special treatment (boosting the drive level) at 40 MHz and above, which never happens in eboot.

I'll do a bit more testing and try breaking things with a power cycle during the eboot copy, but I can't see why that would be a problem now.

I can also move spi_vendors.h back out of eboot into the core too.
Should get that pushed in the next day or so.

I added a couple of 2 second delays to eboot so I could pull the USB cable at a known point.
Power fail testing results:

  • Power failure in eboot before the copy begins: boots with old firmware.
  • Power failure after the copy completes but before command_clear(): boots with new firmware.
  • Power failure during copy: board consistently fails to boot with assorted errors/resets.

It would seem the RTC is losing the copy command when the power fails, thus resulting in no attempt to re-start the failed copy at power up!
This could be an issue for all boards, not just those with XMC flash... I will investigate further when I track down one of my non-XMC boards...

RTC is a SRAM, not an NVRAM, so that's expected, @ChocolateFrogsNuts. See #6538
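
To make the three power-fail results above easier to follow, here is a toy model, not the real eboot code. It assumes the copy command lives only in volatile RTC memory, that power loss wipes it, and that a cold boot without a command simply runs whatever is in the application area. All names (`simulate_update`, the block lists, the failure points) are invented for the illustration.

```python
OLD, NEW = "old", "new"


def boot_state(app):
    """Describe what a board with this application area would do."""
    if all(block == OLD for block in app):
        return "old firmware"
    if all(block == NEW for block in app):
        return "new firmware"
    return "boot failure"  # mixed old/new blocks


def simulate_update(power_fail_at=None, blocks=4):
    """power_fail_at: None, 'before_copy', 'during_copy' or 'after_copy'."""
    app = [OLD] * blocks          # application area in flash (non-volatile)
    staged = [NEW] * blocks       # staged OTA image in flash (non-volatile)
    rtc = {"copy_cmd": True}      # copy command in RTC memory (volatile)

    def eboot(fail_at):
        if not rtc.get("copy_cmd"):
            return                # no command: nothing to copy
        if fail_at == "before_copy":
            rtc.clear()           # power lost, RTC contents gone
            return
        for i in range(blocks):
            if fail_at == "during_copy" and i == blocks // 2:
                rtc.clear()
                return
            app[i] = staged[i]
        if fail_at == "after_copy":
            rtc.clear()
            return
        rtc["copy_cmd"] = False   # stands in for command_clear()

    eboot(power_fail_at)          # first boot after OTA, possibly interrupted
    if power_fail_at is not None:
        eboot(None)               # cold boot: command is gone, no retry
    return boot_state(app)


for point in (None, "before_copy", "after_copy", "during_copy"):
    print(point, "->", simulate_update(point))
```

In this model the interrupted copy is never resumed because the only record of it was in RTC memory, which matches the behaviour reported in the test results.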

ahh excellent... I can ignore that and just temporarily hack eboot to always copy while I test that the copy works ok from a cold boot.

@ChocolateFrogsNuts #7307 has some additional info.

PR #7317 should wrap this up.

@ChocolateFrogsNuts I tried PR #7317 and OTA is working with it.

The relevant PRs are merged, so closing as already resolved.
