Esp-idf: Crash on OTA Update (IDFGH-1848)

Created on 14 Jan 2017  Â·  22Comments  Â·  Source: espressif/esp-idf

I'm trying to use the 25_ota example. Unfortunately the applications crashes right after making the http request to the webserver.

So here is the last uart output from the esp32:

I (7404) ota: Connect to Wifi ! Start to Connect to Server....
I (7404) ota: Server IP: 10.0.255.105 Server Port:8070
I (7414) ota: Connected to server
I (7414) ota: Connected to http server
I (7414) ota: Send GET request to server succeeded
I (7424) ota: esp_ota_begin init OK
I (7424) ota: OTA Init succeeded
I (7424) ota: OTA Write Header format Check OK. first byte is e9 ,second byte is 09
abort() was called at PC 0x400d2476
Guru Meditation Error: Core  1 panic'ed (abort)

Backtrace: 0x40084311:0x3ffce040 0x40083f94:0x3ffce060 0x40082451:0x3ffce080 0x40081cdc:0x3ffce0a0 0x4010235e:0x3ffce0f0 0x401070b6:0x3ffce110 0x40105e8e:0x3ffce130 0x401062b4:0x3ffce160

And here is the core dump:

python2 ~/esp/esp-idf/components/espcoredump/espcoredump.py -p /dev/ttyUSB0 --chip esp32 info_corefile  build/ota.elf
espcoredump.py v0.1-dev
esptool.py v2.0-dev
Connecting...
Uploading stub...
Running stub...
Stub running...
16 (100 %)
16 (100 %)
Read 16 bytes at 0x110000 in 0.0 seconds (14.2 kbit/s)...
Hard resetting...

esptool.py v2.0-dev
Connecting...
Uploading stub...
Running stub...
Stub running...
7396 (100 %)
7396 (100 %)
Read 7396 bytes at 0x110000 in 0.7 seconds (86.5 kbit/s)...
Hard resetting...

===============================================================
==================== ESP32 CORE DUMP START ====================

================== CURRENT THREAD REGISTERS ===================
pc             0x40084311   0x40084311 <xTaskResumeAll+393>
lbeg           0x0  0
lend           0x0  0
lcount         0x0  0
sar            0x5  5
ps             0x60d26  396582
threadptr      <unavailable>
br             <unavailable>
scompare1      <unavailable>
acclo          <unavailable>
acchi          <unavailable>
m0             <unavailable>
m1             <unavailable>
m2             <unavailable>
m3             <unavailable>
expstate       <unavailable>
f64r_lo        <unavailable>
f64r_hi        <unavailable>
f64s           <unavailable>
fcr            <unavailable>
fsr            <unavailable>
a0             0x40083f94   1074282388
a1             0x3ffcdff0   1073537008
a2             0x0  0
a3             0x0  0
a4             0x3ffb9088   1073451144
a5             0x0  0
a6             0x0  0
a7             0x1  1
a8             0x3ffb2454   1073423444
a9             0x1  1
a10            0x24 36
a11            0x24 36
a12            0x8  8
a13            0xffffffff   -1
a14            0x0  0
a15            0xffffffff   -1

==================== CURRENT THREAD STACK =====================
#0  xTaskResumeAll () at /home/torben/esp/esp-idf/components/freertos/./tasks.c:2204
#1  0x40083f94 in prvIdleTask (pvParameters=0x1) at /home/torben/esp/esp-idf/components/freertos/./tasks.c:3305
#2  0x40082451 in spi_flash_enable_interrupts_caches_and_other_cpu () at /home/torben/esp/esp-idf/components/spi_flash/./cache_utils.c:141
#3  0x40081d1d in spi_flash_guard_end () at /home/torben/esp/esp-idf/components/spi_flash/./flash_ops.c:115
#4  spi_flash_write (dst=1180485, srcv=0x3ffb6318 <ota_write_data>, size=<optimized out>) at /home/torben/esp/esp-idf/components/spi_flash/./flash_ops.c:228
#5  0x4010235e in esp_partition_write (partition=0x3ffb9fb4, dst_offset=<optimized out>, src=0x3ffb6318 <ota_write_data>, size=1024) at /home/torben/esp/esp-idf/components/spi_flash/./partition.c:247
#6  0x401070b6 in esp_ota_write (handle=<optimized out>, data=0x3ffb6318 <ota_write_data>, size=1024) at /home/torben/esp/esp-idf/components/app_update/./esp_ota_ops.c:122
#7  0x401062d9 in main_task (pvParameter=<optimized out>) at /home/torben/esp/esp-idf/examples/25_ota/main/./ota.c:283

======================== THREADS INFO =========================
  Id   Target Id         Frame 
  11   process 10        0x40082dfb in xQueueGenericReceive (xQueue=0x3ffb34c0, pvBuffer=0x3f403fd4, xTicksToWait=<unavailable>, xJustPeeking=0) at /home/torben/esp/esp-idf/components/freertos/./queue.c:1594
  10   process 9         0x40082dfb in xQueueGenericReceive (xQueue=0x3ffb195c <g_intr_lock_mux>, pvBuffer=0x3ffb7228 <xphyQueue>, xTicksToWait=<unavailable>, xJustPeeking=0) at /home/torben/esp/esp-idf/components/freertos/./queue.c:1594
  9    process 8         0x40082dfb in xQueueGenericReceive (xQueue=0x1, pvBuffer=0x400822f8 <spi_flash_op_block_func>, xTicksToWait=<unavailable>, xJustPeeking=0) at /home/torben/esp/esp-idf/components/freertos/./queue.c:1594
  8    process 7         0x40082dfb in xQueueGenericReceive (xQueue=0x0, pvBuffer=0x400822f8 <spi_flash_op_block_func>, xTicksToWait=<unavailable>, xJustPeeking=0) at /home/torben/esp/esp-idf/components/freertos/./queue.c:1594
  7    process 6         0x40082dfb in xQueueGenericReceive (xQueue=0x0, pvBuffer=0x0, xTicksToWait=<unavailable>, xJustPeeking=0) at /home/torben/esp/esp-idf/components/freertos/./queue.c:1594
  6    process 5         0x40082dfb in xQueueGenericReceive (xQueue=0x3ffb682c, pvBuffer=0x3ffb6789, xTicksToWait=<unavailable>, xJustPeeking=0) at /home/torben/esp/esp-idf/components/freertos/./queue.c:1594
  5    process 4         0x40084c8e in prvProcessTimerOrBlockTask (xNextExpireTime=<optimized out>, xListWasEmpty=<optimized out>) at /home/torben/esp/esp-idf/components/freertos/./timers.c:487
  4    process 3         0x40082dfb in xQueueGenericReceive (xQueue=0x3ffb567c <mbox>, pvBuffer=0x3ffbc530, xTicksToWait=2148464477, xJustPeeking=0) at /home/torben/esp/esp-idf/components/freertos/./queue.c:1594
  3    process 2         0x400d1dc8 in esp_vApplicationIdleHook () at /home/torben/esp/esp-idf/components/esp32/./freertos_hooks.c:52
  2    process 1         0x400d1dc8 in esp_vApplicationIdleHook () at /home/torben/esp/esp-idf/components/esp32/./freertos_hooks.c:52
* 1    <main task>       xTaskResumeAll () at /home/torben/esp/esp-idf/components/freertos/./tasks.c:2204

======================= ALL MEMORY REGIONS ========================
Name   Address   Size   Attrs
.rtc.text 0x400c0000 0x0 RW  
.iram0.vectors 0x40080000 0x400 R XA
.iram0.text 0x40080400 0x1c644 R XA
.dram0.data 0x3ffb0000 0x2428 RW A
.flash.rodata 0x3f400010 0x9450 RW A
.flash.text 0x400d0018 0x3bc64 R XA
.coredump.tasks 0x3ffce174 0x164 RW 
.coredump.tasks 0x3ffcdf30 0x238 RW 
.coredump.tasks 0x3ffbacc8 0x164 RW 
.coredump.tasks 0x3ffb9900 0x168 RW 
.coredump.tasks 0x3ffbb37c 0x164 RW 
.coredump.tasks 0x3ffbb210 0x160 RW 
.coredump.tasks 0x3ffbc5d4 0x164 RW 
.coredump.tasks 0x3ffbc480 0x148 RW 
.coredump.tasks 0x3ffbbb24 0x164 RW 
.coredump.tasks 0x3ffbba30 0xe8 RW 
.coredump.tasks 0x3ffbe15c 0x164 RW 
.coredump.tasks 0x3ffbe060 0xf0 RW 
.coredump.tasks 0x3ffbd678 0x164 RW 
.coredump.tasks 0x3ffbd550 0x11c RW 
.coredump.tasks 0x3ffb87cc 0x164 RW 
.coredump.tasks 0x3ffb86e0 0xe0 RW 
.coredump.tasks 0x3ffb8ed4 0x164 RW 
.coredump.tasks 0x3ffb8de0 0xe8 RW 
.coredump.tasks 0x3ffcb04c 0x164 RW 
.coredump.tasks 0x3ffcaf40 0x100 RW 
.coredump.tasks 0x3ffcbaa8 0x164 RW 
.coredump.tasks 0x3ffcb9a0 0xfc RW 

===================== ESP32 CORE DUMP END =====================
===============================================================
Done!

To me it seems like there is a problem trying to write to the flash.

Most helpful comment

@chegewara and @Scalpel78:

For errors like this:

E (5554) esp_ota_ops: OTA image has invalid magic byte (expected 0xE9, saw 0x3c

0x3c is the ASCII code for <, and suggests that the response payload is HTML not an app binary file. This usually happens if there is a HTTP server error.

Unfortunately the OTA example is only a poor imitation of an HTTP client which sends a very basic HTTP request and doesn't check for an error in the result. We are working on an HTTP/HTTPS client library which will provide a much more useful (and safe) abstraction for HTTP connections.

Also, @chegewara has determined that the OTA example currently doesn't send a valid HTTP/1.1 request at all. This will be updated to a valid HTTP/1.0 request, same as in the HTTP & HTTPS examples.

To diagnose problems with the OTA example:

  • Look at the server logs for error responses to the HTTP request.
  • Log the header response after the request and look for error messages, etc.

The proper HTTP/HTTPS client library should be available soon, and this example will be updated to use it.

All 22 comments

I have a similar problem but on something else.

233

I had the same issue but helped chosing [*] Run FreeRTOS only on first core option in the project config.
In addition, I must modify the checking header code in resolve_pkg function because my hello_world.bin file starts with 0xE9 0x08 instead of 0xE9 0x09.

if (ota_write_data[0] == 0xE9 && i_write_len >= 2 && (ota_write_data[1] == 0x08 || ota_write_data[1] == 0x09)) {
    ESP_LOGI(TAG, "OTA Write Header format Check OK. first byte is %02x ,second byte is %02x", ota_write_data[0], ota_write_data[1]);
} else {
    ESP_LOGE(TAG, "OTA Write Header format Check Failed! first byte is %02x ,second byte is %02x", ota_write_data[0], ota_write_data[1]);
    return false;
}

@konrad282 you are right! there is a mini-error in header check ,we should not check second bytes which present that segment counts.

@Geend @michal037 as @konrad282 mentioned, we should change FreeRTOS work on first core at make menuconfig,while there are some control-logic errors when working on dual core.

So this is a bug? And it will be fixed?

I am trying to use esp_ota_begin from arduino core ... i guess it won't work there as i can't force FreeRTOS to run on first core only ? or is there a way ? (sorry i know its not the right forum, but it is somehow related, and fixing the issue with running this on multiple cores should solve my problem as well)

Hi, this problem should have been solved now, isn't it?

yup, seems to work well with the latest esp-idf

I'm still getting error with the latest esp-idf.

It fails with the error:
I (4984) ota: Connected to http server
I (4984) ota: Send GET request to server succeeded
I (4994) ota: Writing to partition subtype 16 at offset 0x110000
I (5554) ota: esp_ota_begin succeeded
E (5554) esp_ota_ops: OTA image has invalid magic byte (expected 0xE9, saw 0x3c
E (5554) ota: Error: esp_ota_write failed! err=0x1503
I (5554) ota: Connection closed, all packets received
I (5564) ota: Total Write binary data length : 0
E (5564) ota: esp_ota_end failed!
E (5574) ota: Exiting task due to fatal error...

Error seems to originate at https://github.com/espressif/esp-idf/blob/9050307dfe600c915adaccf8076220d6474876db/components/app_update/esp_ota_ops.c#L148

Are you in the build directory or the root directory? I got this error if I wasn't in the build directory.
cd /path/to/project/build python -m SimpleHTTPServer 8070

@projectgus Hi
Im having the same error (https://github.com/espressif/esp-idf/issues/231#issuecomment-300287523) when im trying to update from external server, from internet. File is accessible and is proper compiled cause from lan app can read it and write to ota partition (veeeery slow but write it).

EDIT: not very slow, it just stop after 11KB write, and ota image is hello-world example. It stall for 5 minuts now, after last log line:

I (4569) ota: esp_ota_begin succeeded
I (4570) ota: esp_ota_write header OK
I (4570) wifi: pm start, type:0

I (4571) ota: Have written image length 1250
I (6743) ota: Have written image length 2274
I (6747) ota: Have written image length 2686
I (6758) ota: Have written image length 3710
I (6760) ota: Have written image length 4122
I (56780) ota: Have written image length 5146
I (56781) ota: Have written image length 5541
I (56792) ota: Have written image length 6565
I (56792) ota: Have written image length 6977
I (110138) ota: Have written image length 8001
I (110139) ota: Have written image length 8413
I (110148) ota: Have written image length 9437
I (110151) ota: Have written image length 10461
I (110152) ota: Have written image length 11285

@chegewara and @Scalpel78:

For errors like this:

E (5554) esp_ota_ops: OTA image has invalid magic byte (expected 0xE9, saw 0x3c

0x3c is the ASCII code for <, and suggests that the response payload is HTML not an app binary file. This usually happens if there is a HTTP server error.

Unfortunately the OTA example is only a poor imitation of an HTTP client which sends a very basic HTTP request and doesn't check for an error in the result. We are working on an HTTP/HTTPS client library which will provide a much more useful (and safe) abstraction for HTTP connections.

Also, @chegewara has determined that the OTA example currently doesn't send a valid HTTP/1.1 request at all. This will be updated to a valid HTTP/1.0 request, same as in the HTTP & HTTPS examples.

To diagnose problems with the OTA example:

  • Look at the server logs for error responses to the HTTP request.
  • Log the header response after the request and look for error messages, etc.

The proper HTTP/HTTPS client library should be available soon, and this example will be updated to use it.

Just found the issue in my ota app. I have set ota file path this way and it works from linux ubuntu 14 server from vbox on local machine (laptop), but it causes mentioned error if i try to ota flash from home server.
#define EXAMPLE_FILENAME "ota.bin"
But if i change file path to this one then i can ota flash from home server:
#define EXAMPLE_FILENAME "/ota.bin"

I have the same problem.
I do not have much knowledge about HTTP connections.

These are my tests:

From Internet :
No problem - Works fine OTA OK
downloads correctly the image.bin and upgrade the firmware

From Android:
SimpleHTTpServer _- ubi.jp -_ Works fine OTA OK
HFS HTTP File Server 1.0 _- vortexinfinitum -_ Works fine OTA OK
Http Server _– streamingnology inc -_ Not works OTA error
Tiny Web Server _– Leonardo Javier Russo -_ Not works OTA error
HTTP Server powered by Apache _- Tautvydas Andrikys -_ Not works OTA error

IOS:
AirDisk _-XUEHUI wu -_ Not works OTA error
Files – File Manager & Browser _– Hian Zin Jong -_ Works fine OTA OK


Now I would like to build a custom app to perform OTA, but I would like to know what is the reason why a server works and another server does not.

Hi francescofcf,

Apache2 default conf. is keepalive=On.
It could cause trouble.
Simply turn it off on the server. (It has solved my issues with OTA.)
I hope this info could help you.

@projectgus

Just checking if ESP-IDF v3.3 implements an "error check" in the result of an HTTP request.
An OTA update sometimes doesn't get all the binary data length.

If it does implement an "error check" I will do a more robust debug on why sometimes it does downloads full bin, and why sometimes it doesn't.

If it doesn't yet, I will be implementing a data length check before, and retry (by MQTT 1. Run OTA 2. URL 3. bin_size) as a work around.

Thanks.

Unfortunately the OTA example is only a poor imitation of an HTTP client which sends a very basic HTTP request and doesn't check for an error in the result. We are working on an HTTP/HTTPS client library which will provide a much more useful (and safe) abstraction for HTTP connections.

The proper HTTP/HTTPS client library should be available soon, and this example will be updated to use it.

@javicordon Check native_ota_example, its pretty nice and easy to use component:
https://github.com/espressif/esp-idf/tree/master/examples/system/ota/native_ota_example

@chegewara Indeed, it is the one I am currently using.
The example works fine as described, local and over Internet Server. My question is orientated to the HTTP request used for OTA. (i.e. It doesn't seem to be persistent in case of Response Timeout or Keep-alive Timeout, or error check.)

When you get a bin from a web server esp_http_client_fetch_headers(client) returns the size of the bin file, or -1 if any error. If there is a keep-alive timeout as it was happening to me (in the middle of reading data from http stream), then connection is closed anyway and continues with the OTA. It doesn't verifies if the data data read is the same size as returned from the http header.

Of course it shows the errors E (19325) esp-tls: read error :-76: E (19325) TRANS_SSL: esp_tls_conn_read error, errno=No more processes followed by E (26301) esp_image: invalid segment length 0xffffffff E (26301) native_ota_example: esp_ota_end failed! E (26301) native_ota_example: Exiting task due to fatal error...

Knowing this may happen, it can be improved and get a safer version of the library.

We are working on an HTTP/HTTPS client library which will provide a much more useful (and safe) abstraction for HTTP connections.

As for now, I modified the Keep-alive parameters on the server, and double checking the OTA to be written ("downloaded") has the same size as reported in http header.

Thanks for the answer and suggestion tho.

Then maybe its better to open new issue as this was created before native ota has been implemented and is about old http ota update.

@javicordon Yes, I agree that checking the binary length matches the HTTP header is a desirable feature for the HTTP client library (which means it can be used by OTA library and other consumers of the HTTP client library).

Please open this as a feature request in a new issue, as @chegewara suggests, so we can keep track.

@projectgus FYI in case this helps anyone. I was having this same issue:

E (5554) esp_ota_ops: OTA image has invalid magic byte (expected 0xE9, saw 0x3c

with stable version v3.3 while trying to download a firmware binary stored in a public github repo. Everything worked fine with the latest version (v4 developmental).

The problem I had while using v3.3 was that github was redirecting me from the location I was accessing the file in my browser when I right-clicked the download button and copied the download linkhttps://github.com/<username>/<publicrepo>/raw/master/<filename>.bin to another url to download the binary file https://raw.githubusercontent.com/<username><publicrepo>/master/<filename>.bin

I found the location I was being redirected to by opening command prompt, and typing
curl -v <https://github.com/<username>/<publicrepo>/raw/master/<filename>.bin>

The last line of the response was

<html><body>You are being <a href="https://raw.githubusercontent.com/<username>/<publicrepo>/master/<filename>.bin">redirected</a>.</body></html>* Connection #0 to host github.com left intact

Hence the 0x3c (ascii '<') in the html response. I entered this new url into my program, but got this error

E (6804) TRANS_SSL: mbedtls_ssl_handshake returned -0x2700

It turns out that raw.githubusercontent.com has a different certificate than github.com. I generated the new .pem certificate file for raw.githubusercontent.com using fleaplus's method from step 3 of this readme. https://github.com/RadialDevGroup/esp32-ota-https-example

Everything works fine now.

Posting this in here in case this helps anyone trying to do something similar.

Wrong .bin file permissions on a Linux machine running Apache web server can cause this error on an OTA update. Check your Apache web server error.log file to see if this is being caused by a permissions error. I needed to give read permission for the .bin file and it solved this error.

Was this page helpful?
0 / 5 - 0 ratings