Esp-idf: [TW#27501] assertion "res == coreID || res == portMUX_FREE_VAL" failed: *.* function: vPortCPUAcquireMutexIntsDisabledInternal

Created on 20 Nov 2018 · 6 Comments · Source: espressif/esp-idf

Hi,

I am working on a project that integrates the Wi-Fi manager (https://github.com/tonyp7/esp32-wifi-manager), the frequency counter (https://github.com/DavidAntliff/esp32-freqcount), and the MQTT (TCP) example from the latest release.

The frequency counter task and the overall project event handler task are pinned to core 1.

After make flash, all code runs fine, and I am able to subscribe to a topic and publish the first time.

Cyclic logic in the MQTT event handler: a publish should happen only when a message on a previously subscribed topic is received.

Issue: the moment a message on a subscribed topic is received, the system crashes with the error below.

assertion "res == coreID || res == portMUX_FREE_VAL" failed: file "/home/acer/esp/esp-idf/components/freertos/portmux_impl.inc.h", line 105, function: vPortCPUAcquireMutexIntsDisabledInternal
abort() was called at PC 0x400d64e3 on core 0
0x400d64e3: __assert_func at /Users/ivan/e/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/stdlib/../../../.././newlib/libc/stdlib/assert.c:63 (discriminator 8)

Backtrace: 0x40090588:0x3ffce650 0x400907b5:0x3ffce670 0x400d64e3:0x3ffce690 0x4008de4b:0x3ffce6c0 0x4008f7ae:0x3ffce6e0 0x400d3905:0x3ffce700 0x400d51d1:0x3ffce720 0x400d542a:0x3ffce740 0x400d55c6:0x3ffce780 0x400d5778:0x3ffce7b0 0x4008d56d:0x3ffce7d0
0x40090588: invoke_abort at /home/acer/esp/esp-idf/components/esp32/panic.c:680

0x400907b5: abort at /home/acer/esp/esp-idf/components/esp32/panic.c:680

0x400d64e3: __assert_func at /Users/ivan/e/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/stdlib/../../../.././newlib/libc/stdlib/assert.c:63 (discriminator 8)

0x4008de4b: vPortCPUAcquireMutexIntsDisabledInternal at /home/acer/esp/esp-idf/components/freertos/tasks.c:3564
(inlined by) vPortCPUAcquireMutexIntsDisabled at /home/acer/esp/esp-idf/components/freertos/portmux_impl.h:98
(inlined by) vTaskEnterCritical at /home/acer/esp/esp-idf/components/freertos/tasks.c:4258

0x4008f7ae: xEventGroupClearBits at /home/acer/esp/esp-idf/components/freertos/event_groups.c:338

0x400d3905: mqtt_event_handler at /home/acer/esp/esp-idf./main.c:224

0x400d51d1: esp_mqtt_dispatch_event at /home/acer/esp/esp-idf/components/mqtt/esp-mqtt/mqtt_client.c:856

0x400d542a: deliver_publish at /home/acer/esp/esp-idf/components/mqtt/esp-mqtt/mqtt_client.c:856

0x400d55c6: mqtt_process_receive at /home/acer/esp/esp-idf/components/mqtt/esp-mqtt/mqtt_client.c:856

0x400d5778: esp_mqtt_task at /home/acer/esp/esp-idf/components/mqtt/esp-mqtt/mqtt_client.c:856

0x4008d56d: vPortTaskWrapper at /home/acer/esp/esp-idf/components/freertos/port.c:403

Please suggest how to solve this problem.

**Project environment:**

IDF release 3.2
xtensa-esp32-elf-gcc (crosstool-NG crosstool-ng-1.22.0-80-g6c4433a) 5.2.0
HW: WROOM32

All 6 comments

hi @NewStackLearner

What exactly does the 'cyclic logic' in event handler mean? Do you publish to the same topic as subscribed? This scenario doesn't make much sense, but as far as I can see it works as expected and keeps publishing. Could you post a snippet here please?

The assertion indicates either corrupted memory or an uninitialized mutex. From the attached log, it looks like the crash happened while clearing the event flags, after the event had propagated from the MQTT module to the user handler. Is the event flag initialized?
I have no other idea than a probable memory issue. Please check your heap size and stack size. Are you able to see the failure without the two projects you mention as well? Can you try to comment out their creation to see if the error still persists?

Thanks
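
For readers wondering why this assertion points at corruption or a missing initialization, here is a small host-side model of the check. It is only a sketch: the constants are the values as recalled from ESP-IDF v3.x headers and should be treated as assumptions to verify against your own tree, and `mux_outcome_t` and `model_acquire` are hypothetical names, not part of ESP-IDF.

```c
#include <stdint.h>

/* Assumed values, recalled from ESP-IDF v3.x; verify against your tree. */
#define MUX_FREE_VAL 0xB33FFFFFu /* portMUX_FREE_VAL: lock is not held */
#define CORE_ID_PRO  0xABABu     /* raw ID of core 0 */
#define CORE_ID_APP  0xCDCDu     /* raw ID of core 1 */

/* Hypothetical result type for this model only. */
typedef enum {
    MUX_OK,          /* lock was free or already ours: assertion passes */
    MUX_SPIN,        /* other core holds it: the real code keeps spinning */
    MUX_ASSERT_FAIL  /* owner is none of the three legal values */
} mux_outcome_t;

/* Models what the acquire path concludes from the value it finds in
   mux->owner when called on the core identified by core_id. */
static mux_outcome_t model_acquire(uint32_t owner, uint32_t core_id)
{
    /* XOR with (PRO ^ APP) maps either core ID to the other one. */
    uint32_t other_core = core_id ^ (CORE_ID_PRO ^ CORE_ID_APP);

    if (owner == other_core) {
        return MUX_SPIN;
    }
    if (owner == MUX_FREE_VAL || owner == core_id) {
        return MUX_OK;
    }
    return MUX_ASSERT_FAIL;
}
```

In this model, an owner field of 0 or any other stray value (for example, memory that was never set up by `xEventGroupCreate()`, or that was overwritten later) lands in `MUX_ASSERT_FAIL`, which corresponds to the assertion in the crash log.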

@david-cermak

Appreciate your interaction on the topic.

> What exactly does the 'cyclic logic' in event handler mean?

static esp_err_t mqtt_event_handler(esp_mqtt_event_handle_t event)
{
    switch (event->event_id) {
    case MQTT_EVENT_CONNECTED:
        //subscribe to topic1
        break;

    case MQTT_EVENT_SUBSCRIBED:
        //publish topic 2
        break;

    case MQTT_EVENT_PUBLISHED:
        //Wait for 5 seconds and then event in SensorStateChangeEventGroup;
        //publish topic2 if event to publish
        break;

    case MQTT_EVENT_DATA:
        //publish topic2 if current received data subscribed on topic1 is different from previous saved state
        break;

    default:
        //all other events are ignored
        break;
    }
    return ESP_OK;
}

Based on your feedback, I increased the MQTT stack size to 32K. I also moved the MQTT task to the same core (core 1) that runs the user event handler, and lowered the MQTT task priority from 15 to 5, the same as HTTP. As before, the user event handler task runs at priority 1 and the frequency count task handler runs at priority 2.

With this change, the failure is less frequent but still occurs randomly within the first 10 seconds of the MQTT connection.

I still see an infrequent debug message:

dhcps: send_offer>>udp_sendto result 0

and even less frequent

dhcps: send_offer>>udp_sendto result 0
dhcps: send_nak>>udp_sendto result 0
dhcps: send_offer>>udp_sendto result 0

I am also kind of lost with respect to task priorities. I noticed that if I am browsing web pages served by http, failure happens much faster.

> Please check your heap size and stack size.

I could not find how to change the heap size, please help.

> Are you able to see the failure also without the two projects you mention?

No, the failure occurs only when the MQTT task is integrated into the Wi-Fi manager.

> Can you try to comment out creating those to see if the error still persists?

The issue is not seen without the MQTT task.

I have run out of ideas to try. Any suggestions are very welcome.

Many thanks!
Deepak

Hi Deepak,

Thanks for sharing the code. What struck me as odd is the

//Wait for 5 seconds and then event in SensorStateChangeEventGroup;

What kind of wait do you use (I assume vTaskDelay)? The event handler runs in the MQTT thread context, so it actually blocks all MQTT message reception and transmission.
If that's the case, I would suggest another condition variable so that the waiting takes place in a different thread (though that wouldn't explain the crashes, just some misbehaving transmission on the MQTT side).

With regard to the dhcps debug logs: are you perhaps using an older IDF? (With the latest master, these lwip debug messages are all suppressed, and I don't think it's even configurable to enable them.)

Your issue is most likely memory corruption. I cannot help much with that, but I can at least give some pointers (I don't think changing the task priority or pinning the task to a particular core would help). The IDF tools generate the ld script automatically (https://docs.espressif.com/projects/esp-idf/en/latest/api-guides/linker-script-generation.html?highlight=linker).

Hopefully this helps, David

Hi David,

Thanks for the pointers.

I updated the code so that the task waits on event groups instead of using vTaskDelay. The issue still persists as before.
I noticed that the code runs fine until any event causes a publish event.

I suspected that the frequency counter and MQTT tasks need some event locks. But the issue remained even after the lock was implemented in the respective event handlers.

I am attaching the main code file. Any pointer to what I am missing will help.

smartconfig_main.txt

Many thanks in advance for any help extended.

Regards
Deepak

Hi Deepak,

I am afraid I have no other ideas beyond what has already been brought up. I suspect data corruption of some kind, which is always difficult to debug. When the crash occurs, you can print out the address and content of res (the event flag pointer); maybe that will give a hint. Also, try reconfiguring to a single-core application to check whether the issue persists (menuconfig -> Component config -> FreeRTOS -> Run FreeRTOS only on first core).

Also, looking at the code, I noticed the callbacks window_start_callback and frequency_callback (probably from RMT). Are these called from ISRs? If so, event flags should be set using xEventGroupSetBitsFromISR().
Another thing probably worth trying is to create a separate event flags group for the MQTT event handler, which could then be synchronized with overhead_tank_event_group in a separate thread (this won't be a solution, just a pointer/workaround).

I wish I could help more...
Best Regards
David
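
As an illustration of the `xEventGroupSetBitsFromISR()` suggestion above, here is a minimal sketch of a callback that is assumed to run in interrupt context. The names `sensor_event_group`, `FREQUENCY_READY_BIT` and `frequency_ready_isr_callback` are hypothetical and do not come from the attached code. The `#else` branch contains host-only stand-ins (not the real FreeRTOS) so that the sketch also compiles off-target; on the ESP32 only the real headers are used. The zero-argument `portYIELD_FROM_ISR()` form is assumed to match the ESP-IDF v3.x port.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef ESP_PLATFORM
#include "freertos/FreeRTOS.h"
#include "freertos/event_groups.h"
#else
/* Host-only stand-ins so the sketch compiles off-target. NOT real FreeRTOS. */
typedef uint32_t EventBits_t;
typedef int BaseType_t;
struct host_event_group { EventBits_t bits; };
typedef struct host_event_group *EventGroupHandle_t;
#define pdFALSE 0
#define pdPASS  1
/* The stand-in sets the bits immediately; real FreeRTOS defers the update
   to the timer service task. */
static BaseType_t xEventGroupSetBitsFromISR(EventGroupHandle_t group,
                                            EventBits_t bits,
                                            BaseType_t *woken)
{
    group->bits |= bits;
    *woken = pdFALSE;
    return pdPASS;
}
#define portYIELD_FROM_ISR() ((void)0)
#endif

#define FREQUENCY_READY_BIT (1u << 0) /* hypothetical bit name */

/* Hypothetical handle; assumed to be created once with xEventGroupCreate()
   before the interrupt source is enabled. */
static EventGroupHandle_t sensor_event_group;

/* Hypothetical callback, assumed to be invoked from an ISR. */
static void frequency_ready_isr_callback(void)
{
    BaseType_t higher_prio_woken = pdFALSE;

    if (sensor_event_group == NULL) {
        return; /* group not created yet: do not touch it */
    }

    /* ISR-safe variant: never call xEventGroupSetBits() from an ISR. */
    if (xEventGroupSetBitsFromISR(sensor_event_group, FREQUENCY_READY_BIT,
                                  &higher_prio_woken) == pdPASS) {
        if (higher_prio_woken) {
            portYIELD_FROM_ISR(); /* switch to the woken task on ISR exit */
        }
    }
}
```

Note that in real FreeRTOS the FromISR variant only queues the request for the timer service task, so the bits become visible slightly later and the call can fail if that task's queue is full.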

Hi David,

Thanks for all the help.

I repositioned event groups and removed them completely from the interrupt calls.

This solved the problem.

Many thanks once again!
