Mbed-os: STM32 CAN attach RxIrq unusable.

Created on 24 Jan 2019 · 21Comments · Source: ARMmbed/mbed-os

Description of defect

In the CAN implementation for STM32 microcontrollers the CAN::read() function is responsible for clearing the flag that generates the Rx interrupt. This is because the flag is reset by a register that also controls the FIFO buffer. If the data is not pulled from the buffer before the flag is reset then the data will not be available.

The issue is that CAN::read() has uses a mutex which can't be used in an interrupt context.
Because you need to read the data before resetting the flag which is generating the interrupt and you cant read the data in an interrupt because of the mutex in the read function, it is impossible to use CAN with an interrupt.

As far as I can tell, this will affect all STM32 microcontrollers. It may also effect other platforms depending on their implementation.

Either the mutex needs to be removed or the data must be read and cached by a lower level ISR.

Target(s) affected by this defect ?

STM32

Toolchain(s) (name and version) displaying this defect ?

n/a

What version of Mbed-os are you using (tag or sha) ?

n/a

What version(s) of tools are you using. List all that apply (E.g. mbed-cli)

n/a

How is this defect reproduced ?

n/a

IOTOSM-2465 OPEN st mirrored bug

Source

TobinHall

Most helpful comment

Wow, just ran into this issue myself on a NUCLEO_F413ZH

I am surprised the history of this is so long, detailed, and unsolved! Mbed certainly needs a refactor of its CAN HAL API... It's workable for now but there are many improvements that could be made.

It would be nice if mbed would use buffering supported by HW (3 messages on STM platform) or maybe SW queue and split CAN as Serial - UnbufferedCAN and BufferedCAN

From what I can tell Mbed does use the 3 message FIFO on STM -- the RX FIFOs are managed entirely by hardware. The STM can_read implementation _does_, in theory, support reading from either 3-message FIFO (FIFO0 or FIFO1) based on the given handle argument. However, it is not currently possible for the application to enable storing messages in FIFO1 because of the way can_filter is implemented -- every filter is configured to place matching messages into FIFO0. This would be easy to fix but I think STM is using handle incorrectly in their driver.

My understanding of the intent behind the handle parameter in the can_api.h is that it is a unique identifier to a certain filter the user has previously configured. STM is using inconsistently in can_read and can_filter. I can't say I blame them too much: It seems the can_api.h file does not even have comments explaining what each function/argument is _supposed_ to do...

@kjbracey-arm @maclobdell I think it's about time the CAN API in Mbed gets some love. Maybe it could be worked onto the back end of the mysterious Mbed schedule? I would be willing to participate in defining an updated HAL spec.

AGlass0fMilk on 19 Jun 2020

👍3

All 21 comments

5267 is likely a related issue.

TobinHall on 24 Jan 2019

@ARMmbed/mbed-os-hal @ARMmbed/team-st-mcd

cmonr on 30 Jan 2019

@ARMmbed/team-st-mcd Could you please review the commit https://github.com/Flowm/mbed-os/commit/c8a37fff1fcb58a0007c257653bde6e96f98f4c6?

TacoGrandeTX on 11 Mar 2019

Unfortunately this commit is just a workaround, not a proper fix.

It simply disables the mutex in CAN::read() instead of fixing the underlying issue which was properly described by https://github.com/ARMmbed/mbed-os/issues/9495#issue-402858593.

Afterwards CAN communication works properly with can.attach(callback(), IrqType::RxIrq); and can.read() in the callback.

Flowm on 11 Mar 2019

You could subclass mbed::CAN and override lock() and unlock() to do nothing.

bikeNomad on 14 Dec 2019

https://github.com/pilotak/NoMutexCAN

pilotak on 14 Dec 2019

Hi
I advice you to push some pull request in order to get a full review by HAL experts.
regards,

jeromecoutant on 16 Dec 2019

@jeromecoutant is mutex really needed in CAN driver?

pilotak on 16 Dec 2019

@kjbracey-arm

jeromecoutant on 16 Dec 2019

I was recently discussing CAN and interrupts here: https://github.com/ARMmbed/mbed-os/issues/6714

I suggest reading through that, as it covers pretty much everything about interrupts and locks.

I was not aware of this issue at the time, and I'm not quite sure how to read it. Is it the case that the example code over in that issue won't work on STM? (Signal event in IRQ handler, do read from thread woken by that event?)

Locally overriding CAN to make an UnlockedCAN that can do a read from interrupt is valid for an application to do, but an application should not have to do that - an application that wakes a thread to do the read should be valid.

The HAL CAN API does seem to be woefully underspecified, so this is debatable.

I'm not familiar with CAN, or typical hardware implementations, but it seems to be close analog to the UART. There we let the HAL be "thin" for typical UARTs - we don't require there to be buffering in the HAL, but because the HW buffers are small we in practice do need data to be read from interrupt.

This is then covered by the separate classes RawSerial, which lets you do an interrupt read, and UARTSerial, which does the interrupt read for the application, and provides buffering.

We don't have such a split for CAN - maybe it is a bit like my nemesis Serial - something that has to read from interrupt but can't. Hopefully there's enough hardware buffering that doing read from a woken thread is sufficient.

kjbracey-arm on 16 Dec 2019

something that has to read from interrupt but can't

that's exactly where the problem is - with current implementation.

It would be nice if mbed would use buffering supported by HW (3 messages on STM platform) or maybe SW queue and split CAN as Serial - UnbufferedCAN and BufferedCAN

pilotak on 16 Dec 2019

Yes, it needs to do one or the other. If hardware normally has sufficient buffers, and you're saying STM does, then apps relying on buffering to do reads from threads should be fine.

The STM HAL should be fixed so that the example code here works, if it doesn't.

https://github.com/ARMmbed/mbed-os/issues/6714#issuecomment-558983743

kjbracey-arm on 16 Dec 2019

It is hard to believe, that this issue still exists and nobody offered a fix or some working solution except disablling and reenabling CAN IRQ flag.

Mixerito on 2 Jun 2020

Wow, just ran into this issue myself on a NUCLEO_F413ZH

I am surprised the history of this is so long, detailed, and unsolved! Mbed certainly needs a refactor of its CAN HAL API... It's workable for now but there are many improvements that could be made.

It would be nice if mbed would use buffering supported by HW (3 messages on STM platform) or maybe SW queue and split CAN as Serial - UnbufferedCAN and BufferedCAN

AGlass0fMilk on 19 Jun 2020

👍3

I have also noticed more unfinished or badly implemented CAN features. For example it is impossible to attach interrupt on Error IRQs. (EwIrq, EpIrq, BeIrq), because of always disabled Global ERRIE flag. Moreover LEC interrupt is not even defined. And IRQ on TX Overrun is also not possible to catch really. STM CAN really needs to be reimplemented from scratch.

Mixerito on 19 Jun 2020

I'm happy to test new CAN things

pilotak on 22 Jun 2020

@TobinHall thank you for raising this issue.Please take a look at the following comments:

Could you add some more detail to the description? A good description should be at least 25 words.
What target(s) are you using?
What toolchain(s) are you using?
What version of Mbed OS are you using (tag or sha)?
It would help if you could also specify the versions of any tools you are using?
How can we reproduce your issue?

NOTE: If there are fields which are not applicable then please just add 'n/a' or 'None'.This indicates to us that at least all the fields have been considered.
Please update the issue header with the missing information, the issue will not be mirroredto our internal defect tracking system or investigated until this has been fully resolved.

ciarmcom on 13 Oct 2020

We've updated our automation, I will fix the requirements .

adbridge on 15 Oct 2020

Thank you for raising this detailed GitHub issue. I am now notifying our internal issue triagers.
Internal Jira reference: https://jira.arm.com/browse/IOTOSM-2465

ciarmcom on 15 Oct 2020

Is there any update regarding to the STM's RX IRQ related issue?

opntr on 12 Nov 2020

I believe there won't be any fix for this problem (as it's linked with STM32 hardware and Mbed "one for all" library) except a buffered/unbuffered solution.

As Mutex prevent using read method in ISR context, but the only way that STM32 FDCAN have chosen to lower the IRQ flag is a reading. So, using attach method brings you to a deadlock.

Thus, this deadlock cannot be solved as it’s may be impossible to remove the mutex from Mbed lib because some others CAN controllers may have the use of it, and STM32 won’t change their hardware (and I don’t wish to have a flag that can be cleared while message is still unread).

The best solution at that time (that will remove any portability of Mbed code) is to remove the useless mutex for STM32 (because mutex is just useless with FDCAN, as, you just write or read in memory, it’s thread safe).

I know it’s not “clean” but it’s the only solution that ensure a valid CAN use of IRQ for STM32 at least before Mbed create a buffered class.