Mbed-os: STM32F767ZI CAN object instantiation fail if bus is active.

Created on 1 Mar 2017 · 22Comments · Source: ARMmbed/mbed-os

Description

Type: Bug
Priority: Blocker

Bug

Target
nucleo_f767zi

Toolchain:
GCC_ARM

mbed-cli version:
1.0.0

meed-os sha:
bcf7085

Expected behavior
STM32F767ZI CAN objects can be instantiated regardless of CAN bus usage.

Actual behavior
CAN objects will be instantiated if the CAN bus is not currently in use. However, if there is another node on the CAN bus that is broadcasting at the time of instantiation, then the CAN bus instantiation hangs until the bus is no longer in use.

Steps to reproduce
To reproduce, have a node broadcast on a CAN bus shared with the STM32F767ZI (I did this at roughly 100ms intervals for my application). Try instantiating a CAN object using the pins connected to the bus. The instantiation should hang until the other node stops broadcasting or is disconnected from the bus, and then the code should continue executing per usual. If there is no CAN bus traffic, then there shouldn't be an issue (this was tested with the LPC1768 and produced the expected behaviour).

Below is a simpler version of the code I used to test this:

#include "mbed.h"

Ticker ticker;
DigitalOut led1(LED1);
DigitalOut led2(LED2);
CAN* can2_ptr;//(PD_0, PD_1);

Serial pc(USBTX,USBRX);

int main()
{
    pc.baud(115200);
    pc.printf("main()\r\n");
    //CAN can2(p9, p10); //for LPC1768
    CAN can2(PD_0, PD_1); //for STM32F767ZI
    pc.printf("created can\r\n");
    can2_ptr = &can2;
    can2.frequency(500000);
    pc.printf("set frequency \r\n");
    CANMessage msg;

    while(1) {
        if(can2.read(msg)) {
            pc.printf("<-*******Message received:id=%x,data=",msg.id);
            for(int i=0; i<8; i++) {
                pc.printf("%x,",msg.data[i] );
            }
            pc.printf("\n\r");
            led2 = !led2;
        }
    }
}

This code should not print the "created CAN" line.

st bug

Source

bperry730

All 22 comments

cc @bcostm @adustm @LMESTM @jeromecoutant

0xc0170 on 2 Mar 2017

Hello @bperry730
Your question is not clear to me.
How do you have NUCLEO_F767 CAN pins attached to the bus?

I can see that usually ST is having a CAN transceiver between the pins and the bus in the evaluation platforms that provide CAN example interface.

During CAN object instantiation the CAN hw module for STM32F767 is entering an init status. During this initialization sequence the bus read and transmit are deactivated.
(cf reference manual from STM32F767xx :

While in Initialization Mode, all message transfers to and from the CAN bus are stopped and
the status of the CAN bus output CANTX is recessive (high).
)

In mbed os for STM32F767xx, CAN init is doing :

initialize the pins
initialize STM32F7 CAN HW module
set CAN frequency
set CAN filters

I don't think that we should block the initialisation pending on the bus status.

adustm on 6 Mar 2017

Hallo @adustm

I have the same problem with an older mbed version and another target (NUCLEO_F446RE). The configuration is the following:

both boards have a CAN tranceiver
one board sends CAN messages
the other board receives CAN message
when the sending board is started first, the receiving board stops
when the sending board is reseted, the receiving board continues with initialization
now everything works as aspected
when the receiving board is started first, everything works as aspected

ohagendorf on 6 Mar 2017

Hello @adustm

I think your question has to do with the wiring. I didn't see any CAN transceiver on the Nucleo board, so the CAN1 (PD0, PD1) is connected to a CAN transceiver that is connected to the bus. On this bus, I have another CAN node (with transceiver) that I can either start broadcasting or have it broadcasting after I start-up the STM32F767xx.

When the other CAN node is broadcasting at the STM32F767xx start-up, it hangs at the CAN object instantiation. When I turn off the other node, disconnect the other node from the bus, or disconnect the STM32F767xx from the bus, the CAN object instantiation continues as expected. Also, if the other CAN node isn't broadcasting at STM32F767xx start-up, then instantiation continues as expected. It acts like the instantiation is getting hung in an infinite loop, like a while loop. There appear to be several while loops in the int can_frequency(can_t *obj, int f) and the static unsigned int can_speed(unsigned int pclk, unsigned int cclk, unsigned char psjw) in mbed-os\targets\TARGET_STM\TARGET_STM32F1\can_api.c that both seem to be called in initialization. I'm not sure if these could cause the hang up though.

I'm confident in my wiring because when the LPC1768 is used in the exact same way in the exact same circuit, these problems don't occur. The CAN object instantiation occurs in the exact same manner regardless of the bus's initial conditions when using the LPC1768.

Thanks for your help. Let me know if you have any additional questions or if I can clarify further for you.

bperry730 on 6 Mar 2017

Sorry, I meant these functions are in mbed-os\targets\TARGET_STM\TARGET_STM32F7\can_api.c (our target), but they are in mbed-os\targets\TARGET_STM\TARGET_STM32F1\can_api.c as well.

bperry730 on 6 Mar 2017

Thnak you both for your explanations. We'll have to look deeper into this.
@bperry730 , fyi, those can_api.c have been factorized in TARGET_STM folder in the current version of the ARMmbed mbed-os

adustm on 7 Mar 2017

I did some further investigation. and found where the CAN object initialization hangs. It get stuck in the while loop on line 193 in int can_frequency(can_t *obj, int f) in mbed-os\targets\TARGET_STM\TARGET_STM32F1\can_api.c. I'm not sure why it hangs here. Does anyone have an idea? Potentially we could clear the CAN buffer here?

I've attached my gdb screen captures. The second one shows the stack trace. Let me know how I can help.

can_hang2

bperry730 on 31 Mar 2017

I did some further testing, if I change line 79 in can_api.c to the correct CAN frequency (the one I usually set after instantiation), then it no longer hangs. One potential fix would be to add another constructor to the mbed CAN object that accepts a baud rate, and set the CAN frequency in that constructor (similar to mbed Serial). However, that wouldn't solve the problem if you don't set the CAN frequency correctly on instantiation.

bperry730 on 7 Apr 2017

Hi @0xc0170
We have a request to change the interface of CAN api over here. I agree with @bperry730 that we could directly set the correct and requested frequency at the init of the CAN object, instead of calling the frequency function later. Do you think it would be acceptable at mbed level ?
Something in CAN.h that would look like:
CAN(PinName rd, PinName td, int hz);

@bperry730
I eventually reproduced the issue, and I think I understand what is going on.
Your deviceB (that sends messages at 500kbs every 100ms) is loading CAN bus

DeviceA (NUCLEO_F767ZI) is doing:

        can->MCR |= CAN_MCR_INRQ ;
        while ((can->MSR & CAN_MSR_INAK) != CAN_MSR_INAK) {
        }
        can->BTR = btr;
        can->MCR &= ~(uint32_t)CAN_MCR_INRQ;

        while ((can->MSR & CAN_MSR_INAK) == CAN_MSR_INAK) {
        }
        return 1;

1st while is moving to initmode (stop impacting CANbus).
Then we can write in BTR register
Then we want to go back to normal mode (2nd write to MCR + 2nd while loop) but it is never happening.
The reference manual of STM32F76xx says in chapter 40.4.2:

The bxCAN enters Normal mode and is ready to take part in bus activities when it
has synchronized with the data transfer on the CAN bus. This is done by waiting for the
occurrence of a sequence of 11 consecutive recessive bits (Bus Idle state). The switch to
Normal mode is confirmed by the hardware by clearing the INAK bit in the CAN_MSR
register.

I think that this bus idle state never occurs when DeviceB is overloading the bus like you do for your test.
Is it something that can happen in the _real life_ ?
Kind regards

adustm on 10 Apr 2017

Hi @adustm

I'll try to answer your last question:

Is it something that can happen in the real life ?

We actually encounter this issue in our current system. We get some CAN messages on power up from our Device B that can cause Device A to hang. Also, if Device A restarts during operation (something that is possible, but not ideal), then Device B is blasting data at 1kHz and causes Device A to hang.

Let me know if you have any other questions or if I can help in any way?

Thanks.

bperry730 on 10 Apr 2017

👍1

Hello,
I've created a PR to be able to initialize with the correct frequency. (#4165)
I parallel, as @ohagendorf stated that it would not be sufficient, I've added a timeout management and an error check in the following branch:
https://github.com/adustm/mbed/tree/can_sync_error

Would it be possible that you use it and check if you get the error code ?

Kind regards
Armelle

adustm on 28 Apr 2017

bump?

adustm on 11 May 2017

This problem has been reintroduced in the latest mbed-os version. The can frequency is hard-coded again. Should I open a new issue?

bperry730 on 27 Jun 2017

Nope. Same problem, same issue.

theotherjimmy on 27 Jun 2017

@bperry730 Could you tell us what version of mbed OS you have seen this problem re-appear in?

theotherjimmy on 27 Jun 2017

It looks like it was messed up in the merge. So the original pull request had a merge error, it seems. If you change line 89 to if (can_frequency(obj, hz) != 1) {, it should work fine.

bperry730 on 27 Jun 2017

Would you like to submit that PR? The diff looks very reviewable to me!

theotherjimmy on 27 Jun 2017

I would love to. However, I cannot push to the repo as I'm not a contributor. I accepted the terms and conditions, but I don't think it's gone through yet. I'm not sure how that process works or how long it takes.

I'm assuming the process is to push to a separate branch and then pull request in to merge. Let me know if you have any other ideas on how I can push the quick change.

bperry730 on 28 Jun 2017