Mbed-os: Division operation much slower with microlib than other libs on cortex-M0

Created on 8 Jan 2020  路  23Comments  路  Source: ARMmbed/mbed-os

Description of defect

Following investigations in #12116 , it turns out that division with microlib seems much slower than ARM lib (and possibly also other libs) on cores like cortex-M0.

Few measurement on a "divided by 1000" operation were measured and are reported here for background information:

Board | Toolchain | Division | Operation time (incl. IOs toggling) | Frequency during wake-up
-- | -- | -- | -- | --
L476RG | uARM | Native ( / ) | 440ns | 48MHz
L476RG | uARM | Div1000 | 480ns | 48MHz
L476RG | ARMC6 | Native ( / ) | 440ns | 48MHz
L476RG | ARMC6 | Div1000 | 480ns | 48MHz
L073RZ | uARM | Native ( / ) | 86碌s | 4MHz
L073RZ | uARM | Div1000 | 226碌s | 4MHz
L073RZ | uARM | Div10 (3 times) | 18碌s | 4MHz
L073RZ | ARMC6 | Native ( / ) | 11,6碌s | 4MHz
L073RZ | ARMC6 | Div1000 | 13碌s | 4MHz
L073RZ | ARMC6 | Div10 (3 times) | 18碌s | 4MHz

Target(s) affected by this defect ?

ALL

Toolchain(s) (name and version) displaying this defect ?

microlib toolchain (碌ARM)

What version of Mbed-os are you using (tag or sha) ?

mbed-os master of Jan 2020.

What version(s) of tools are you using. List all that apply (E.g. mbed-cli)

Not relevant here.

How is this defect reproduced ?

See detailed history in #12116

IOTOSM-2945 OPEN mirrored bug

All 23 comments

What would be suggestion here? From the link in the #12116, this loos like it should be known issue with microlib.

It might be worth reaching out internal team who is in charge for microlib ?

For division, microlib is doing what's being advertised, which is being smaller and slower. And I'm inclined to think the trade-off is okay. It's 40 bytes instead of 340 bytes. 8 times slower (in this case), 8 times smaller. 8 times slower isn't enough to be that much of a problem, except when running at like 4MHz.

I'm more concerned about the 64-bit multiply (which the div1000 was using there). That is tens of times slower, AND many times bigger. It would be better in all regards to just use the standard library routine. We should raise this with the tools team.

That only seems to be a problem for M0/M0+/M23 - the microlib library routine for M3 is also catastrophically slow, but the compiler seems to not use it - it just inlines optimal code instead.

In the interim, we could paste in the standard library routines with #if MICROLIB somewhere - that should override the currently-poor microlib definitions.

Note we're waiting from Arm to comment and confirm how to proceed.

Note this affects all Cortex-M0/M0+.
@kjbracey-arm can you please review?

@kjbracey-arm any update on this?

This issue seems to also affect cortex M4 (STM32F303 for example). See #12880

A fix for this is now expected in a future ARM Compiler 6 update.

Current expectation is that future microlib will have the same smaller+faster 64-bit multiply as the standard library. No Mbed OS change required there.

It will default to the the current division operations, but the faster+bigger versions from the standard library will be selectable by an IMPORT directive, which we'll have to add. That in turn could be put under a Mbed config option. So there will need to be a PR for that when the toolchain is available.

@LMESTM thank you for raising this issue.Please take a look at the following comments:

We cannot automatically identify a release based on the version of Mbed OS that you have provided.
Please provide either a single valid sha of the form #abcde12 or #3b8265d70af32261311a06e423ca33434d8d80de
or a single valid release tag of the form mbed-os-x.y.z .
E.g. '2020.' has not been matched as a valid tag or sha.
NOTE: If there are fields which are not applicable then please just add 'n/a' or 'None'.This indicates to us that at least all the fields have been considered.
Please update the issue header with the missing information, the issue will not be mirroredto our internal defect tracking system or investigated until this has been fully resolved.

@LMESTM as this appears to be more of a compiler issue can we close this now ?

@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?

The latest release of Arm Compiler is 6.15 (this versin contains the fix). I reviewed the internal tickets, it should be in the release.

I'll close this now as resolved.

AFAIK, you will need to actually activate the fast division routines, by adding some sort of import directive somewhere. The libraries will default to slow.

(And there might be toolchain-backward-compatibility issues with that. Maybe you can make it a weak reference to not break old toolchains?)

From my comment above:

It will default to the the current division operations, but the faster+bigger versions from the standard library will be selectable by an IMPORT directive, which we'll have to add. That in turn could be put under a Mbed config option. So there will need to be a PR for that when the toolchain is available.

@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?

We will need to review the latest release and how to make them enabled.

@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?

@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?

@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?

@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?

@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?

@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?

@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?

@LMESTM it has been 5 days since the last reminder. Could you please update the issue header as previously requested?

Thank you for raising this detailed GitHub issue. I am now notifying our internal issue triagers.
Internal Jira reference: https://jira.arm.com/browse/IOTOSM-2945

Was this page helpful?
0 / 5 - 0 ratings

Related issues

DuyTrandeLion picture DuyTrandeLion  路  3Comments

bcostm picture bcostm  路  4Comments

cesarvandevelde picture cesarvandevelde  路  4Comments

davidantaki picture davidantaki  路  3Comments

MarceloSalazar picture MarceloSalazar  路  3Comments