When looking at the code for crypto hardware acceleration, we noticed that there are many locations where a command is issued to the accelerator and the thread must then wait for the operation to complete. The mbed TLS API is mostly synchronous, so for lack of an alternative mechanism, the thread simply busy-waits until the accelerator has finished. This is inefficient: those CPU cycles could be used to execute another thread instead of repeatedly checking a condition.
Suggested enhancement
One way to avoid the busy waits mentioned above is to provide a mechanism that lets HAL code issue a command to the accelerator (or another peripheral) and then wait on a condition (e.g. a semaphore) that is signalled when the operation has completed.
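For illustration, a minimal sketch of what this could look like using the existing rtos::Semaphore, assuming an RTOS build; start_accelerator_op() and the ISR hook are hypothetical stand-ins for the real driver code:

```cpp
#include "mbed.h"

// Binary semaphore signalled by the accelerator's completion interrupt.
static rtos::Semaphore op_done(0, 1);

// Hypothetical: write the command registers and enable the completion IRQ.
void start_accelerator_op();

// Hooked into the accelerator's completion ISR; Semaphore::release() is
// safe to call from interrupt context.
void accelerator_completion_irq()
{
    op_done.release();
}

int accelerator_do_operation()
{
    start_accelerator_op();
    op_done.acquire();   // the thread sleeps here instead of polling a flag
    return 0;
}
```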
Pros
Threads can be scheduled more efficiently, and busy waiting for events can be avoided without extra effort in each driver.
Cons
There may be backwards-compatibility issues when adding this new feature.
That's a very good point. I would say we should think about using the RTOS in the HAL, but that cannot happen as long as we support mbed 2.
Maybe we can temporarily introduce a wrapper for a semaphore that would expand to an RTOS semaphore if present and to a busy loop for mbed 2?
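Something along these lines, perhaps; hal_completion_t is a made-up name and MBED_CONF_RTOS_PRESENT is assumed to be the right compile-time switch:

```cpp
#include "mbed.h"

class hal_completion_t {
#if defined(MBED_CONF_RTOS_PRESENT)
public:
    void wait()   { _sem.acquire(); }   // block the calling thread until signalled
    void signal() { _sem.release(); }   // safe to call from an ISR
private:
    rtos::Semaphore _sem{0, 1};
#else
public:
    // mbed 2 / no RTOS: degrade to the busy loop we have today
    void wait()   { while (!_done) { } _done = false; }
    void signal() { _done = true; }
private:
    volatile bool _done = false;
#endif
};
```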
@sg- @c1728p9 @0xc0170
I would rather keep the HAL API simple and build on top of it - provide functionality that allows operations to run in the background.
@andresag01 Where in the HAL is this? Can you pinpoint it in our code base?
@0xc0170: The crypto hardware acceleration code is full of busy-wait loops where we wait for the accelerator to complete commands. For example: here. Also consider looking at the open PRs for crypto hardware acceleration.
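To illustrate the pattern in question (not a copy of any particular driver; the helper name is invented):

```cpp
// Hypothetical helper: reads the accelerator's "busy" status flag.
bool accelerator_is_busy();

void wait_for_accelerator()
{
    // Typical pattern today: spin on the status flag until the hardware
    // finishes, wasting cycles another thread could have used.
    while (accelerator_is_busy()) {
        // busy wait
    }
}
```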
I would rather keep the HAL API simple and build on top of it - provide functionality that allows operations to run in the background.
That wouldn't solve the issue with busy waiting, i.e. wasted CPU cycles which could be used for other processing or to save power.
@bulislaw The event middleware offers an interesting solution which avoids wasting CPU cycles with or without the RTOS present (see here).
The event middleware offers an interesting solution which avoids wasting CPU cycles with or without the RTOS present (see here).
+1 for anything that brings us back closer to the event-driven dream
@loverdeg-ep The link I've posted is an implementation of a semaphore that works with or without the RTOS. I wasn't advocating the direct use of the event middleware to solve this issue. Apologies for the unclear message.
I don't think the use of the event middleware would be a sensible decision at the HAL level. Of course, it would be feasible to have an asynchronous API at that level; however, due to the synchronous nature of the mbed TLS functions, the glue sitting between mbed TLS and the HAL would have to act like a semaphore/condition variable.
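Roughly, that glue could look like the sketch below, assuming a hypothetical asynchronous HAL call hal_aes_encrypt_async() that signals completion through a callback:

```cpp
#include "mbed.h"

// Hypothetical asynchronous HAL entry point: starts the operation and later
// invokes `done` (possibly from interrupt context) when the hardware finishes.
void hal_aes_encrypt_async(const unsigned char *in, unsigned char *out,
                           size_t len, mbed::Callback<void()> done);

// Synchronous wrapper as seen by mbed TLS.
int alt_aes_encrypt(const unsigned char *in, unsigned char *out, size_t len)
{
    rtos::Semaphore done(0, 1);
    hal_aes_encrypt_async(in, out, len, [&done]() { done.release(); });
    done.acquire();   // park the thread until the completion callback fires
    return 0;
}
```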
@pan- Thanks for the clarification; I wasn't necessarily implying that it be used directly either.
I was more implying that the preference is to keep things asynchronous and the RTOS as optional as possible.
Internal Jira reference: https://jira.arm.com/browse/IOTCRYPT-223
Referencing here the latest work that may help to fix this issue: https://github.com/ARMmbed/mbed-os/pull/10104
Glad to see this bumped
@trowbridgec
The already existing PlatformMutex feature can also be used to synchronise.
@geky any application for your asynchronizer thingy here?
Reason for closure
@linlingao
Looks like @oliverjharper closed it. Oliver?
@linlingao I closed the IOTCRYPT ticket, Jira has mirrored the closure here as well. @Patater and the crypto team are not currently working on this. Perhaps the Jira ticket should be re-opened and moved to the most appropriate Jira project?
I'm reopening this issue here. @oliverjharper Let's work out Jira handling offline.
@geky any application for your asynchronizer thingy here?
Ah, no actually. coru doesn't preempt code.
The reason I called out mbedtls in the coru docs is that this issue is also a problem if you try to run mbedtls in a coroutine environment.
You need to sprinkle CPU-bound code with some form of yield call in order to be friendly to other coroutines/threads. The closest thing Mbed OS has to this is wait_ms(1) (unless this has changed since the last time I checked, which is very likely).
The only way to make libraries share the CPU without modifying code is preemption. IMO this is heavy-handed for Cortex-M systems. Most libraries are not CPU-bound; they are usually waiting on drivers. Mbed TLS and uTensor are notable exceptions.
One way to approach this is to model the CPU resource itself as a driver, with coroutine-friendly yielding while "polling" the "driver". This could then be extended to take advantage of hardware acceleration.
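As a sketch of what that yielding could look like (hal_sha256_process_some() is a hypothetical chunked/restartable HAL call, not an existing API):

```cpp
#include "mbed.h"

// Hypothetical chunked HAL call: processes at most one block of work per
// invocation and returns true once the whole operation is done.
bool hal_sha256_process_some();

void sha256_process_all()
{
    while (!hal_sha256_process_some()) {
        // Give other threads a turn between bounded chunks of CPU-bound work
        // instead of monopolising the core (Thread::yield() on older Mbed OS).
        rtos::ThisThread::yield();
    }
}
```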
Though I'm not familiar enough with MbedTLS to know how difficult this is to introduce.
@geky Any reason you consider preemption heavy-handed?
I assume the reasoning is that the M0+ and M0 would not support preemption, though I believe the M23 has one or three priority levels at its disposal.
Thank you for raising this issue. Please note we have updated our policies and now only defects should be raised directly in GitHub. Going forward, questions and enhancements will be considered in our forums, https://forums.mbed.com/ . If this issue is still relevant, please re-raise it there.
This GitHub issue will now be closed.