Mbed-os: Tickless mode support proposal

Created on 29 Dec 2016 · 39 Comments · Source: ARMmbed/mbed-os

A couple of months ago, while working on tickless sleep mode, we faced a couple of issues around the implementation of the low power timer:

  • accuracy
  • working mode
  • availability

and sleep:

  • wake up time
  • availability

It's a mixture of platform differences and a lack of clear requirements for ticker and sleep implementations for targets in mbed.

To address that, I'm proposing an extra timer, which would be an alias for one of the existing timers (e.g. the LP timer), and a sleep mode, both with clear implementation requirements (enforced by tests). Targets wishing to support tickless sleep mode will need to declare the TICKLESS capability and implement the tickless timer and sleep mode.

The tickless timer and sleep will need to conform to the following requirements:

  • Clock:

    • needs to stay up during tickless sleep

    • Resolution finer than 1 ms (or maybe at least driven off a 32 kHz crystal?)

  • Sleep

    • Needs to be able to wake up from the tickless timer

    • Needs to wake up within a given time (microseconds?)

Having an implementation on different targets conforming to the above specification would mean that we can have one coherent tickless implementation that works on the majority of boards. To ensure that all targets declaring the TICKLESS capability actually fulfill the requirements, we would implement tests checking the clock resolution and accuracy, as well as the ability to wake the board up from the associated sleep mode and the latency of doing so.
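As a rough illustration of the resolution requirement, here is a minimal host-side sketch, assuming the timer is described only by its counter frequency. The helper names `tick_period_ns` and `meets_resolution_requirement` are hypothetical and not part of any mbed API.

```c++
#include <cassert>
#include <cstdint>

// Hypothetical helper: length of one timer tick in nanoseconds (rounded down)
// for a counter running at freq_hz.
static uint64_t tick_period_ns(uint32_t freq_hz) {
    return 1000000000ULL / freq_hz;
}

// Hypothetical requirement check: the resolution must be finer than 1 ms.
// A frequency of 0 is rejected before any division takes place.
static bool meets_resolution_requirement(uint32_t freq_hz) {
    return freq_hz > 0 && tick_period_ns(freq_hz) < 1000000ULL;
}
```

With a 32768 Hz crystal one tick is about 30.5 us, so it would pass; a 1 kHz counter has a period of exactly 1 ms and would not.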

As an alternative, more generic solution, we could drop the new timer alias and define official requirements for the different timers and sleep modes that most platforms implement already. That would help us maintain consistent behavior across boards and features.
The downside of this proposal is that not all targets could implement timers/sleep modes as dictated by our requirements, which would lead us to either relax our specification or not implement a given feature for some platforms.

Maybe a hybrid approach would be the best solution: define rather relaxed but mandatory requirements for timers and sleep modes so we achieve some level of consistency. On top of that, define strict requirements for extra features or capabilities (like tickless mode), which could be simple aliases for existing implementations or more complicated constructs.

@sg- @0xc0170 @c1728p9

All 39 comments

Some additional details to be considered (some aspects of this different API are touched on above):

What we had for mbed 3 was a slightly different lp ticker API, as minar was the only consumer: https://github.com/ARMmbed/mbed-hal/blob/master/mbed-hal/lp_ticker_api.h plus some additional target-specific details like Minimum_sleep and Time_Base (ticks per second). The lp ticker resolution was in ticks; the user API was in milliseconds.
The lp ticker API was simple: read the time, set the match interrupt, get the matched interrupt, sleep-until functionality (this was tricky on the k64f platform, as for some periods 2 timers were required: wake up with one, and set another one for the rest of the period) and get overflows (to be able to form a 64-bit timer).

The sleep API should take care of choosing the best sleep available for the current system use. However, here we have sleep() and deep_sleep(), which is not very user friendly: "which sleep do I use now, why do I have to choose?". mbed 3 defines sleep_enter() and sleep_exit(). Why does sleep provide enter/exit? Because prior to sleep it might be necessary to store some information about the current system, do some calculation or fiddle with clocks. Important detail: sleep_enter() goes to the lowest sleep possible. One entry point for an app for sleep (for the RTOS as well, for instance).

I liked having a ticker in ticks, and a user API for calculating ms->ticks / ticks->ms (millisecond resolution for a user). Therefore targets do not need to do extra calculations in the target code. And having get_overflows to have 64-bit time (in most cases the upper 32 bits would be the overflow count and the lower 32 bits the timer's counter). This ticker would be always running (having that 64-bit time available at any time), and generating interrupts only for matched time (go to sleep until). The only consumers of this timer could be the RTOS and events; this would deprecate the LowPowerXXX classes.
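To make the 64-bit time and the ms/ticks conversion concrete, here is a minimal sketch. The function names are hypothetical (they are not the mbed 3 lp ticker API), and `ticks_per_second` stands for what is called Time_Base above.

```c++
#include <cassert>
#include <cstdint>

// Upper 32 bits: number of counter overflows; lower 32 bits: current counter.
static uint64_t make_time64(uint32_t overflows, uint32_t counter) {
    return (static_cast<uint64_t>(overflows) << 32) | counter;
}

// Millisecond to tick conversion, rounded down.
static uint64_t ms_to_ticks(uint64_t ms, uint32_t ticks_per_second) {
    return (ms * ticks_per_second) / 1000u;
}

// Tick to millisecond conversion, rounded down.
static uint64_t ticks_to_ms(uint64_t ticks, uint32_t ticks_per_second) {
    return (ticks * 1000u) / ticks_per_second;
}
```

Both conversions truncate, so a round trip can lose up to one unit: 5 ms at 32768 ticks per second is 163 ticks, and 163 ticks convert back to 4 ms.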

I generally agree with the above, a couple of points:

  • Having someone fiddling with the tickless timer is not good, but it can be worked around if we shape the API in a way that won't let people reset it or something like that. But I agree that having it hidden would probably be the best idea.
  • We should have different sleep modes with different power savings, and the platform can't pick the right one because it doesn't know the intent, e.g. go to sleep and wake up on an external interrupt, go to sleep for 5ms and wake up in 1us, go to sleep for hours and take as much time as you want to wake up. Having 2 or 3 well-defined sleep modes with instructions on how to use them would help us save power.
  • I'm not sure about ticks, as we already have system ticks, so introducing another kind of ticks would be very confusing, unless we define everything in system ticks (1ms by default).
  • Having 64-bit resolution would definitely be a good thing.
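The point about intent could be sketched like this: the caller states how long it expects to be idle and how much wake-up latency it can tolerate, and the deepest mode that fits is chosen. The mode names and latency figures below are invented for illustration only; real values are target specific.

```c++
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sleep modes, ordered from shallowest to deepest.
enum class SleepMode { Sleep, DeepSleep, Hibernate };

// What the caller knows and the platform does not.
struct SleepIntent {
    uint64_t duration_us;          // expected idle time
    uint32_t max_wake_latency_us;  // how long waking up is allowed to take
};

struct ModeInfo {
    SleepMode mode;
    uint32_t wake_latency_us;  // assumed figures, not measured on any board
};

static const ModeInfo kModes[] = {
    {SleepMode::Sleep, 1},
    {SleepMode::DeepSleep, 100},
    {SleepMode::Hibernate, 10000},
};

// Pick the deepest mode whose wake-up latency is tolerated by the caller and
// is shorter than the idle period; fall back to the shallowest mode.
static SleepMode pick_sleep_mode(const SleepIntent &intent) {
    SleepMode best = SleepMode::Sleep;
    for (size_t i = 0; i < sizeof(kModes) / sizeof(kModes[0]); ++i) {
        const ModeInfo &m = kModes[i];
        if (m.wake_latency_us <= intent.max_wake_latency_us &&
            m.wake_latency_us < intent.duration_us) {
            best = m.mode;
        }
    }
    return best;
}
```

For the examples in the list: sleeping 5 ms with a 1 us wake-up budget stays in the shallowest mode, while sleeping for an hour with a relaxed budget can use the deepest one.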

I am assuming this is still at the proposal stage? Has there been any progress on implementing tickless mode in the near future for mbed-os 5? The reason I ask is that I am using the NRF52832 BLE module, and one of the reasons we have been given for the NRF52 BLE module not staying asleep for long is the lack of this feature in mbed OS 5. Here is a link to the issue. I was hoping the mbed team could comment on when this feature/proposal would be considered and, if there is a strong mandate to have it implemented, what the approximate timeline would be. If not, I was hoping there is some kind of workaround other than not using the RTOS in my application.

Thanks,
Yogesh

It will be, but I can't commit to a timeline at this stage. In the current implementation the platform will go to sleep if there's nothing to process, but it will be woken up for SysTick processing, after which it may go back to sleep if there's nothing else to do.

Thanks Bartek for your response. Is there a specific version of mbed that it is slated for? The reason I ask is that it has become a blocker for us at this point using mbed-os 5 on the Nordic platform. The only other solution is to exclude the RTOS from the build. We use some of the RTOS features in our application today, so we would have to figure out a way to do without them, and I don't know how feasible that would be over the long run. The alternative is to move away from mbed-os 5 for the Nordic platform until this feature is available, because with the current consumption it would burn through batteries in a matter of weeks to months, even when it is supposed to be asleep.

Thanks once again for at least acknowledging that it's on the radar to be potentially implemented.

Thanks,
Yogesh

@yogeshk19 The current version of mbed OS uses RTX v1.
Tickless operations are explained here.

On NRF52, this would translate to something like this (untested):

```c++
// import the time duration between two ticks (in us).
extern const uint32_t os_clockrate;

void dummy_cb() { }

void os_idle_demon (void) {
    // use int rather than timestamp_t because units are not coherent
    // between Timer and Timeout ...
    const int max_us_sleep = (INT_MAX / os_clocrate) * os_clockrate;
    Timer stopwatch; // keep track of the time asleep
    Timeout alarm_clock; // will awake the uc if no interrupts does it before

    // never ends, the rtos will suspend this thread when there is something to do
    // either before os_suspend actually suspend the system (and is not in svc)
    // or immediately after os_resume
    while (true) {
        // suspend the system
        uint32_t ticks_to_sleep = os_suspend();
        uint32_t elapsed_ticks = 0;

        if (tick_to_sleep) {
            uint64_t us_to_sleep = ticks_to_sleep * os_clockrate;

            if (us_to_sleep > max_us_sleep) {
                us_to_sleep = max_us_sleep;
            }

            // start the stopwatch and setup the alarm_clock to wakeup the uc in us_to_sleep
            stopwatch.start();
            alarm_clock.attach_us(dummy_cb, us_to_sleep);

            // go to sleep, most of the work is done by the softdevice
            sleep();

            // after sleep, unknown wake up source, can be the stopwatch or another IRQ
            int us_asleep = stopwatch.read_us();

            // stopwatch and alarm_clock cleanup
            stopwatch.stop();
            stopwatch.reset();
            alarm_clock.detach();

            // translate us asleep into ticks
            elapsed_tick = us_asleep / os_clockrate;
        }

        // resume the system
        os_resume(elapsed_tick);
    }
}

int main() {
    Thread::attach_idle_hook(os_idle_demon);
    // your code ...
}
```

We've made an attempt in the past to implement it, but it had some issues (see here). The algorithm I've put in this comment is different from the one in the linked PR: it will not sleep until the timeout is reached no matter what; instead, if an IRQ happens then the system is resumed, because some new work might be available for the RTOS.
It might be worth trying.
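The bookkeeping in that loop can be isolated into two small pure functions, which makes the early wake-up case easy to reason about on a host machine. This is only a sketch: `sleep_budget_us` and `elapsed_ticks` are hypothetical names, `tick_us` plays the role of os_clockrate, and the multiplication is widened to 64 bits as a defensive choice that the snippet above does not make.

```c++
#include <cassert>
#include <cstdint>

// How long the alarm should be armed for: the kernel's tick budget converted
// to microseconds and clamped to max_us.
static uint64_t sleep_budget_us(uint32_t ticks_to_sleep, uint32_t tick_us,
                                uint32_t max_us) {
    uint64_t us = static_cast<uint64_t>(ticks_to_sleep) * tick_us;
    return us > max_us ? max_us : us;
}

// Whole ticks that passed while asleep. If an IRQ woke the core early, this
// is smaller than the budget; a partial tick is dropped.
static uint32_t elapsed_ticks(uint32_t us_asleep, uint32_t tick_us) {
    return us_asleep / tick_us;
}
```

For example, with a 1000 us tick and a 10-tick budget, an interrupt after 3.5 ms means 3 ticks are reported back to the kernel, which then decides whether there is new work or whether to suspend again.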

@pan- Vincent,
Thanks a lot for taking the time to put this down. I will definitely give it a shot and see if it helps. I got some compilation errors, which I have resolved, but I am not sure if I am using the right includes. There were also a few typos in the variable names, which I have fixed. Could you please confirm whether I am using the right header files?

```c++
#include <mbed.h>
#include "ble/BLE.h"
#include "nrf_soc.h"
#include "rt_TypeDef.h"
#include "RTX_Config.h"

//I took this definition from one of the NRF target files and I assuming since its 32 bits that can
//be the max values for an unsigned integer.
#define INT_MAX 0xFFFFFFFF

// import the time duration between two ticks (in us).
extern const uint32_t os_clockrate;

void dummy_cb() { }

void os_idle_demon (void) {
    // use int rather than timestamp_t because units are not coherent
    // between Timer and Timeout ...
    const int max_us_sleep = (INT_MAX / os_clockrate) * os_clockrate;
    Timer stopwatch;      // keep track of the time asleep
    Timeout alarm_clock;  // will awake the uc if no interrupts does it before

    // never ends, the rtos will suspend this thread when there is something to do
    // either before os_suspend actually suspend the system (and is not in svc)
    // or immediately after os_resume
    while (true) {
        // suspend the system
        uint32_t ticks_to_sleep = os_suspend();
        uint32_t elapsed_ticks = 0;

        if (ticks_to_sleep) {
            uint64_t us_to_sleep = ticks_to_sleep * os_clockrate;

            if (us_to_sleep > max_us_sleep) {
                us_to_sleep = max_us_sleep;
            }

            // start the stopwatch and setup the alarm_clock to wakeup the uc in us_to_sleep
            stopwatch.start();
            alarm_clock.attach_us(dummy_cb, us_to_sleep);

            // go to sleep, most of the work is done by the softdevice
            sleep();

            // after sleep, unknown wake up source, can be the stopwatch or another IRQ
            int us_asleep = stopwatch.read_us();

            // stopwatch and alarm_clock cleanup
            stopwatch.stop();
            stopwatch.reset();
            alarm_clock.detach();

            // translate us asleep into ticks
            elapsed_ticks = us_asleep / os_clockrate;
        }

        // resume the system
        os_resume(elapsed_ticks);
    }
}
```

Thanks,
Yogesh

Yeah, like @pan- mentioned (thanks for that), it's pretty straightforward to do per board. The tricky bit is to make it work for a range of them. I think we did the wrong thing the first time by trying to make it work for all boards, rather than adding device capabilities, activating it for the boards it actually works on and expanding the set of boards over time.

@yogeshk19 #include <mbed.h> should be enough to compile the os_idle_demon function. I've tested the code today: it passes all the mbed tests on various platforms (K64F, NRF52_DK and NUCLEO_F411RE) with all the compilers (IAR, ARM and GCC_ARM) and it breaks nothing.

Power consumption improved a lot, but I need to do more precise measurements to understand exactly how much it consumes with this code.

Did you have a chance to try the code and do current measurements on your side?

@pan- Thanks, it magically compiles now; it didn't last night :). However, I had to add #define INT_MAX 0xFFFFFFFF to my code base, otherwise I get a compilation error. I also get a warning for the comparison below, indicating that we are comparing a signed integer with an unsigned integer. Did you intend for max_us_sleep to be a signed integer?
if (us_to_sleep > max_us_sleep)

Also, I had a question regarding this piece of code. Since you have already tested it, I am guessing that dividing and then multiplying by os_clockrate is what you intended?
// use int rather than timestamp_t because units are not coherent
// between Timer and Timeout ...
const int max_us_sleep = (INT_MAX / os_clockrate) * os_clockrate;

I am about to run this code on my board and will let you know if we see an improvement in sleep power usage. Also, just to check with you: we have an external 32.768 kHz crystal connected to the nrf52832 BLE module on pins P0.0 (xtal1) and P0.1 (xtal2). Would that make any difference to our current consumption?

Thanks,
Yogesh

@yogeshk19 sorry, I forgot: INT_MAX is in the header <limits.h>, and it is equal to 0x7FFFFFFF, not 0xFFFFFFFF.

> Did you intend for the max_us_sleep to be a signed integer?

Yes, it was intended: Timer::read_us returns an int, so using a similar unit seems to be a good thing.
if (us_to_sleep > (uint32_t)max_us_sleep) seems more appropriate; max_us_sleep is a positive number.

> Also I had a question regarding this piece of code. I am guessing since you have already tested this code, it seems like multiplying and dividing by the os_clockrate is what you intended?
>
> // use int rather than timestamp_t because units are not coherent
> // between Timer and Timeout ...
> const int max_us_sleep = (INT_MAX / os_clockrate) * os_clockrate;

Yes, integer division then multiplication. That way, max_us_sleep is a multiple of os_clockrate.
It is just to be a bit more precise if the device has to sleep for a period of time exceeding INT_MAX microseconds.
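A quick worked example of that rounding, as a sketch that assumes a 32-bit int (so INT_MAX is 0x7FFFFFFF); `max_sleep_us` is a hypothetical wrapper around the expression and `tick_us` stands in for os_clockrate.

```c++
#include <cassert>
#include <climits>
#include <cstdint>

static_assert(INT_MAX == 0x7FFFFFFF, "sketch assumes a 32-bit int");

// Largest multiple of tick_us that still fits in an int.
// Integer division drops the remainder; multiplying back gives the multiple.
static int max_sleep_us(uint32_t tick_us) {
    return static_cast<int>((INT_MAX / tick_us) * tick_us);
}
```

With a 1000 us tick this gives 2147483000 us (2147483 whole ticks, about 35.8 minutes) instead of 2147483647 us, so a clamped sleep always ends on a tick boundary.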

Thanks @pan- Vincent. I have included the changes you made. Hopefully we will hear something good from the team testing it, and I will let you know shortly how the test goes.

Thanks,
Yogesh

@pan- Vincent,

Alright, I have great news: your code above rocks, and the power consumption dropped significantly, into the low 3-7 µA range. We will be doing more extensive testing in the coming days to get a sense of the actual numbers. The only weird thing is that when I use the build generated by the online compiler, the power consumption is 300 µA higher than when I build the exact same code offline with GCC_ARM using mbed-cli. I can't explain it, can you?

Thanks,
Yogesh

@yogeshk19 really happy that it works. I wasn't able to get precise current measurements below 100 µA.

> The only weird thing is that when I use the build generated by the online compiler, the power consumption is 300 µA higher than when I build the exact same code offline with GCC_ARM using mbed-cli. I can't explain it, can you?

My wild guess: the UART (more likely...) or another peripheral is enabled during initialization when armcc is used. It might be worth looking in a debugger at which peripherals are active. I will check that tomorrow.

@pan- Vincent,

Did you get a chance to see which peripherals are active? I use the online compiler for almost 95% of my development, and it's worrisome to see a 300 µA difference in current consumption compared to building offline. To be clear, the firmware built with the offline compiler draws less current on the NRF52 than the firmware built with the online compiler.

Please let me know if you get a chance to figure out why this is happening, so that we can decide whether to use the online compiler for our future development, or whether there is some way to ensure the two builds give identical results in terms of functionality. If you would prefer, I could post this in the mbed forum instead of using this thread to troubleshoot the issue.

Thanks,
Yogesh

I tried to apply the changes above to the latest BLE_Button example. Power consumption went down a lot, but the LED stopped blinking after about 8200 seconds of running. It stayed off for about 780 seconds, then started blinking again. I suspect it would have continued going off periodically every 8200 seconds and coming back again. It seems the EventQueue stops firing, and the whole thing is very similar to what I described in #3857.

Yogesh, have you seen anything similar with your application?

Here's the diff of my code against the original BLE_Button example. Please let me know if I'm doing something wrong.

-Tamás

```diff
diff --git a/BLE_Button/.mbed b/BLE_Button/.mbed
index e87b56d..0ba62b0 100644
--- a/BLE_Button/.mbed
+++ b/BLE_Button/.mbed
@@ -1 +1,3 @@
 ROOT=.
+TARGET=NRF52_DK
+TOOLCHAIN=GCC_ARM

diff --git a/BLE_Button/source/main.cpp b/BLE_Button/source/main.cpp
index a5ef985..21e80db 100644
--- a/BLE_Button/source/main.cpp
+++ b/BLE_Button/source/main.cpp
@@ -16,6 +16,7 @@
 #include <events/mbed_events.h>

 #include <mbed.h>
+#include <limits.h>
 #include "ble/BLE.h"
 #include "ble/Gap.h"
 #include "ButtonService.h"
@@ -95,8 +96,62 @@ void scheduleBleEventsProcessing(BLE::OnEventsToProcessCallbackContext* context)
     eventQueue.call(Callback<void()>(&ble, &BLE::processEvents));
 }

+// import the time duration between two ticks (in us).
+extern const uint32_t os_clockrate;
+
+void dummy_cb() { }
+
+void os_idle_demon(void)
+{
+    // use int rather than timestamp_t because units are not coherent
+    // between Timer and Timeout ...
+    const int max_us_sleep = (INT_MAX / os_clockrate) * os_clockrate;
+    Timer stopwatch;      // keep track of the time asleep
+    Timeout alarm_clock;  // will awake the uc if no interrupts does it before
+
+    // never ends, the rtos will suspend this thread when there is something to do
+    // either before os_suspend actually suspend the system (and is not in svc)
+    // or immediately after os_resume
+    while (true) {
+        // suspend the system
+        uint32_t ticks_to_sleep = os_suspend();
+        uint32_t elapsed_ticks = 0;
+
+        if (ticks_to_sleep) {
+            uint64_t us_to_sleep = ticks_to_sleep * os_clockrate;
+
+            if (us_to_sleep > (uint32_t) max_us_sleep) {
+                us_to_sleep = max_us_sleep;
+            }
+            // start the stopwatch and setup the alarm_clock to wakeup the uc in us_to_sleep
+            stopwatch.start();
+            alarm_clock.attach_us(dummy_cb, us_to_sleep);
+
+            // go to sleep, most of the work is done by the softdevice
+            sleep();
+
+            // after sleep, unknown wake up source, can be the stopwatch or another IRQ
+            int us_asleep = stopwatch.read_us();
+
+            // stopwatch and alarm_clock cleanup
+            stopwatch.stop();
+            stopwatch.reset();
+            alarm_clock.detach();
+
+            // translate us asleep into ticks
+            elapsed_ticks = us_asleep / os_clockrate;
+        }
+
+        // resume the system
+        os_resume(elapsed_ticks);
+    }
+}
+
 int main()
 {
+    Thread::attach_idle_hook(&os_idle_demon);
+
     eventQueue.call_every(500, blinkCallback);

     BLE &ble = BLE::Instance();
```

@yogeshk19 I can confirm that the serial port is initialized during the boot sequence when armcc is used.

@c1728p9 during the initialization of the C runtime of armcc, _sys_open is called with the fd 0 (stdout). This forces the default UART peripheral to be opened even if it is not used afterwards. The consequence is an increase in power consumption. Would it be possible to delay the initialization of the default serial port to the point where something is actually written or received?

> @c1728p9 during the initialization of the C runtime of armcc, _sys_open is called with the fd 0 (stdout). This forces the default UART peripheral to be opened even if it is not used afterwards. The consequence is an increase in power consumption. Would it be possible to delay the initialization of the default serial port to the point where something is actually written or received?

This is a known limitation that I discovered (when we introduced lazy init for the stdio object into mbed-drivers). The ARMCC startup calls _initio, which initializes stdio if used (I wonder what they mean by "used"). We should look at how to postpone this init until it's used. Any pointers appreciated!

@pan- Hi Vincent, has the tickless version of the code that you provided earlier been committed to any of the mbed-os branches, so that it's a more official mbed change? Also, regarding the UART issue in the online compiler, have any changes been made so that the UART is not enabled by default?

Thanks,
Yogesh

Did a small test recently using the Nordic Power Profiler Kit and an nRF52-DK following the above tips. mbed OS in its default configuration is pretty power hungry, good for IoT novices, but needing a lot of tweaking for production. I also haven't had a chance to try the tickless timer yet, but I really hope this makes it into a future release soon.

Results here: https://vilimpoc.org/blog/2017/04/24/power-profiling-on-mbed-nordic-nrf5-series/

Wanted to add a quick note. After trying the FreeRTOS example ble_app_hrs_freertos from the nRF5 v13 SDK, it's clear to me that without tickless timer support, no RTOS is going to be usable on battery power only for long periods of time in production. FreeRTOS also used ~6mA in idle mode. I really hope tickless scheduling makes it into a future release along with, perhaps, timer coalescing (assuming the ability to specify soft realtime tasks, though that is asking a lot 😄).

@yogeshk19 No, it has not been committed anywhere; it may require more refinement (and maybe changes to the HAL) to be fully portable to all platforms supported by mbed OS.

@pan- Hi Vincent,

I understand that this patch is for NRF51822, but my target is F091RC.

I took your patch and tried to compile it. I get the following error.

./main.cpp: In function 'void os_idle_demon()':
./main.cpp:137:41: error: 'os_clocrate' was not declared in this scope
     const int max_us_sleep = (INT_MAX / os_clocrate) * os_clockrate; 

I see that the variable os_clockrate is specific to NRF* targets. Can I quickly adapt this patch to Nucleo (F0) targets? That would be really nice until we have tickless mode supported.

Thank you.

@binary-nerd This variable was part of the RTX kernel V1 but isn't in V2. Instead you can use the define OS_TICK_FREQ.

Thanks @pan-

OS_TICK_FREQ was not defined. Instead I could find os_tickfreq, which is initialized to OS_CLOCK.

Hence extern const uint32_t os_tickfreq compiles well.

Thank you very much for the patch. :)

OS_TICK_FREQ is always defined, it lives in RTX_Config.h (see).

  • OS_CLOCK: Clock speed of the MCU (in Hz)
  • OS_TICK_FREQ: Tick frequency (in Hz)
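One thing to keep in mind when swapping the two: os_clockrate was described earlier in this thread as the duration between two ticks in microseconds, whereas OS_TICK_FREQ is a frequency in Hz. The two only have the same numeric value for a 1 kHz tick. A minimal conversion sketch (the helper names are hypothetical):

```c++
#include <cassert>
#include <cstdint>

// Tick period in microseconds for a tick frequency given in Hz.
static uint32_t tick_period_us(uint32_t tick_freq_hz) {
    return 1000000u / tick_freq_hz;
}

// Whole ticks contained in a duration in microseconds, computed directly
// from the tick frequency; the product is evaluated in 64 bits.
static uint64_t us_to_ticks(uint64_t us, uint32_t tick_freq_hz) {
    return (us * tick_freq_hz) / 1000000u;
}
```

At 1000 Hz the period is 1000 us, so the earlier code keeps working unchanged; at 100 Hz the period is 10000 us and the frequency cannot be used as a drop-in replacement for the period.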

@pan- As of mbed os 5.5.2 I am not able to compile the earlier tickless mode code that you had provided. I get the following two errors. Have these methods been removed or deprecated or moved to different header files?

Error: Identifier "os_suspend" is undefined in "main.cpp", Line: 34, Col: 36
Error: Identifier "os_resume" is undefined in "main.cpp", Line: 89, Col: 10

Thanks,
Yogesh

@yogeshk19 mbed-os 5.5 introduces a new version of the RTX kernel which is not backward compatible with the previous one. You can replace os_suspend and os_resume with osKernelSuspend and osKernelResume.

@pan- Thanks for getting back so quickly and that fixed the compilation errors. However I am getting another linker error. Has this been renamed too?

Error: Undefined symbol os_clockrate (referred from main.NRF52_DK.o).

Thanks,
Yogesh

@yogeshk19 You can use OS_TICK_FREQ instead of os_clockrate.

@pan- Vincent, thanks a lot for your quick responses. I did try using OS_TICK_FREQ, and I am not sure whether I should be including any specific header files for it, but it doesn't seem to compile with this change.

It's added as an extern variable, as shown in the code below, the same way you did it in the past.

extern const uint32_t OS_TICK_FREQ;

and I get the error: Error: Expected an identifier "S_TICK_FREQ" in "main.cpp", Line: 9, Col: 24. I am not sure why it is giving that error?

Thanks,
Yogesh

@pan- Hi Vincent, any thoughts on why the compilation error is occurring? We are blocked on some of the new OS changes.

Thanks,
Yogesh

> and I get the error: Error: Expected an identifier "S_TICK_FREQ" in "main.cpp", Line: 9, Col: 24. I am not sure why it is giving that error?

@yogeshk19 I would assume that you are missing an include of the file where it is declared. Most probably the header file (RTX_Config.h) is not included in the application via mbed.h (if I checked the correct define coming from RTX).

@yogeshk19 It is in RTX_Config.h and it is a macro, not a variable.
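To spell out why the extern declaration fails: the preprocessor replaces the macro name before the compiler sees it, so the declaration turns into something like `extern const uint32_t 1000;`, which has no identifier. A minimal sketch follows; the `#define` is only a stand-in for the one in RTX_Config.h, and the value 1000 is an assumption.

```c++
#include <cassert>
#include <cstdint>

// Stand-in for the definition in RTX_Config.h (value assumed for this sketch).
#ifndef OS_TICK_FREQ
#define OS_TICK_FREQ 1000
#endif

// This would not compile, because the macro name is replaced by its value:
// extern const uint32_t OS_TICK_FREQ;

// Use the macro as a value instead.
static const uint32_t tick_freq_hz = OS_TICK_FREQ;

static uint32_t tick_freq() {
    return tick_freq_hz;
}
```

In the idle loop, this means including the header that defines the macro and using OS_TICK_FREQ directly in expressions rather than declaring it.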

Thanks a lot Vincent & Martin for taking the time to respond. That fixed the error :).

Thanks,
Yogesh

@pan- Hey Vincent, I am running my application with the code you wrote earlier to handle the idle scenario, so that the device sleeps when the main thread is idle. One issue I am facing on mbed OS 5.4.4 is that I use the event queue to call a function at 30-second intervals, but the event queue timer calls all fire at incorrect times, sometimes a few seconds and sometimes several minutes apart. Just to refresh your memory, this is on the NRF52_DK platform.

On the newer OS 5.5, the same piece of code errors out with a stack underflow error, so I am not sure what is going on at this point.

Also, with the new OS 5.5, the current draw has gone up significantly in spite of the idle mode code being present. Any thoughts on how best to address these issues? Or let me know if you would rather I open this as a separate thread.

Thanks,
Yogesh

@yogeshk19 Please open these issues in a separate thread.

Does this cover the current tickless mode? If so, it can be closed.

Note that tickless support has now landed in Mbed OS 5, see the reference manual.
