Zephyr: nrf: clock control: clock control on/off routines are refcounted

Created on 19 Nov 2019  路  13Comments  路  Source: zephyrproject-rtos/zephyr

Describe the bug

The clock_control_on() and clock_control_off() API contracts say that they respectively enable and disable the clock. There is no mention of refcounting users of the clock.

To Reproduce

  1. call clock_control_on() twice in a row for the same clock on an nRF device
  2. call clock_control_off() once on the same device
  3. observe clock remains enabled because an associated refcount is nonzero

Expected behavior

Per the clock control API contract, the clock should be disabled after step 2. All other clock control drivers I found appear to adhere to this contract.

Impact

Unknown

Additional context

This behavior was introduced in 6700f2f194, around the time the asynchronous API was added.

I'm sure it is being done for perfectly valid reasons for the nRF platform, but it's not correct API behavior.

I'm guessing the purpose of this is to keep the clock enabled as long as anybody is using it, since Zephyr doesn't really have proper device power / clock dependency management done by the operating system. But it's still a bug.

@carlescufi could you please help triage this and assign a priority?

Stale Clock Control bug low

Most helpful comment

When i searched repo i didn't found many uses of clock_control_off except for nordic. I've actually found one (gpio_dw.c) and i think that it is actually missing reference counter there because ports are using same clock driver and if any port disables the clock it will shut it down for all.

clock_control_on and clock_control_off are already poorly documented (e.g. no error codes). Renaming them to release/request (and async_request) would be the cleanest option. It would cover single user (no ref counting) and multi user scenario.

what do you think?
@MaureenHelm @erwango @fvincenzo @anangl

All 13 comments

cc @pizi-nordic @nordic-krch @pabigot -- I couldn't find an existing issue that covers this, but you are all excellent oracles for such things in case I missed one :).

I'm unaware of a global issue for this, though the undesirable behavior identified in #20775 seems to be similar (though using a different technical approach).

@mbolivar of course there is a bug but it's rather in the documentation. Even before https://github.com/zephyrproject-rtos/zephyr/commit/6700f2f1940c645c5cd1eb23baad8bd0c8041b81 nordic driver had the same reference counter.

If clock_control_off would shut down clock without ref counter that would make the clock_control useless for multiple request contexts. On nordic soc hf clock is continuously enabled and disabled by bluetooth, usb, nfc, lf clock calibration.

When async api was introduced i did not modify any existing api's (https://github.com/zephyrproject-rtos/zephyr/pull/16675/commits/d4aaffafd9edb2a32dad727c5e23fb313f45ca75) so it stayed like it was, without mentioning ref counter.

Even before 6700f2f nordic driver had the same reference counter.

Ah, you're right about that part, I missed that in the diff.

@mbolivar of course there is a bug but it's rather in the documentation.

No, it's in our driver. The driver API just says it enables the clock or disables it. Nothing is said about counting users. And as I wrote in the issue description, the nordic driver is the only one that ref counts. Every other driver just turns the clock on and off.

If clock_control_off would shut down clock without ref counter that would make the clock_control useless for multiple request contexts. On nordic soc hf clock is continuously enabled and disabled by bluetooth, usb, nfc, lf clock calibration.

Right, and as I wrote in the issue description:

I'm sure it is being done for perfectly valid reasons for the nRF platform, but it's not correct API behavior.

This was the only way for us to use this API consistently across different modules. Without the reference counting you'd need a higher-layer that tracks the use of the clock, and that layer simply doesn't exist. While I agree that this is essentially breaking the API contract, I don't think we should remove the behaviour. Instead, I agree with @nordic-krch, we should document that some implementations add reference counting to the driver, because the alternative is simply not being able to use this API at all, since counting users is fundamental when you are dealing with a global shared resource like this.

People who see foo_enable()/foo_disable() (#20775) or foo_on()/foo_off() (clock) will expect those functions to enable/disable/turn on/turn off. Given functions with those names reference-counting behavior is just wrong.

If you want reference counting it should be something like foo_request()/foo_release().

Since it's already in the API presumably the only option is to document it and expect people to be confused.

When i searched repo i didn't found many uses of clock_control_off except for nordic. I've actually found one (gpio_dw.c) and i think that it is actually missing reference counter there because ports are using same clock driver and if any port disables the clock it will shut it down for all.

clock_control_on and clock_control_off are already poorly documented (e.g. no error codes). Renaming them to release/request (and async_request) would be the cleanest option. It would cover single user (no ref counting) and multi user scenario.

what do you think?
@MaureenHelm @erwango @fvincenzo @anangl

@nordic-krch I think this is fair. We could deprecate the old names and add the new ones and that should be enough.

Renaming them to release/request (and async_request) would be the cleanest option. It would cover single user (no ref counting) and multi user scenario.

Renaming them is breaking API. Why not add new names instead?

I would expect "renaming" to mean create the new ones, and implement the old ones in terms of the new ones, as suggested by @carlescufi.

FWIW I'm drafting a generic implementation for reference-counted request/release management with synchronous and asynchronous support that I hope can simply be used by whatever modules require it, including the device power management infrastructure. Early days, though; we'll have to see how that goes.

FWIW I'm drafting a generic implementation for reference-counted require/release management with synchronous and asynchronous support that I hope can simply be used by whatever modules require it, including the device power management infrastructure. Early days, though; we'll have to see how that goes.

That would be very nice indeed!

FWIW, I'm partial to _get and _put for such things. Nice and terse.

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.

Without looking too closely I believe the ref-counted variants required by Nordic are now using a different API, and the conclusion in that context was that nobody else needed ref-counting for clock management. This probably isn't true, but the true needs are likely to emerge in the context of power management and clock domains. Closing this one.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

rosterloh picture rosterloh  路  4Comments

pdunaj picture pdunaj  路  3Comments

ghost picture ghost  路  3Comments

skylayer picture skylayer  路  4Comments

ghost picture ghost  路  4Comments