We observe an Android master issuing an L2CAP Connection Request, granting us (as peripheral) ~65000 credits. That results in a call to l2cap_chan_tx_give_credits() that takes a long time to return and causes connection time out and 100% CPU load for whole duration (>4 seconds for us).
Credits are implemented using a counting semphore, but the BT Host code is currently only able to increment it by 1 -- leading to 65000 calls to k_sem_give().
Relevant snippet of l2cap_chan_tx_give_credits():
while (credits--) {
k_sem_give(&chan->tx.credits);
}
Similar issue exists for l2cap_chan_rx_give_credits().
Potential solutions:
give_init taking the initial resource-count as argument. To just be able to initialize the resource count in O(1) time, rather than O(n). Or can we use k_sem_init() for this?Giving many should be doable, I guess the only extra work is to waking up logic since that could mean more than one thread could be awaken in the process.
After a quick chat it seems it would be fine to use k_sem_init and only do l2cap_chan_tx_give_credits for restoring credits which should restore less credits in a row, still that would benefit from having an alternative to k_sem_give.
@andrewboie It might be possible to do something like:
https://github.com/zephyrproject-rtos/zephyr/blob/master/kernel/queue.c#L232
So we unpend the number of threads based on the counter, prepare the threads to run, assign the new counter and finally reschedule.
What's the specific proposal here?
cc @andyross
@andrewboie @andyross I think the proposal is to add a new API like k_sem_give_many(struct k_sem *sem, unsigned int count); (exact name can be debated) which would act the same as calling k_sem_give(sem); count number of times. Scheduling-wise this should be fairly similar to k_fifo_put_slist()
taking a look at this, this week
@andrewboie did you manage to look into this? Let me know, otherwise I could give it a shot.
My inclination at this time is to not change the semaphore APIs, I am not finding precedent for this in other OSes.
Go through Host code and replace the credit count by something adhoc (checking a counter, then using a binary semphore), rather than a counting semaphore.
I think you should do this instead.
My inclination at this time is to not change the semaphore APIs, I am not finding precedent for this in other OSes.
@andrewboie lack of precedent doesn't seem like a particularly strong argument to me (plus our k_fifo_put_slist API is kind of a precedent since it seems analogous to what's requested here). Do you see any technical reasons why we shouldn't be doing it? A semaphore is essentially a counter, and counting credits is what's needed here. Using a semaphore, but not its counting ability, thereby requiring a separate integer counter, feels a bit suboptimal to me.
@andrewboie lack of precedent doesn't seem like a particularly strong argument
It does to me.
We should not be inventing new APIs for traditional operating system APIs when the traditional semantics of OS semaphores have served the world well for many, many, years. It's bad enough that k_sem takes a nonstandard 'limit' parameter (although there are at least several OSes that implement semaphore limits)
NACK on changing k_sem, sorry.
FYI, I have started implementing option 3 as requested.
Most helpful comment
@andrewboie @andyross I think the proposal is to add a new API like
k_sem_give_many(struct k_sem *sem, unsigned int count);(exact name can be debated) which would act the same as callingk_sem_give(sem);countnumber of times. Scheduling-wise this should be fairly similar tok_fifo_put_slist()