Zephyr: Low UART utilization for hci_uart

Created on 2 Sep 2020 · 14 Comments · Source: zephyrproject-rtos/zephyr

Hi, Zephyr community,

Introduction
I am aiming to develop a high-throughput BLE application using the hci_uart example running on an NRF52810 chip (hosted on a custom board).

The problem
I am trying to pump as much data as possible through the UART channel, waiting for an ACK (Number of Completed Packets) before sending another packet from the host to the controller. The issue is that I get very few ACKs within one connection interval. I first fill the available TX buffers (12 of them) with data, and then wait for ACKs before sending any more.
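The flow control described above can be sketched as a simple credit counter on the host side. This is only a minimal illustration, assuming one credit per controller TX buffer; the struct and function names are hypothetical and do not come from any real HCI driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-side credit counter (illustration only). */
struct tx_credits {
    uint16_t total;     /* TX buffers reported by the controller (12 here) */
    uint16_t in_flight; /* packets sent but not yet acknowledged */
};

void credits_init(struct tx_credits *c, uint16_t total)
{
    c->total = total;
    c->in_flight = 0;
}

/* True if the host may write another data packet to the UART. */
bool credits_can_send(const struct tx_credits *c)
{
    return c->in_flight < c->total;
}

/* Call after one data packet has been written to the UART. */
void credits_on_sent(struct tx_credits *c)
{
    assert(credits_can_send(c));
    c->in_flight++;
}

/* Call when a Number of Completed Packets event reports `completed` packets. */
void credits_on_completed(struct tx_credits *c, uint16_t completed)
{
    if (completed > c->in_flight) {
        completed = c->in_flight; /* never release more than was sent */
    }
    c->in_flight -= completed;
}
```

With 12 credits, the host can send 12 packets back to back; after that, every further packet depends on how quickly the controller reports completed packets, which is why the ACK rate bounds the throughput.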

You can see in this image that once I start receiving ACKs, the connection interval utilization is fair, but after a while the number of ACKs per connection interval drops to 2-3. This number of ACKs is not related to the connection interval length: as you can see in this image, I'm using a longer connection interval, and the UART utilization drops in the same manner.

Is this something obvious that I'm misunderstanding? Hope you can help.

With regards,
James

Additional notes

  • The number of ACKs received is the same for all connection intervals
  • I'm using 2M PHY and DLE
  • NRF sniffer shows that the configurations are applied correctly
  • The controller has 12 TX buffers
  • All data packets are received on the other end correctly

Setup

  • NRF52810 running on a custom board (hci_uart Zephyr 2.3.99 on the controller & custom BT driver as the Host)
  • Receiving end is an Android 10 device
Labels: Bluetooth, bug, nRF, low

All 14 comments

This should be fixed by https://github.com/zephyrproject-rtos/zephyr/pull/27918, which will be merged shortly, and https://github.com/zephyrproject-rtos/zephyr/pull/27917, which is already merged. What SHA are you testing with?

Adding @cvinayak in case this is actually an issue in the Link Layer.

NRF52810 running on a custom board (hci_uart Zephyr 2.3.99 on the controller & custom BT driver as the Host)

Can you specify the exact revision you are using? Please try using master after #27918 is merged.

Also @jjamesson you can drag-and-drop images directly in the GitHub issue so that they are displayed inline, which would make it easier to understand your statements in context with the images.

Thanks for the reply, @carlescufi !

This should be fixed by #27918, which will be merged shortly, and #27917, which is already merged. What SHA are you testing with?

Good to know! I'm currently using SHA 13cf241ee61524ceeab0a70a8b623bef145e5c5e.

With regards,
James

Hi, @carlescufi,

I see that the commit is now in the master branch, so I tried running it, but the results are similar. Thus, my problem is not solved by this commit. See the picture after this post.

I am using SHA d3e8d6c3f36453fc455c943482afce9cf155ead8, and I did a complete reinstall of Zephyr according to the Getting Started Guide. This includes using the latest SDK version (v11.4).

Maybe there is something else wrong?

With regards,
James

[Image: osc3]

Hi,

@carlescufi , is anyone looking into this issue?

I have tested this with a completely different setup (NRF52 dev kit with BlueZ + NRF52 USB Dongle with BlueZ). The issue persists: when I increase the connection interval, the number of ACKs I get stays the same, and thus the throughput decreases.

Any insights would be appreciated.

With regards,
James

@jjamesson we are looking into this right now. Do you have a logic analyzer trace of the UART traffic?
Also, does the issue only manifest itself when the bulk of the traffic flows in the direction Host -> Controller, while with Controller -> Host the throughput is acceptable?

Hi, @carlescufi,

Unfortunately, the oscilloscope I'm using doesn't have a logic analyzer, nor do I have a separate one around.

As for the second question, I tried pumping data from the Android side, and I think the RX part (controller -> host) is fine. Even though either Android or the controller is chopping the packets up into 27 B pieces, the throughput only dropped from 454 kbps to 375 kbps when changing the connection interval from 6 (7.5ms) to 31 (46.5ms). Whereas when sending data (host -> controller), the throughput drop for the same connection intervals is 413 kbps -> 99 kbps, which is far more dramatic.
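As a rough sanity check of these figures, the amount of data moved per connection interval can be estimated from the throughput and the interval length. The helper below is a hypothetical sketch; it assumes "kbps" means 1000 bits per second and uses the interval lengths in milliseconds exactly as quoted in the post.

```c
#include <assert.h>

/* Approximate bytes transferred per connection interval.
 * kbit/s multiplied by ms gives bits; dividing by 8 gives bytes. */
double bytes_per_interval(double throughput_kbps, double interval_ms)
{
    assert(interval_ms > 0.0);
    return throughput_kbps * interval_ms / 8.0;
}

/* Using the numbers from the post:
 *   TX (host -> controller): 413 kbps at 7.5 ms   -> about 387 bytes
 *                             99 kbps at 46.5 ms  -> about 575 bytes
 *   RX (controller -> host): 454 kbps at 7.5 ms   -> about 426 bytes
 *                            375 kbps at 46.5 ms  -> about 2180 bytes
 */
```

On these numbers, the RX path moves roughly five times more data per interval when the interval gets about six times longer, while the TX path barely moves more at all. That is in line with the observation that the number of ACKs per connection interval stays roughly constant.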

Below is an example of the RX path, when using a connection interval of 31. Although I'm not using ACKs, the UART utilization is much better than in the TX path I showed in my previous answer.

Hope this provides some help,
James

[Image: osc4]

As for the second question, I tried pumping data from the Android side, and I think the RX part (controller -> host) is fine. Even though either Android or the controller is chopping the packets up into 27 B pieces, the throughput only dropped from 454 kbps to 375 kbps when changing the connection interval from 6 (7.5ms) to 31 (46.5ms). Whereas when sending data (host -> controller), the throughput drop for the same connection intervals is 413 kbps -> 99 kbps, which is far more dramatic.

OK, in this case this looks like the same issue we are chasing. We are currently investigating, and will update you here. Also, I will turn this issue into a bug since this is a clear problem in the hci_uart build.

@cvinayak @nordic-krch FYI, see thread above.

Thanks, looking forward to any updates.

NB. Has anyone investigated high throughput applications with hci_spi? I'm wondering if this might have similar problems since I was planning to use this in the future for getting even higher throughputs.

With regards,
James

I have identified a design flaw around the implementation here: https://github.com/zephyrproject-rtos/zephyr/blob/3e7db661d15fb5ce264f8ce63a948e0fe1152631/subsys/bluetooth/controller/ll_sw/ull.c#L1544

The de-muxing of the tx packets in ULL and subsequent enqueue towards LLL is not run for the new tx packets arriving while within a connection event.

The design had tried to optimize CPU use by not calling LLL enqueue if there isn't any num complete being given out. But this design is flawed: a new num complete does not have to be given out for new tx packets to arrive, because any previously given num complete could already have prompted the host to send in new packets to tx.
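The effect of that optimization can be illustrated with a toy model. This is not the code in ull.c; the struct, the function, and the `only_on_ack` switch are all hypothetical and only mimic the behaviour described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the two TX stages (hypothetical names). */
struct tx_model {
    int ull_pending; /* packets received from the host, not yet handed on */
    int lll_queued;  /* packets enqueued towards LLL, ready to go on air */
};

/* One processing pass during a connection event.
 * `acked` is the number of packets acknowledged (num complete) in this pass.
 * With `only_on_ack` set, pending packets are forwarded only when acked > 0,
 * mimicking the CPU-saving shortcut. Returns the number forwarded. */
int tx_process(struct tx_model *m, int acked, bool only_on_ack)
{
    int forwarded = 0;

    assert(acked >= 0);
    if (!only_on_ack || acked > 0) {
        forwarded = m->ull_pending;
        m->lll_queued += forwarded;
        m->ull_pending = 0;
    }
    return forwarded;
}
```

If two packets arrive from the host during a pass in which nothing is acknowledged, the shortcut variant (`only_on_ack = true`) leaves them stuck in `ull_pending`, while the unconditional variant forwards them right away.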

I will send a PR in the coming days to address this.

Thanks, looking forward to any updates.

NB. Has anyone investigated high throughput applications with hci_spi? I'm wondering if this might have similar problems since I was planning to use this in the future for getting even higher throughputs.

I very much doubt it. The hci_spi application is not widely used. But I don't foresee any major trouble unless there's been a regression lately.
