Esp-idf: Bluetooth Classic data corruption (IDFGH-418)

Created on 19 Oct 2018 · 3 Comments · Source: espressif/esp-idf

Environment

  • Development Kit: [none]
  • Core (if using chip or module): [ESP32]
  • IDF version f11ac037c4958fc68e67a50134a0ce9a1634008c
  • Development Env: [Make]
  • Operating System: [Windows]
  • Power Supply: [USB]

Problem Description

When running Bluetooth Classic SPP we get corrupt packets. The correct amount of data is sent and received, but some packets contain a few bytes whose values have changed.

In our application we see at least 1-2 packets with wrong data for every 1-2 GB of duplex data. I did some investigation and can very easily reproduce the problem by placing an attenuator between the two ESP32s. With the attenuation set to a level that reduces throughput by ~50%, I see the packet error after just 1-5 minutes of data transmission.

The issue is also reproducible with the bt_spp_initiator and bt_spp_acceptor examples. We just added a checksum at the beginning of each packet, so we can easily see whether a packet was received with an error.
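
As a rough illustration of the checksum approach described above, here is a minimal host-side sketch in Python. The packet layout (a 2-byte big-endian CRC followed by the payload), the choice of CRC and the function names `build_packet` and `packet_is_valid` are assumptions made for this example; they are not taken from the attached SPP examples.

```python
import binascii
import struct


def build_packet(payload: bytes) -> bytes:
    """Prefix the payload with a 16-bit CRC (assumed layout: CRC first, big-endian)."""
    crc = binascii.crc_hqx(payload, 0xFFFF)  # CRC with the CCITT polynomial 0x1021
    return struct.pack(">H", crc) + payload


def packet_is_valid(packet: bytes) -> bool:
    """Recompute the CRC over the payload and compare it with the stored prefix."""
    if len(packet) < 2:
        return False
    (stored_crc,) = struct.unpack(">H", packet[:2])
    return stored_crc == binascii.crc_hqx(packet[2:], 0xFFFF)


# Sender side: frame an incrementing test payload.
packet = build_packet(bytes(range(64)))

# Receiver side: simulate one flipped bit somewhere in the payload.
corrupted = bytearray(packet)
corrupted[10] ^= 0x40
```

Calling `packet_is_valid(packet)` should return True for the untouched packet and False for `bytes(corrupted)`, which is the kind of check the receiving side can run on every packet.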

I have tested with Enhanced Retransmission Mode (ERTM) using the Bluekitchen stack and still see the problem. ERTM should catch an over-the-air error that the 16-bit Bluetooth Classic CRC missed. This means (assuming ERTM works in the Bluekitchen stack) that the error is somewhere in Espressif code: either the packet is corrupted before being sent (after leaving the stack for the BT controller), or it is corrupted after it is received and before it is passed to the stack.

As I have reproduced the problem with two different stacks, it is not a stack issue. The data is already corrupt when the stack on the receiving side gets it.

My guess from my findings is that there is some issue on the receiving side when packets are retransmitted.

Additionally, I have placed a sniffer (see image at the end) between the master and the attenuator, so I can see exactly what data the master sends. When the slave receives a packet with corrupt data, I can see all packets sent by the master (including all retransmissions), and none of them is invalid. This also suggests that the issue is on the receiving side (assuming ERTM works).

Another issue I see is that the ESP32 does not change packet type when the link is bad. For example, on a very bad link the ESP32 still uses 3-DH3 packets, but I think it should switch down to DM packets when the link is really bad.

Expected Behavior

No corrupted data received.

Actual Behavior

Corrupted data received.

Steps to reproduce

The problem is reproducible without introducing attenuation, but it is much easier with it. Also, the RF environment in our office is pretty harsh, so you may not be able to reproduce it without attenuation (which causes retransmissions). Moving the two units far enough apart to reduce throughput by ~50% might also work; I have not tested this.

  1. Connect two ESP32s with attenuation between them that reduces the maximum throughput by 50% or more.
  2. Run the modified bt_spp_initiator and bt_spp_acceptor, which also check for packet errors.
  3. After a few minutes you should see that corrupt data is received.

The setup looks like this, with two ESP32s with external antennas connected to an attenuator:
[Image: capture]

Question to Espressif

Does Bluedroid support Enhanced Retransmission Mode? If so, I could verify with another stack that this is not an over-the-air issue. Searching the Bluedroid code I find an ERTM implementation; how do I enable it?

I'm doing some more investigation, so this issue may be updated.

Logs

Modified SPP examples including a CRC check and a printout of invalid packets:
Please ignore the first packet error received straight after start.
spp_error_examples.zip

Attenuated until throughput was ~900 kbps.
The error was received after about 2 minutes. I have highlighted the incorrect bytes (d2 97) below in bold.

Example error packet from bt_spp_acceptor

db dc dd de df e0 e1 e2 e3 e4 e5 e6 e7 e8 e9 ea eb ec ed ee ef f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 fa fb fc fd fe 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33 34 35 36 37 38 39 3a 3b 3c 3d 3e 3f 40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f 50 51 52 53 54 55 56 57 58 59 5a 5b 5c 5d 5e 5f 60 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 75 76 77 78 79 7a 7b 7c 7d 7e 7f 80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f 90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab ac ad ae af b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd be bf c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce cf d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db dc dd de df e0 e1 e2 e3 e4 e5 e6 e7 e8 e9 ea eb ec ed ee ef f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 fa fb fc fd fe 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33 34 35 36 37 38 39 3a 3b 3c 3d 3e 3f 40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f 50 51 52 53 54 55 56 57 58 59 5a 5b 5c 5d 5e 5f 60 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 75 76 77 78 79 7a 7b 7c 7d 7e 7f 80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f 90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab ac ad ae af b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd be bf c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce cf d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db dc dd de df e0 e1 e2 e3 e4 e5 e6 e7 e8 e9 ea eb ec ed ee ef f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 fa fb fc fd fe 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33 34 35 36 37 38 39 3a 3b 3c 3d 3e 3f 40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f 50 51 52 53 54 55 56 57 58 59 5a 5b 5c 5d 5e 5f 60 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 75 76 77 
78 79 7a 7b 7c 7d 7e 7f 80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f 90 **__91 d2 97 94__** 95 96 97 98 99 9a 9b 9c 9d 9e 9f a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab ac ad ae af b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd be bf c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce cf d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db dc dd de df e0 e1 e2 e3 e4 e5 e6 e7 e8 e9 ea eb ec ed ee ef f0 f1 f2 f3 f4 f5 f6 f7 f8 f9 fa fb fc fd fe 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 b3 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33 34 35 36 37 38 39 3a 3b 3c 3d 3e 3f 40 41 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f 50 51 52 53 54 55 56 57 58 59 5a 5b 5c 5d 5e 5f 60 61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 72 73 74 75 76 77 78 79 7a 7b 7c 7d 7e 7f 80 81 82 83 84 85 86 87 88 89 8a 8b 8c 8d 8e 8f 90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab ac ad ae af b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba 54 79
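
To locate damaged bytes in a dump like the one above without reading it by eye, a small host-side Python sketch can compare each byte with the expected counter value. It assumes the test payload simply counts upward and wraps after fe back to 00, which is what the dump appears to show; the function name `find_deviations` and the two short excerpts are chosen for this example only.

```python
def find_deviations(data: bytes, modulus: int = 255) -> list:
    """Return (offset, expected, observed) for bytes that break the counting pattern.

    Assumption: the payload counts 00..fe and then wraps to 00 (modulus 255),
    as the dump above appears to do. The first byte seeds the counter.
    """
    deviations = []
    expected = data[0]
    for offset, observed in enumerate(data):
        if observed != expected:
            deviations.append((offset, expected, observed))
        # Advance from the expected value so one bad byte does not cascade.
        expected = (expected + 1) % modulus
    return deviations


# Two short excerpts copied from the second line of the dump above.
excerpt_a = bytes.fromhex("8f 90 91 d2 97 94 95 96")
excerpt_b = bytes.fromhex("20 21 22 b3 24 25")

for name, excerpt in (("a", excerpt_a), ("b", excerpt_b)):
    for offset, expected, observed in find_deviations(excerpt):
        print(f"excerpt {name}: offset {offset}: expected {expected:02x}, got {observed:02x}")
```

Run on these excerpts, the check flags d2 and 97 (where 92 and 93 would be expected) and also b3 in the position where 23 would be expected further along the same line.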

Edit 1: I have added modified SPP example code and a log of an invalid packet above.

All 3 comments

The Bluetooth Link Layer uses CRC-CCITT (Core v5, Vol 2, Part B, 7.1.2), while the L2CAP ERTM FCS uses CRC-16.

In your example, you seem to have two bit errors: 92 -> d2 (sets bit 6) and 93 -> 97 (sets bit 2).
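
The bit positions can be checked by XOR-ing the expected and observed byte values. The snippet below is a minimal Python sketch; the helper name `flipped_bits` is made up for this example.

```python
def flipped_bits(expected: int, observed: int) -> list:
    """Return the bit positions (0 = least significant) that differ between two bytes."""
    diff = expected ^ observed
    return [bit for bit in range(8) if diff & (1 << bit)]


# The two byte pairs discussed above.
first = flipped_bits(0x92, 0xD2)
second = flipped_bits(0x93, 0x97)
```

Here `first` evaluates to `[6]` and `second` to `[2]`, matching the bit numbers given above.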

Browsing around, http://www0.cs.ucl.ac.uk/staff/electran/2011/pdf/2011-07.pdf states that both CRC-16 and CRC-CCITT detect all single-bit and two-bit errors. So both should have rejected this packet.
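
The two-bit case can be tried out on the host with a short Python sketch. It assumes CRC-CCITT means the generator polynomial x^16 + x^12 + x^5 + 1 (as computed by `binascii.crc_hqx`) and CRC-16 means x^16 + x^15 + x^2 + 1 (the variant often called CRC-16/ARC). The initial values and bit ordering used here are illustrative and are not meant to match what is actually sent over the air; the helper names and the sample payload are also made up for this example.

```python
import binascii


def crc_ccitt(data: bytes) -> int:
    """CRC with polynomial 0x1021; the initial value 0xFFFF is an illustrative choice."""
    return binascii.crc_hqx(data, 0xFFFF)


def crc16(data: bytes) -> int:
    """Bitwise CRC with polynomial x^16 + x^15 + x^2 + 1 (reflected form 0xA001), init 0."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


def with_flips(data: bytes, flips) -> bytes:
    """Return a copy of data with the given (offset, bit) positions inverted."""
    out = bytearray(data)
    for offset, bit in flips:
        out[offset] ^= 1 << bit
    return bytes(out)


# Counting payload that contains 0x92 and 0x93, like the dump in the issue.
payload = bytes(range(0x80, 0xA0))
pos = payload.index(0x92)

# Apply the same two flips: 92 -> d2 (bit 6) and 93 -> 97 (bit 2).
damaged = with_flips(payload, [(pos, 6), (pos + 1, 2)])

ccitt_detects = crc_ccitt(payload) != crc_ccitt(damaged)
crc16_detects = crc16(payload) != crc16(damaged)
```

Both `ccitt_detects` and `crc16_detects` come out True for this payload, which is consistent with the claim that either CRC on its own should reject a packet damaged only in these two bits.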

Congrats on finding a quick way to reproduce this.

Hi @jakkra,
We do not support Enhanced Retransmission Mode at the moment.
The error may be caused by a bit error that CRC-CCITT and CRC-16 cannot detect.

Minor update on ERTM here. ERTM can optionally use a frame check sequence (FCS). With ERTM FCS enabled in BTstack, the combination of the Link Layer CRC and the L2CAP CRC-16 seems to be sufficient to detect all bit errors (at least, @jakkra wasn't able to get another false packet).