Cwa-documentation: [SECURITY] Tampering with Bluetooth metadata

Created on 12 Jun 2020 · 30 Comments · Source: corona-warn-app/cwa-documentation

In a recent report to the Swiss CSIRT, EPFL professor @vaudenay and Martin Vuagnoux point out that the Bluetooth metadata containing the transmission power is merely encrypted with AES-CTR and not authenticated (cf. Section 3.5 here). This makes tampering with parts of the metadata possible and greatly increases the chances of success of some forms of replay attack, particularly in conjunction with the 2-hour validity of RPIs (see Section 3.6).

This makes the entire system vulnerable to false positive attacks, with the attack surface growing much faster than the utility of the system.

I have been told by some in the periphery of the German community working on Corona Warn that this is a flaw well known here, but have not been able to gain confirmation. Has it been documented anywhere?

security

pdehaye

👍2

All 30 comments

Thanks for the report. Our security experts will have a look at that.

Mit freundlichen Grüßen/Best regards,
SW
Corona Warn-App Open Source Team

SebastianWolf-SAP on 12 Jun 2020

Good morning @pdehaye,

thanks for reaching out.
As described in our mobile architecture documents and especially the architecture diagrams (Android/iOS) we do not manage the Bluetooth transmission and/or reception and therefore do have less influence on both scenarios you mentioned above. The Bluetooth management and related activities are handled by the mobile operating systems and their corresponding Exposure Notification Framework.

Furthermore, an attack precondition to create false positives is to be able to tamper upload authorizations to spread a false notification status at the end.

Nonetheless we highly appreciate your remarks and will further discuss them with the mobile operating system providers.

Happy Friday
@haxxbard

haxxbard on 12 Jun 2020

👍2

Furthermore, an attack precondition to create false positives is to be able to tamper upload authorizations to spread a false notification status at the end.

This is wrong. An attack precondition is to collect Rolling Proximity Identifiers whose Temporary Exposure Keys will later be (correctly) revealed. One scenario is thus harvesting of Bluetooth beacons near a hospital at 7.30am, tampering with them, and then replaying them in the middle of the financial district. Since it is deployable on consumer phones, smartphones (even of non-participants) or IoT devices are part of the attack surface through hacking.

therefore do have less influence on both scenarios you mentioned above
Have you considered reaching out to other European projects and discussing how the API could be changed?

Please re-open the issue, or at least point me to where this vulnerability could be reported while CoronaWarn undergoes a security review later.

pdehaye on 12 Jun 2020

Regarding 3.5 "Unauthenticated Metadata" - rather than flipping bits in unknown values (which has a decent chance that you are actually increasing a value that you wanted to decrease), the attacker would be better off simply replaying the message with a much higher output power. I don't think this aspect is a real weakness in the protocol.

mh- on 12 Jun 2020

@mh- I am aware of this, but it really depends on which type of attacker you are considering. A hacker leveraging someone else's device might not have that opportunity, for instance. An attacker leveraging someone else's SDK might not have that opportunity either. This affects the scale, likelihood, cost, motive, etc. of the attacker, which in turn affect questions around the legal status of this data (as personal data or not), depending on the country.

Additionally you would detect this differently in the two attack modes: one on signal strength abnormally high, one on multiple payloads that are very similar.

Finally, depending on undocumented parts of the protocol, I suspect in dynamic environments you would be able to detect which payload was emitted at higher strength, due to timing and physical characteristics of the signal.

pdehaye on 12 Jun 2020

Reviewing the whole thread, please make sure to include this tampering attack also in your documentation, as it has downstream security and legal effects, especially if you care about interoperability. The trade-offs will not be the same everywhere you are hoping to interoperate with.

pdehaye on 12 Jun 2020

@pdehaye Again, the observation that one can flip bits in the plaintext with CTR mode is correct, but it helps only if you know the original value, or you can make reasonable assumptions about it. This significantly reduces the relevance of this attack vector.
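To make the bit-flipping observation concrete, here is a minimal Python sketch. It is not the GAEN implementation: random bytes stand in for the AES-CTR keystream, the 4-byte layout (version byte, TX power byte, two reserved bytes) is assumed for illustration, and the helper names `xor_bytes` and `as_int8` are made up for this example.

```python
import os


def xor_bytes(data: bytes, other: bytes) -> bytes:
    """XOR two equal-length byte strings (the core operation of CTR mode)."""
    return bytes(a ^ b for a, b in zip(data, other))


def as_int8(byte: int) -> int:
    """Interpret a byte (0..255) as a signed 8-bit integer."""
    return byte - 256 if byte >= 128 else byte


# Stand-in for the AES-CTR keystream (assumption): the attacker never sees it.
keystream = os.urandom(4)

# Hypothetical metadata: version byte, TX power (-14 dBm), two reserved bytes.
plaintext = bytes([0x40, (-14) & 0xFF, 0x00, 0x00])
ciphertext = xor_bytes(plaintext, keystream)

# The attacker flips bit 0x04 of the TX power byte in the ciphertext only.
mask = bytes([0x00, 0x04, 0x00, 0x00])
tampered = xor_bytes(ciphertext, mask)

# The receiver decrypts normally; exactly the masked bit has changed.
decrypted = xor_bytes(tampered, keystream)
original_tx = as_int8(plaintext[1])
tampered_tx = as_int8(decrypted[1])
print(original_tx, tampered_tx)
```

The flip always lands on the intended bit, but whether it raises or lowers the value depends on the plaintext bit, which the attacker cannot see. Here the value goes up by 4 only because that bit happened to be 0 in -14.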

Also, your statements are very broad, e.g. "downstream security and legal effects" - if you wish to engage in a discussion, it might help to be more specific.

mh- on 12 Jun 2020

As already stated by @haxxbard, we do not manage the Bluetooth transmission and/or reception. Please address these concerns directly to Apple and Google, they have dedicated processes especially with respect to security.

In case you find out a security vulnerability in one of the CWA components, please see the respective SECURITY.md file in the respective repositories, e.g. https://github.com/corona-warn-app/cwa-app-ios/blob/development/SECURITY.md for the iOS client.

Mit freundlichen Grüßen/Best regards,
SW
Corona Warn-App Open Source Team

SebastianWolf-SAP on 12 Jun 2020

@mh- The downstream security and legal effects refer to consequences for the legality analysis of the system deployed. The app developers can choose to ignore problems "lower down the stack", at the protocol or Bluetooth layer, but at some point someone in Germany will have to deploy it and evaluate the legality of the app (who?), which will have to include considerations of the integrity of the data, usefulness of the signals, classification as a medical device, consideration as personal data, etc. This is highly dependent on the country, so I can't be more specific for Germany, but in Switzerland these considerations are biting back into the app developers as well, in terms of UI for instance: "Exchange Bluetooth beacons anonymously" will probably not fly anymore. I can offer more specifics if you are interested, but some will be dependent on the fact that despite having similar data protection laws, we don't have the same jurisprudence built on top. Some are also still a matter of political debate, for instance around the epidemiological interventions following a SwissCovid notification of risk: if the Swiss Confederation wants to deploy an app that it knows is at risk of false positives, and it wants such a notification to give a right to a free COVID test, then maybe the cantons shouldn't be paying for these tests.

@SebastianWolf-SAP The core problem as I see it is precisely to think in terms of a layered stack: it is not so. Everyone wishes we could build on top of a nicely defined protocol, but of course the interactions are very open since there can be all kinds of eavesdroppers etc. If you forget one such angle of attack (here direct tampering with beacon signals), you might need to revisit more than you think. If you know of one such angle of attack and don't document it, you might be assuming more liability than you anticipate (especially if you see it as a problem with the components you are using, but decide it is not your problem to get those components fixed).

My main point here is that while the signal strength vs message tampering question is irrelevant to mitigation measures at the software layer (*), it will have consequences further when the app is analyzed. Basing this analysis on partial documentation of the threats might have consequences later in validating the app.

(*): unless we go into a surveillance countermeasures route, or maybe the protocol takes advantage of the three bytes that remain available in the metadata to include some sanity checks

pdehaye on 13 Jun 2020

👀1 👍1

@mh- I have been privately told of techniques that increase the chances of an attacker at flipping values for the AEM. They convinced me, but I will not disclose them at the moment because this is still the topic of ongoing research.

pdehaye on 19 Jun 2020

@SebastianWolf-SAP to illustrate the fact that this metadata tampering is not documented well enough, see the responses to this thread:
https://github.com/corona-warn-app/cwa-documentation/issues/306#issuecomment-646373297

pdehaye on 19 Jun 2020

In joint work with Joel Reardon (University of Calgary) (see also #308), we found a new attack that we consider linked to this one: AEM tampering dematerializes the attack, which gives the attacker scale (an attacker would no longer need hardware and could attack through software alone).

This new attack is now #308 (combined with the process issue this very thread has highlighted).

pdehaye on 19 Jun 2020

Please re-open this issue, until this attack is documented.

pdehaye on 19 Jun 2020

👍2

@pdehaye mentioned "the core problem" being "to think in terms of a layered stack".

I had similar thoughts when I read about how a question regarding the impact of a fundamental Bluetooth vulnerability was dealt with a while ago, issues #99 #100 #101 and #160.

corneliusroemer on 19 Jun 2020

@pdehaye because you insist:

I tend to think that your "catastrophic failure" is academic only:

a) Flipping bits in (unknown) plain text

You repeatedly mentioned that AEM is not authenticated, and therefore I could - within a re-play attack - flip bits of the "TX Power" byte, which is encoded as a signed 8 bit integer. I would want to decrease the plaintext value, because I don't have a fleet of high-powered transmitters, and if I succeed, the receiver would later think that the attenuation was lower than in reality --> the estimated distance was shorter than in reality.
Now I have these options:

flip bit 10000000 - this would change the sign of the value; since the original value is likely negative, now I have a much larger value - not desired.
flip bit 01000000 - this would either decrease the value by 64, or increase the value by 64, but I don't know what it will do, since I don't know the original value
flip bit 00100000 - this would either decrease the value by 32, or increase the value by 32, but I don't know what it will do, since I don't know the original value
flip bit 00010000 - this would either decrease the value by 16, or increase the value by 16, but I don't know what it will do, since I don't know the original value
flip bit 00001000 - this would either decrease the value by 8, or increase the value by 8, but I don't know what it will do, since I don't know the original value
flip bit 00000100 - this would either decrease the value by 4, or increase the value by 4, but I don't know what it will do, since I don't know the original value
flip bit 00000010 - this would either decrease the value by 2, or increase the value by 2, but I don't know what it will do, since I don't know the original value
flip bit 00000001 - this would either decrease the value by 1, or increase the value by 1, but I don't know what it will do, since I don't know the original value

So which bits should I flip? I'm somewhat afraid that if I decrease the "TX power" value below the received RSSI, the receiver might actually detect my manipulation and simply discard my modified packet.
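The list above can be checked mechanically. The following Python sketch applies each single-bit mask to two example TX power values and records the change; the values -14 and -36 are illustrative assumptions, and `to_int8`, `flip` and `single_bit_deltas` are hypothetical helper names.

```python
def to_int8(byte: int) -> int:
    """Interpret a byte (0..255) as a signed 8-bit integer."""
    return byte - 256 if byte >= 128 else byte


def flip(tx_power: int, mask: int) -> int:
    """Return the signed value after XOR-ing its byte encoding with mask."""
    return to_int8((tx_power & 0xFF) ^ mask)


def single_bit_deltas(tx_power: int) -> dict:
    """Map each single-bit mask to the change it causes in tx_power."""
    return {
        1 << bit: flip(tx_power, 1 << bit) - tx_power
        for bit in range(7, -1, -1)
    }


# Two hypothetical original values that the attacker cannot distinguish.
deltas_a = single_bit_deltas(-14)  # encoded as 0xF2
deltas_b = single_bit_deltas(-36)  # encoded as 0xDC

for mask in deltas_a:
    print(f"mask {mask:08b}: {deltas_a[mask]:+4d} on -14, {deltas_b[mask]:+4d} on -36")
```

The same mask can push the two values in opposite directions: 00100000 lowers -14 by 32 but raises -36 by 32, which is the uncertainty described above.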

mh- on 19 Jun 2020

Of course I will insist!

You presented 8 options. Of course that's not all the options an attacker would have. They actually have 256, since one can combine the 8 primary ones.

It's not just that I don't know the value. I also don't know:

if the attenuation value is a constant. If you see that documented somewhere, please tell me. If it is variable there are clear ways to figure out relative information about the values involved (which is higher than the other).
if Apple/Google discard nonsensical reads (negative attenuations); if you see it documented somewhere that Apple and Google would engage in detection of manipulation, please tell me.
what is the likely distribution of values within their range of -127 to 127. It seems likely they would be unevenly distributed, if it is tied to calibration of devices actually circulating in a certain geographic region. The values could be rounded off as well, for privacy reasons. If you see these values documented somewhere, please tell me.
how the values will be averaged over the duration of an epidemiologically relevant contact. The iOS doc talks about AttentuationValues, but last time I checked it was fairly confusing. I also have no guarantee this is the same for Apple and Google. If you see this documented, please tell me.

You also seem to forget an attacker doesn't exactly need the initial value, they just need the result of a calculation tied to distance and attenuation to end up in the right "box" for the GAEN risk calculation (this is dependent on how exactly the API is leveraged by the app).

In the attack scenarios described, the attacker also doesn't need specific people to be "caught", just a larger number than when the system works nominally, or even just a different set. The tampered value can be thought of as a radius R such that all individuals at a distance between R and R+2m are "painted". When the system works nominally, we have R=0. An attacker would just observe that an annulus of radii R and R+2m is very likely to have more people in it than the disc of radius 2m. Of course too large an R would not do, due to the actual physical attenuation of the signal. But still, there are more options than you might think at first.
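As a back-of-the-envelope check of the geometric argument: an annulus with radii R and R+2 has area π((R+2)² − R²) = 4π(R+1), against 4π for the nominal disc of radius 2, so the ratio is R+1. The Python sketch below just evaluates this. It assumes people are spread uniformly in the plane, which is an assumption made for illustration only, and the function names are hypothetical.

```python
import math


def disc_area(radius_m: float) -> float:
    """Area of a disc of the given radius, in square metres."""
    return math.pi * radius_m ** 2


def annulus_area(inner_radius_m: float, width_m: float = 2.0) -> float:
    """Area between inner_radius_m and inner_radius_m + width_m."""
    outer = inner_radius_m + width_m
    return math.pi * (outer ** 2 - inner_radius_m ** 2)


# Nominal case: everyone within 2 m is "painted" (R = 0).
nominal = disc_area(2.0)

# Ratio of painted area to the nominal area for a few example offsets R.
ratios = {r: annulus_area(r) / nominal for r in (0.0, 1.0, 3.0, 5.0)}
print(ratios)
```

Under the uniform-density assumption, any R > 0 paints a larger area than the nominal disc, until physical attenuation of the signal caps how large R can usefully get.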

So, all in all:

I am confident there are many ways to drastically improve the odds one might expect the attacker to have at first - it will never be deterministic but certainly better than a 1/256 chance;
I am aware describing a precise strategy requires information I don't have at the moment;
I am aware the CoronaWarn project committed to full open source and documentation of how the system works, so I am looking forward to eventually digging into the code and getting answers to my questions, so I can in turn answer yours more precisely.

pdehaye on 19 Jun 2020

So what could an attacker achieve in the worst case at the application level and how much would they have to spend?

sventuerpe on 19 Jun 2020

You are probably also aware that iOS and the Google Play Services are closed source and the CoronaWarn project will not be able to answer your questions. Why don't you direct them to Apple / Google directly? (And please let us know about the response, I'm very interested in this.)

Also,

If it is variable there are clear ways to figure out relative information about the values involved (which is higher than the other).

please enlighten me.
c1 = x1 XOR y, c2 = x2 XOR y, y is unknown ciphertext of an AES encryption - how do you find out from c1 and c2 whether x1 < x2?
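The question can be explored with a short Python sketch. From c1 and c2 alone one can compute c1 XOR c2 = x1 XOR x2, but for every candidate keystream byte y there is a consistent plaintext pair, and the sketch counts how those pairs are ordered. The ciphertext bytes are arbitrary illustrative values, and `count_orderings` is a hypothetical helper. It looks only at the two bytes themselves, not at any side channel.

```python
def to_int8(byte: int) -> int:
    """Interpret a byte (0..255) as a signed 8-bit integer."""
    return byte - 256 if byte >= 128 else byte


def count_orderings(c1: int, c2: int) -> tuple:
    """Try every keystream byte y and count how often x1 < x2 and x1 > x2."""
    less = greater = 0
    for y in range(256):
        x1, x2 = to_int8(c1 ^ y), to_int8(c2 ^ y)
        if x1 < x2:
            less += 1
        elif x1 > x2:
            greater += 1
    return less, greater


# Observed ciphertext bytes (arbitrary values chosen for illustration).
c1, c2 = 0x3A, 0x91

# The only thing the pair reveals directly: the XOR of the two plaintexts.
difference = c1 ^ c2

less, greater = count_orderings(c1, c2)
print(hex(difference), less, greater)
```

Half of the candidate keystream bytes give x1 < x2 and the other half give x1 > x2, so the two ciphertext bytes on their own say nothing about which plaintext is larger.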

The attenuation value is of course never constant, I guess your question refers to the TX Power value. This might vary with pose detection, i.e. the device could transmit with more power while it estimates that it is in a pocket, etc. I'm just speculating here, though.
Values > 0dBm are very unlikely IMO.
My OUKITEL device uses an apparently constant 0xf2 == -14dBm at the moment. Of course I could manipulate this, using the table above. I could make it -36dBm, for example. But what if other devices use -36dBm, then I change their received TX values to -14dBm.

Unless someone presents a list of TX values of devices with a high market penetration, and determines that the values are fixed and will stay like that for the next months, this attack vector stays purely academic.
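The -14 dBm / -36 dBm example can be verified with a few lines of Python. The mask is derived from the two values named above; the extra values -20 and -8 are hypothetical devices added only to show what the same mask does elsewhere, and the helper names are made up.

```python
def to_byte(value: int) -> int:
    """Encode a signed 8-bit integer as a byte (0..255)."""
    return value & 0xFF


def to_int8(byte: int) -> int:
    """Interpret a byte (0..255) as a signed 8-bit integer."""
    return byte - 256 if byte >= 128 else byte


def apply_mask(tx_power: int, mask: int) -> int:
    """TX power the receiver would decode after the attacker XORs in mask."""
    return to_int8(to_byte(tx_power) ^ mask)


# Mask chosen so that a device sending -14 dBm (0xF2) is decoded as -36 dBm.
mask = to_byte(-14) ^ to_byte(-36)

# Apply the same blind mask to several (partly hypothetical) devices.
results = {tx: apply_mask(tx, mask) for tx in (-14, -36, -20, -8)}
print(hex(mask), results)
```

Because XOR is its own inverse, the mask that turns -14 into -36 also turns -36 back into -14, so a blind attacker helps or hurts depending on which device they happen to replay.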

mh- on 19 Jun 2020

I have directed some questions to Google and Apple already, with no response. I am glad to hear you are interested in this, but can't guarantee I would get back to you on this one. Therefore I would advise you to ask Google and Apple directly as well if you are interested. Personally I care about the response coming from CoronaWarn because I believe those who deploy such apps should be clear also on what parts are obscure to them, and certainly not pretend everything is open source when there are clear software issues that are hidden away.
for figuring out relative values, you actually have the triple Bluetooth channels to take into account as well as temporal evolution and background Bluetooth traffic (_i.e._ of third parties), which the target phone is reacting to. Lots of side-channels!
yes, I meant the TX power, not the attenuation value. It _might_ vary, and indeed it would make sense that it does from a utility perspective. It would be a drain on battery too. Regardless, my question was whether it does vary.
? the attacker doesn't have to change all the values the same way for all devices (but even if the attacker does, a quick calculation shows that the surface of an annulus and an empty disc is likely to be higher than the sum of areas of two disks of radius 2m)
am I understanding correctly that the table you are referring to is already published here? This is actually the result of the work of GSMA and collaboration with other European teams.

pdehaye on 19 Jun 2020

@sventuerpe Leveraging the AEM tampering, Joel Reardon (UCalgary) and I came up with an SDK-based attack. See #308. I don't know if this answers your question. The cost to an attacker in the right position would be very small. At the application level, the consequences might be:

faster collection of identifiers than anticipated, more notifications etc;
UI to contextualize this risk to users, before enabling the system and when a notification arrives

Certainly for the project building this application, it means: better documentation.

pdehaye on 19 Jun 2020

am I understanding correctly that the table you are referring to is already published here? This is actually the result of the work of GSMA and collaboration with other European teams.

I don't know if Google / Apple are using this table to determine the encrypted TX Value.
tx_RSS_correction_factor seems like a candidate that is related, but since I do not find my device in the table, and it sends "0xf2" == -14dBm, but I also don't find -14 in the table e.g. as the default value, I would say:
No, that is not a table an attacker could use to devise useful attacks on flipping TX Power bits.

mh- on 19 Jun 2020

Then I think the data referred to in this tweet by a member of the Swiss team will be it.

pdehaye on 19 Jun 2020

@pdehaye I am afraid I do not follow. #308 points back here without adding much detail. What would be the most realistic high-level yet specific description of an attack and its consequences?

sventuerpe on 19 Jun 2020

@sventuerpe Apologies. In #308 there is a link to a paper explaining how the AEM tampering can be leveraged to conduct population-scale re-identification and/or false positive attacks. We called it the SDK attack, and I started a new issue given that this one had been flagged as out of scope and closed.

pdehaye on 19 Jun 2020

@pdehaye

@sventuerpe how the AEM tampering can be leveraged to conduct population-scale re-identification and/or false positive attacks.

Sorry, I must have missed that. How do you envision using AEM tampering to conduct population-scale re-identification attacks?

mh- on 19 Jun 2020

@mh- You are correct, I was imprecise. I should have said to @sventuerpe :

The GAEN framework suffers from well-known and documented re-identification and false positive attacks. However these attacks are wrongly thought to either require physical presence of the attacker next to the victim or to be expensive because they require hardware. In #308 we show those assumptions are incorrect. We show that by leveraging either SDKs or particularly well-permissioned apps, an attacker could conduct population-level re-identification and/or false positive attacks. The case of a false positive attack would require to implement a variation on classic replay attacks, called _AEM tampering_. We additionally show some evidence indicating such an SDK-based attack would be very easy for some actors. Neither the AEM tampering nor the scaling-through-SDKs is documented in official German documentation.

pdehaye on 19 Jun 2020

👍1

@pdehaye I think

The case of a false positive attack would _require_ to implement a variation on classic replay attacks, called AEM tampering.

is also imprecise, it is not _required_.
You can implement a relay attack if you can build your bot net, using the maximum available TX power of each device while advertising the relayed beacons, without tampering with AEM; you just hope to further increase the effectiveness by trying to modify the AEM.

mh- on 19 Jun 2020

You are right, we didn't formally show it was required. I highly suspect it would make it easier and is in fact required, but can't demonstrate that without:

more transparency in the Bluetooth layer;
more data on calibration (beyond averages of the kind published here)

Also, an SDK might not have/offer the option of modifying the maximum TX power.

pdehaye on 19 Jun 2020

For the record, because this might be relevant to #322: while the tampering with AEM values has to be done blindly, one can take all the binary triples (say) abc and XOR the actual AEM with abc00000, thereby generating a predictable set of values: the initial value x plus all its shifts by multiples of 32, taken modulo 256 so that the result remains a signed 8-bit value.

Example:

If the real metadata value is 10, which I don't know because it is encrypted, I can nevertheless generate the encrypted values of 10 and 42, 74, 106, -22, -54, -86, -118.
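The example values can be reproduced with a small Python sketch that flips only the three high-order bits of the byte and leaves the low five bits untouched. The starting value 10 is taken from the example; `to_int8` and `reachable_values` are hypothetical helper names.

```python
def to_int8(byte: int) -> int:
    """Interpret a byte (0..255) as a signed 8-bit integer."""
    return byte - 256 if byte >= 128 else byte


def reachable_values(tx_power: int) -> list:
    """Values obtained by XOR-ing every 3-bit pattern into the top three bits."""
    masks = [abc << 5 for abc in range(8)]  # 0x00, 0x20, ..., 0xE0
    return [to_int8((tx_power & 0xFF) ^ mask) for mask in masks]


# The attacker does not know this value, only that the set below is reachable.
values = reachable_values(10)
print(values)
```

Every reachable value differs from the original by a multiple of 32, so the attacker knows the shape of the set without knowing which member is the original.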

pdehaye on 21 Jun 2020