The "Epidemiological Motivation of the Transmission Risk Level" document describes the transmission risks for infected persons relative to the onset of symptoms. It served as a baseline to define the TRLs. However, the app does not attempt to infer the onset date, and instead, the risk exposure computations use the day of upload as reference day. Any difference between onset date and day of upload thus leads to wrong TRLs being picked for the uploaded keys.
Example: A user shows symptoms on Friday, undergoes a test on Monday, and receives and uploads their positive result on Tuesday. This results in an offset of 4 days. Contacts the user had 6 days before submission (i.e., on Wednesday, two days before symptoms) would receive a low TRL of 3, even though the user was highly contagious at that time.
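The arithmetic of this example can be checked with a few lines of Python. This is only an illustrative sketch: the calendar dates are hypothetical and were picked to match the weekdays named above, and the variable names are not taken from the app.

```python
from datetime import date, timedelta

# Hypothetical calendar dates chosen to match the weekdays in the example.
onset = date(2020, 6, 12)    # Friday: first symptoms
upload = date(2020, 6, 16)   # Tuesday: positive result is uploaded

# A contact that happened 6 days before the upload.
contact = upload - timedelta(days=6)

# Gap between the day the app uses as reference (upload) and the onset day.
offset = (upload - onset).days

# Position of the contact relative to each possible reference day.
days_relative_to_upload = (contact - upload).days
days_relative_to_onset = (contact - onset).days

print(offset)                    # days between onset and upload
print(days_relative_to_upload)   # how the app sees the contact
print(days_relative_to_onset)    # where the contact sits relative to onset
```

The same contact is "6 days ago" when measured from the upload day, but only 2 days before onset when measured from the symptoms, which is the mismatch described here.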
To solve this issue, the DP-3T consortium recommends inferring the onset date:
If the test result is positive, it is important to establish when the patient's contagious period began.
[...] The health official notifies the patient of the result (Step 4, Figure CP). [...]
We assume that this message includes the test result (positive or negative) and if positive, other supplementary information such as a request to contact the health official to discuss their probable onset date, or advice on how the patient could determine this themselves.
This issue came up during a security analysis of improving privacy against traffic analysis. Kudos to Timo Renner (SAP), Maik Mueller (SAP) and Cas Cremers (CISPA).
Prof. Dr. Christian Rossow | Faculty
CISPA Helmholtz Center for Information Security
Stuhlsatzenhaus 5, Saarland Informatics Campus
66123 Saarbrücken, Germany
Mail: lastname [at] cispa [dot] saarland | Web: https://cispa.saarland/group/rossow/_
Internal Tracking ID: EXPOSUREAPP-2192
This issue seems to affect both Android and iOS, so it probably should be moved to the documentation repo :)
Somewhat related: in addition to the problem mentioned here (but somewhat less severe) the current implementation is also applying TRL profiles based on list order, not actual TEK validity: https://github.com/corona-warn-app/cwa-documentation/issues/343
Hello @crossow and @daimpi,
I have informed our development team to take over this issue.
Thanks,
LMM
Corona-Warn-App Open Source Team
I have difficulties understanding the issue. The mentioned "Epidemiological Motivation of the Transmission Risk Level" document sketches a sequence of events (exposure, symptom onset (not always observed), test, information about test result, upload). The TRLs are computed based on upload date, as clearly shown in Fig. 14 of the document.
However, the exact transmission score based on the upload day is motivated by epidemiological understanding (State of the information: early June 2020) about the time delay from exposure to upload. As part of this analysis one averages over four different scenarios about possible knowledge about symptom onset in order to match the current situation of the CWA app that NO information about symptom onset is available - only upload date.
The particular scenario involving symptom onset date could be interesting in its own right, because this could be a situation which might happen when local health authorities do contact tracing.
Would it be possible to clarify the issue somewhat?
Thanks for your comments.
@hoehleatsu The issue I see is that (i) the app does not ask the users to enter their onset date (if any), and (ii) the app allows uploads to happen quite late (even after days). The resulting need to estimate these operational delays, as sketched in Section 3.2 of the document, risks that the assumed distributions do not necessarily reflect the situation in practice. The document also acknowledges this and suggests that the distributions "have to be adjusted based on real data once the system is running". I presume this is still to be done.
So my main question: Why doesn't the CWA app follow the suggestion of DP-3T and aim to infer the onset date? This would allow for a far more accurate exposure risk computation.
Thanks for the clarification, that was helpful.
> So my main question: Why doesn't the CWA app follow the suggestion of DP-3T and aim to infer the onset date? This would allow for a far more accurate exposure risk computation.
The structure with the four cases shows that this has already been thought of as part of the development process, and I would guess that the use of onset date is something on the wishlist, so it's likely resources and priority which decide. The product owners can probably say more about this.
From a scientific viewpoint: One thing which might be helpful is to perform simulations in the spirit of Ferretti et al. (2020), that show how much in terms of
1) what does "onset" for COVID-19 really mean? Here it would be important to have a clear and user-communicable definition which captures that onset is subject to variability
2) how robust would the scoring be to mis-specifications of the DSO, because it appears to have some variability in cases and could be interpreted by the user in the wrong way. Other apps simply use a TRL constant over a 14-day window in order to "avoid" any assumptions and because sensitivity is prioritised over specificity.
3) are there any data protection issues when asking the user for onset date? Do any of the DSO statements need to be extended accordingly? (it appears that you are an expert on this area)
> So my main question: Why doesn't the CWA app follow the suggestion of DP-3T and aim to infer the onset date?
@crossow Why are you using the term "_infer_"? As far as I understand, the "Epidemiological Motivation of the Transmission Risk Level" was already trying to _infer_ the onset date (from the upload date, using assumed statistical distributions).
An improvement would be to _ask_ for the onset date, right?
Or from which other information could the app _infer_ the onset date?
Anyway, this part definitely should be clarified, at the latest when migrating to API v1.5. The new ExposureWindow mode seems, at least initially, not to make use of days_since_onset_of_symptoms.
@mh- Yes, _asking_ is what I meant. However, as the answer might not always be trivial (e.g., as patients don't know which symptoms count, patients lack symptoms, etc.), I chose another word. Maybe _determine_ would have been more accurate.
In my eyes, ideally, to get rid of modeling inaccuracies, the app should ask (right before uploading the TEKs) _if_ users have symptoms, and _since when_. This question could be made optional so that it does not deter users from uploading.
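A minimal sketch of such an optional question, assuming the app would fall back to the upload day whenever no usable onset date is given. The function names `reference_day` and `days_since_reference` are hypothetical and do not come from the CWA code base; the fallback for an onset date lying after the upload day is likewise an assumption.

```python
from datetime import date
from typing import Optional


def reference_day(upload_day: date, onset_day: Optional[date] = None) -> date:
    """Pick the day that key ages are measured against.

    Uses the user-supplied onset day when it is present and plausible
    (not after the upload day); otherwise falls back to the upload day.
    """
    if onset_day is not None and onset_day <= upload_day:
        return onset_day
    return upload_day


def days_since_reference(key_day: date, upload_day: date,
                         onset_day: Optional[date] = None) -> int:
    """Signed day offset of a key relative to the chosen reference day."""
    return (key_day - reference_day(upload_day, onset_day)).days


# Hypothetical dates: key from a Wednesday, symptoms on Friday, upload on Tuesday.
key_day = date(2020, 6, 10)
upload_day = date(2020, 6, 16)

print(days_since_reference(key_day, upload_day))                     # no answer given
print(days_since_reference(key_day, upload_day, date(2020, 6, 12)))  # onset supplied
```

Because the onset argument is optional, a user who skips the question gets the upload-day behaviour, so the question adds accuracy without blocking the upload.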
@crossow Did the recent implementation of "Symptom Recording", which gives people who tested positive the option to enter a date for symptom onset, address your concerns?
(blog entry)
If so, feel free to close this issue 🙂.
Thanks for the heads-up. Yes, this solves the problem, in particular the computation of the risk level based on onset date.
Good job everyone! Closing.