The purpose is similar to what's described in #213 but by slightly different means.
The goal is to detect more of the anonymous cluster situations in which infections likely occurred (as opposed to merely meeting many people, as in #213), even if the person who was the source of the infection did not use CWA.
The goal is to discover source clusters using exactly the same mechanism as for potentially risky encounters. Currently, when someone tests positive, per-day risk values are published based on the day that symptoms started.
Now, in addition to that, another set of TEKs should be published that specify the likelihood that the positive-tested person was infected themselves on a given day (based on current scientific consensus about incubation time for covid-19).
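As a rough illustration of what such per-day likelihoods could look like, here is a minimal sketch that assumes a log-normal incubation period. The parameters `MU` and `SIGMA`, the 14-day cutoff, and the function names are illustrative assumptions, not values or APIs taken from CWA or ENF.

```python
import math
from typing import Dict

# Illustrative log-normal incubation parameters (an assumption, not from CWA).
MU, SIGMA = 1.621, 0.418


def lognormal_cdf(x: float, mu: float = MU, sigma: float = SIGMA) -> float:
    """CDF of a log-normal distribution; 0 for non-positive x."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


def infection_day_likelihood(max_days_back: int = 14) -> Dict[int, float]:
    """Map 'days before symptom onset' to the probability of infection on that day.

    Day d covers incubation times in [d - 0.5, d + 0.5); the values are
    normalised so that they sum to 1 over 0..max_days_back.
    """
    raw = {
        d: lognormal_cdf(d + 0.5) - lognormal_cdf(d - 0.5)
        for d in range(max_days_back + 1)
    }
    total = sum(raw.values())
    return {d: p / total for d, p in raw.items()}
```

The resulting values could then be binned into whatever per-key metadata the upload format allows; how exactly to do that is left open here.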
Now any person taking part in CWA can figure out whether they met someone at a place and time where that person might have been infected. For a single infected person this information is completely useless: it's just an encounter with a person who later developed symptoms and tested positive. However, if you find a time slot in which you met multiple people who later tested positive and who were likely infected during that time slot, you might have detected a cluster. Based on the timestamp you might be able to remember what kind of situation you were in at that time.
The question is what to do with that information?
Potential technical issues:
Internal Tracking ID: EXPOSUREAPP-4678
This is very promising! I have been banging my head since May on how to do this. I think you found the perfect "time-reversal" description of the problem that makes GAEN so much more useful for backwards contact tracing! In hindsight it was so simple...
You can do both forms of risk scoring simultaneously (two API calls), and they each could lead to different outcomes. In forward contact tracing you want a narrow net, and 15 minutes. But for the backward contact tracing and the cluster busting you want the risk scoring to be done with a broader Bluetooth net (attenuation doesn't really matter) but with longer duration, and the result could be "we don't think you are necessarily infected, but please call hotline B so we can ask you where you were" (hotline A is for those at risk of having been infected, triggered via forward contact tracing). The public authority can set the risk threshold according to their ability to investigate backwards, and the returns they see on this mode of investigation (which is subtle).
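To make the "two evaluations over the same encounters" idea concrete, here is a minimal sketch. The class names, the attenuation and duration thresholds, and the advice strings are all hypothetical placeholders; this is not the ENF API, and a health authority would pick its own values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Encounter:
    attenuation_db: float  # higher value = weaker signal, probably further away
    duration_min: float


@dataclass(frozen=True)
class ScoringConfig:
    max_attenuation_db: float
    min_duration_min: float
    advice: str

    def matches(self, e: Encounter) -> bool:
        return (
            e.attenuation_db <= self.max_attenuation_db
            and e.duration_min >= self.min_duration_min
        )


# Hypothetical thresholds: narrow net for forward tracing,
# broad net but longer duration for backward tracing.
FORWARD = ScoringConfig(max_attenuation_db=55, min_duration_min=15, advice="hotline A")
BACKWARD = ScoringConfig(max_attenuation_db=80, min_duration_min=45, advice="hotline B")


def evaluate(e: Encounter) -> List[str]:
    """Run both evaluations on one encounter and collect the resulting advice."""
    return [c.advice for c in (FORWARD, BACKWARD) if c.matches(e)]
```

Because the two configurations are independent, the same encounter can trigger one, both, or neither outcome.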
Some very early, great work on modeling the utility of those apps comes in handy here. Unlike predictive models à la Imperial, exact models such as here and particularly here have the direct advantage that they can be repurposed for this type of new configuration. Will think about it more!
I am hoping @heinezen will also add it for consideration by the CWA team.
@pdehaye thanks for your assessment, I agree that this sounds promising. This could also be combined with #213 (which would require some ENF modification by Google/Apple) to allow for automatic detection of some of the potential high density/cluster events.
Also: I don't think a second set of TEKs would be needed. The info on potential infection dates could simply be encoded in a dedicated TRL state (in legacy v1 mode) or in a specific reportType (in ENF v1.5+ mode) for the TEKs which are representing the probable infection day.
Together with #213 you could also e.g. have a system where devices which observed high density events and get a matching TEK with the special reportType for that day could automatically be asked to self isolate even though the TEK itself wouldn't indicate high levels of infectiousness, just that someone who they met at a high density/cluster event that day probably got infected there, which makes it much more likely that they got infected there as well.
Thinking more about this you could even trigger some automatic warning without #213: let's say your device recorded rolling proximity identifiers (RPIs) which were generated from e.g. two or three different temporary exposure keys (TEKs) marked with the special reportType (which indicates a potential infection day) around the same timeframe you could use this as an indicator for a cluster infection and trigger a warning on this device together with the request to call "hotline B" to inform the contact tracing agency.
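A minimal sketch of that check, assuming the app can see, per matched encounter, which TEK it came from, when it happened, and whether the TEK carries the special "probable infection day" marker. The `MatchedEncounter` type, the 120-minute window, and the threshold of three distinct TEKs are assumptions made for illustration; ENF does not necessarily expose matches in this form.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class MatchedEncounter:
    tek_id: str                   # opaque identifier of the matching TEK
    minute: int                   # encounter time, in minutes since some epoch
    probable_infection_day: bool  # TEK marked with the special reportType


def find_cluster(
    encounters: Iterable[MatchedEncounter],
    window_min: int = 120,
    min_distinct_teks: int = 3,
) -> Optional[Set[str]]:
    """Return the TEK ids of the first time window that contains enough
    distinct flagged TEKs, or None if no such window exists."""
    flagged = sorted(
        (e for e in encounters if e.probable_infection_day),
        key=lambda e: e.minute,
    )
    for i, first in enumerate(flagged):
        teks = {
            e.tek_id for e in flagged[i:] if e.minute - first.minute <= window_min
        }
        if len(teks) >= min_distinct_teks:
            return teks
    return None
```

A non-`None` result would be the trigger for the warning and the request to call "hotline B".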
I didn't mean that a second set of TEKs would be needed, but that a second evaluation function would be needed.
@pdehaye yes agreed. I was referring to OP writing
Now, in addition to that, another set of TEKs should be published that specify the likelihood that the positive-tested person was infected themselves on a given day (based on current scientific consensus about incubation time for covid-19).
when I was saying
I don't think a second set of TEKs would be needed.
I wasn't sure if I got my point across or if the idea had any merit, so thanks for the feedback. :)
Now, in addition to that, another set of TEKs should be published that specify the likelihood that the positive-tested person was infected themselves on a given day (based on current scientific consensus about incubation time for covid-19).
I don't think a second set of TEKs would be needed.
I guess I wasn't clear enough about that. I didn't mean to say that there should be another set of TEKs announced over BLE, but that the TEKs spanning the cluster detection timespan would be different (but maybe slightly overlapping) from the ones spanning the existing contact risk ones, and would have to be selected differently for submitting (= "publishing") to the server. Or maybe I got something wrong about the protocol and an infected person will always publish all their own TEKs to the server anyway (with very low risk profile for the early days of the 14-day interval)?
@jrudolph thanks for your suggestion, getting CWA (and ENF) ready for some backward contact tracing / cluster detection sounds like a promising way forward with this virus whose spread is characterized by overdispersion.
I didn't mean to say that there should be another set of TEKs announced over BLE
Just a small remark on that: TEKs are not broadcast by devices. Instead, rolling proximity identifiers (RPIs), which are derived from TEKs, are broadcast in the form of BLE beacons (cf. here).
Or maybe I got something wrong about the protocol and an infected person will always publish all their own TEKs to the server anyway (with very low risk profile for the early days of the 14-day interval)?
That is indeed correct: an infected person will always upload all of the TEKs which are available on their device i.e. up to 14 (one for each of the last 14 days) and each TEK gets a transmission risk level (TRL) assigned with the oldest TEKs having the lowest TRL: https://github.com/corona-warn-app/cwa-documentation/blob/master/transmission_risk.pdf.
[EDIT: Added some edits for more precision]
That is indeed correct: an infected person will always upload all of the TEKs which are available on their device i.e. up to 14 (one for each of the last 14 days) and each TEK gets a transmission risk level (TRL) assigned with the oldest TEKs having the lowest TRL: https://github.com/corona-warn-app/cwa-documentation/blob/master/transmission_risk.pdf.
This is true in Germany, but not in Switzerland for instance. We have a specific law [actually: a law and an ordinance] that says only keys from S-2 or T+0 have to be uploaded, where S is the date of symptoms for symptomatics and T the date of tests for asymptomatics. We also don't use the [advanced] risk scoring[, i.e. only use the threshold binning].
We can see that Switzerland is trying to do [digital] contact tracing only [forward], while Germany is doing some hybrid as it also could occur that the infector is notified via the infectee.
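The difference between the two upload policies described above can be sketched as a key-selection step before submission. The policy names and the function are hypothetical, and "S-2 or T+0" is read here as "from that day onward", which is an assumption about the wording.

```python
from datetime import date, timedelta
from typing import List, Optional


def select_key_days(
    key_days: List[date],
    policy: str,
    symptom_onset: Optional[date] = None,
    test_date: Optional[date] = None,
) -> List[date]:
    """Pick which daily keys to upload under a given (hypothetical) policy."""
    if policy == "upload_all":
        # Upload everything available on the device (up to 14 days).
        return sorted(key_days)
    if policy == "onset_cutoff":
        # Upload only keys from S-2 (symptomatic) or T+0 (asymptomatic) onward.
        if symptom_onset is not None:
            cutoff = symptom_onset - timedelta(days=2)
        elif test_date is not None:
            cutoff = test_date
        else:
            raise ValueError("need symptom_onset or test_date")
        return sorted(d for d in key_days if d >= cutoff)
    raise ValueError(f"unknown policy: {policy}")
```

Under the cutoff policy the keys from the probable infection days are never uploaded at all, so any backward evaluation would have nothing to match against.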
@pdehaye interesting, I didn't know that CH handles this differently.
We have a specific law that says only keys from S-2 or T+0 have to be uploaded, where S is the date of symptoms for symptomatics and T the date of tests for asymptomatics.
Wow that sounds awfully restrictive even for pure forward tracing. Have they taken the He et al. correction into account when deciding on those cutoffs?
We can see that Switzerland is trying to do forward contact tracing only, while Germany is doing some hybrid as it also could occur that the infector is notified via the infectee.
Tbh: I don't think the CWA team implemented this with the intention of "backward tracing" in mind. This was probably just the most straightforward way to implement diagnosis key (DK)/TEK upload and the (very crude) possibility of backward tracing is just an unintended byproduct 😉.
A good thing in this context is that CWA at least doesn't hide encounters which are below the minimum risk score (my understanding is that SwissCovid hides those); instead, they are shown as green/low-risk encounters. This is quite important here b/c in a lot of cases the TEKs/DKs for the days on which the infection happened will have a TRL of 0, which means that no matter how intense the contact was, the encounter will always be below the minimum risk score.
So while better than SwissCovid this behavior is o/c far from optimal: for one, green encounters don't trigger a push warning and even the day of the encounter is hidden by CWA in such cases which unfortunately makes the current implementation almost useless for proper backward tracing imho ^^.
This is similar to what we've been investigating in the Dutch coronamelder team. We're working on a paper that we are planning to share with Apple and Google. Feel free to comment on it at https://docs.google.com/document/d/1blYlQHKHc8o7x6F4dYmAlCZ1E21j6e3z5E0OxmLY0WM/edit
@ijansch thanks for sharing this document, this looks promising and I've left you some comments 🙂.
One thing which you're not currently mentioning there is whether something like RPI density (cf. this comment above) should also factor in when trying to determine the "source event risk". It's not entirely clear to me whether or not this would actually help with the detection of such source risks, probably some modeling would be required to figure this out.
Another thing which could have nice synergies with this approach is "venue registration" (e.g. #138). If CWA could communicate the timeframe of "venue presence" of the user to ENF, and ENF found a potential "exposed" event during the venue presence that could be taken as further evidence of a cluster event and should probably be communicated to the health authorities and the venue.
Btw: feel free to join us on our community Slack 🙂.