Cwa-wishlist: Source cluster detection helper

Created on 28 Oct 2020 · 12 Comments · Source: corona-warn-app/cwa-wishlist

The purpose is similar to what's described in #213 but by slightly different means.

The goal is to detect more of the anonymous cluster situations in which infections are likely to have actually occurred (as opposed to just meeting many people, as in #213), even if the person who was the source of the infection has not used CWA.

The goal is to discover source clusters using exactly the same mechanism that is used for potentially risky encounters. Currently, when someone tests positive, risk values are published for individual days, based on the day that symptoms started.

Now, in addition to that, another set of TEKs should be published that specify the likelihood that the positive-tested person was infected themselves on a given day (based on current scientific consensus about incubation time for covid-19).
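A minimal sketch of how such per-day infection likelihoods could be derived from the symptom onset date. It assumes a log-normal incubation period; the parameters `MU` and `SIGMA` and the helper names are illustrative assumptions, not values or functions used by CWA.

```python
import math
from datetime import date, timedelta

# Assumed, illustrative log-normal incubation parameters (median exp(MU) is about 5 days).
MU = 1.62
SIGMA = 0.42


def lognormal_cdf(x: float) -> float:
    """Cumulative distribution of the assumed incubation period, in days."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - MU) / (SIGMA * math.sqrt(2.0))))


def infection_day_weights(symptom_onset: date, max_days_back: int = 14) -> dict:
    """Map each candidate infection day to the probability mass of its day-wide bin."""
    weights = {}
    for days_back in range(1, max_days_back + 1):
        # Probability that incubation lasted between days_back - 0.5 and days_back + 0.5 days.
        p = lognormal_cdf(days_back + 0.5) - lognormal_cdf(days_back - 0.5)
        weights[symptom_onset - timedelta(days=days_back)] = p
    return weights


weights = infection_day_weights(date(2020, 10, 28))
most_likely_day = max(weights, key=weights.get)
```

Each weight could then be mapped to whatever per-key marker the server publishes; how that mapping would look is left open here.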

Now any person taking part in CWA can figure out whether they met someone at a place and time where that person might have been infected. For a single infected person this information is completely useless: it is just an encounter with a person who later developed symptoms and tested positive. However, if you find a time slot in which you met multiple people who later tested positive and who were likely infected during that time slot, you might have detected a cluster. Based on the timestamp, you might be able to remember what kind of situation you were in at that time.

The question is what to do with that information.

In the best case, the person who tested positive will themselves find encounters with other people who have since tested positive and who might have been infected in the same situation. The positively tested user will probably already be in contact with the authorities and can now give them some hints about the situation in which the infection might have taken place.
In other cases, another person might get that notification and should have some way of sharing information about such a situation with the authorities.

Potential technical issues:

- Depending on the delay before a positive test result arrives, the actual time of infection might lie too far in the past for the current rules about how many TEKs to keep (14 days).
- The iOS/Android APIs would have to support multiple data sources / risk evaluation policies for different kinds of encounters at the same time. (Alternatively, only users who have tested positive, and who are no longer interested in risky encounters, would take part in the feature, so that the data sources / risk evaluation policies could be swapped out for the cluster detection ones.)
- The APIs would have to report more accurate time slots (than days) for this to be useful.
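The first issue is plain date arithmetic. A rough sketch, assuming the 14-day window counts the upload day itself (so keys aged 0 to 13 days are still available); the function name and the example dates are made up for illustration.

```python
from datetime import date

RETENTION_DAYS = 14  # retention window mentioned above


def infection_day_still_stored(infection_day: date, upload_day: date,
                               retention_days: int = RETENTION_DAYS) -> bool:
    """True if the TEK of the infection day would still be on the device at upload time."""
    age_in_days = (upload_day - infection_day).days
    # Assumption: the window covers the upload day plus the 13 days before it.
    return 0 <= age_in_days < retention_days


# Infected on 10 Oct, symptoms about 5 days later, result and upload on 20 Oct: key still stored.
fast_result = infection_day_still_stored(date(2020, 10, 10), date(2020, 10, 20))
# Same infection, but the result only arrives on 26 Oct: the key has already been dropped.
slow_result = infection_day_still_stored(date(2020, 10, 10), date(2020, 10, 26))
```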

Internal Tracking ID: EXPOSUREAPP-4678

feature request mirrored-to-jira

jrudolph

👍6 👀4

Most helpful comment

This is similar to what we’ve been investigating in the Dutch coronamelder team. We’re working on a paper that we are planning to share with Apple and Google. Feel free to comment on it at https://docs.google.com/document/d/1blYlQHKHc8o7x6F4dYmAlCZ1E21j6e3z5E0OxmLY0WM/edit

ijansch on 8 Nov 2020

👍3

All 12 comments

This is very promising! I have been banging my head since May on how to do this. I think you found the perfect "time-reversal" description of the problem that makes GAEN so much more useful for backwards contact tracing! In hindsight it was so simple...

You can do both forms of risk scoring simultaneously (two API calls), and each could lead to a different outcome. In forward contact tracing you want a narrow net and a 15-minute threshold. For backward contact tracing and cluster busting, however, you want the risk scoring to use a broader Bluetooth net (attenuation doesn't really matter) but a longer duration, and the result could be "we don't think you are necessarily infected, but please call hotline B so we can ask you where you were" (hotline A is for those at risk of having been infected, triggered via forward contact tracing). The public authority can set the risk threshold according to its ability to investigate backwards and the returns it sees on this mode of investigation (which is subtle).
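To make the two-configuration idea concrete, here is a small sketch. The thresholds (60 dB / 15 min for forward, 90 dB / 60 min for backward) and all class and function names are assumptions for illustration only; they are not real GAEN parameters, and in practice each configuration would be evaluated against its own set of published keys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encounter:
    attenuation_db: float  # higher attenuation roughly means a weaker, more distant signal
    duration_min: float


@dataclass(frozen=True)
class ScoringConfig:
    name: str
    max_attenuation_db: float
    min_duration_min: float
    advice: str

    def matches(self, encounter: Encounter) -> bool:
        return (encounter.attenuation_db <= self.max_attenuation_db
                and encounter.duration_min >= self.min_duration_min)


# Hypothetical thresholds: narrow net and short duration vs. broad net and long duration.
FORWARD = ScoringConfig("forward", 60.0, 15.0, "call hotline A")
BACKWARD = ScoringConfig("backward", 90.0, 60.0, "call hotline B")


def evaluate(encounters, configs=(FORWARD, BACKWARD)) -> dict:
    """Run every configuration over the same encounters and collect the advice that applies."""
    return {c.name: c.advice for c in configs
            if any(c.matches(e) for e in encounters)}


far_but_long = evaluate([Encounter(attenuation_db=80.0, duration_min=120.0)])
close_but_short = evaluate([Encounter(attenuation_db=50.0, duration_min=20.0)])
```

A distant but long encounter only trips the backward configuration, while a close but short one only trips the forward configuration, which is the asymmetry described above.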

Some very early, great work on modeling the utility of those apps comes in very handy. Unlike predictive models à la Imperial, exact models such as here and particularly here have the direct advantage that they can be repurposed for this type of new configuration. Will think about it more!

pdehaye on 6 Nov 2020

👍1

I am hoping @heinezen will also add it for consideration by the CWA team.

pdehaye on 6 Nov 2020

👍1

@pdehaye thanks for your assessment, I agree that this sounds promising. This could also be combined with #213 (which would require some ENF modification by Google/Apple) to allow for automatic detection of some of the potential high density/cluster events.

Also: I don't think a second set of TEKs would be needed. The info on potential infection dates could simply be encoded in a dedicated TRL state (in legacy v1 mode) or in a specific reportType (in ENF v1.5+ mode) for the TEKs which are representing the probable infection day.

Together with #213 you could also, for example, have a system in which devices that observed a high-density event and receive a matching TEK with the special reportType for that day are automatically asked to self-isolate. The TEK itself wouldn't indicate high levels of infectiousness, just that someone they met at a high-density/cluster event that day probably got infected there, which makes it much more likely that they got infected there as well.

daimpi on 6 Nov 2020

Thinking more about this, you could even trigger an automatic warning without #213. Say your device recorded rolling proximity identifiers (RPIs) generated from, e.g., two or three different temporary exposure keys (TEKs) marked with the special reportType (which indicates a potential infection day) around the same timeframe. You could use this as an indicator of a cluster infection and trigger a warning on the device, together with a request to call "hotline B" to inform the contact tracing agency.
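A rough sketch of that check, under two assumptions: the device can see a timestamp per matched RPI (which the APIs do not expose at this granularity today), and the special report type is represented by the hypothetical marker `PROBABLE_INFECTION_DAY`. The `Match` type, the two-hour window, and the threshold of two keys are likewise made up for illustration.

```python
from datetime import datetime, timedelta
from typing import NamedTuple

# Hypothetical marker for "probable infection day"; not an existing ENF report type.
PROBABLE_INFECTION_DAY = "probable_infection_day"


class Match(NamedTuple):
    tek_id: str        # which published key the observed RPI was derived from
    report_type: str
    seen_at: datetime


def find_cluster_windows(matches, window=timedelta(hours=2), min_distinct_teks=2):
    """Return (window start, set of TEK ids) for windows with enough distinct flagged keys."""
    flagged = sorted((m for m in matches if m.report_type == PROBABLE_INFECTION_DAY),
                     key=lambda m: m.seen_at)
    clusters = []
    for i, first in enumerate(flagged):
        # Collect the distinct keys seen within `window` after this match.
        teks = {m.tek_id for m in flagged[i:] if m.seen_at - first.seen_at <= window}
        if len(teks) >= min_distinct_teks:
            clusters.append((first.seen_at, teks))
    return clusters


matches = [
    Match("A", PROBABLE_INFECTION_DAY, datetime(2020, 10, 20, 18, 0)),
    Match("B", PROBABLE_INFECTION_DAY, datetime(2020, 10, 20, 18, 30)),
    Match("C", "confirmed_test", datetime(2020, 10, 20, 18, 45)),
    Match("A", PROBABLE_INFECTION_DAY, datetime(2020, 10, 21, 9, 0)),
]
clusters = find_cluster_windows(matches)
```

Repeated sightings of the same key do not count twice, since a single person seen for a long time is not a cluster.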

daimpi on 6 Nov 2020

I didn't mean that a second set of TEKs would be needed, but that a second evaluation function would be needed.

pdehaye on 6 Nov 2020

👍1

@pdehaye yes agreed. I was referring to OP writing

> Now, in addition to that, another set of TEKs should be published that specify the likelihood that the positive-tested person was infected themselves on a given day (based on current scientific consensus about incubation time for covid-19).

when I was saying

> I don't think a second set of TEKs would be needed.

daimpi on 6 Nov 2020

👍1

I wasn't sure if I got my point across or if the idea had any merit, so thanks for the feedback. :)

> Now, in addition to that, another set of TEKs should be published that specify the likelihood that the positive-tested person was infected themselves on a given day (based on current scientific consensus about incubation time for covid-19).

> I don't think a second set of TEKs would be needed.

I guess I wasn't clear enough about that. I didn't mean to say that there should be another set of TEKs announced over BLE, but that the TEKs spanning the cluster detection timespan would be different (but maybe slightly overlapping) from the ones spanning the existing contact risk ones, and would have to be selected differently for submitting (= "publishing") to the server. Or maybe I got something wrong about the protocol and an infected person will always publish all their own TEKs to the server anyway (with very low risk profile for the early days of the 14-day interval)?

jrudolph on 6 Nov 2020

@jrudolph thanks for your suggestion, getting CWA (and ENF) ready for some backward contact tracing / cluster detection sounds like a promising way forward with this virus whose spread is characterized by overdispersion.

> I didn't mean to say that there should be another set of TEKs announced over BLE

Just a small remark on that: TEKs are not broadcast by devices. Rather, rolling proximity identifiers (RPIs), which are derived from TEKs, are broadcast in the form of BLE beacons (cf. here).

> Or maybe I got something wrong about the protocol and an infected person will always publish all their own TEKs to the server anyway (with very low risk profile for the early days of the 14-day interval)?

That is indeed correct: an infected person will always upload all of the TEKs which are available on their device i.e. up to 14 (one for each of the last 14 days) and each TEK gets a transmission risk level (TRL) assigned with the oldest TEKs having the lowest TRL: https://github.com/corona-warn-app/cwa-documentation/blob/master/transmission_risk.pdf.

daimpi on 6 Nov 2020

[EDIT: Added some edits for more precision]

> That is indeed correct: an infected person will always upload all of the TEKs which are available on their device i.e. up to 14 (one for each of the last 14 days) and each TEK gets a transmission risk level (TRL) assigned with the oldest TEKs having the lowest TRL: https://github.com/corona-warn-app/cwa-documentation/blob/master/transmission_risk.pdf.

This is true in Germany, but not in Switzerland for instance. We have a specific law [actually: a law and an ordinance] that says only keys from S-2 or T+0 have to be uploaded, where S is the date of symptoms for symptomatics and T the date of tests for asymptomatics. We also don't use the [advanced] risk scoring[, i.e. only use the threshold binning].
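As an illustration of the difference, the cutoff rule described here could be sketched like this. The function name and its signature are hypothetical and this is not SwissCovid code; it only encodes "from S-2 for symptomatic cases, from T+0 otherwise".

```python
from datetime import date, timedelta


def keys_to_upload(tek_days, symptom_onset=None, test_date=None):
    """Keep only the key days on or after the cutoff: S-2 if symptomatic, otherwise T+0."""
    if symptom_onset is not None:
        cutoff = symptom_onset - timedelta(days=2)
    elif test_date is not None:
        cutoff = test_date
    else:
        raise ValueError("either symptom_onset or test_date is required")
    return [day for day in tek_days if day >= cutoff]


# 14 stored key days, newest first, ending on 28 Oct 2020.
stored_days = [date(2020, 10, 28) - timedelta(days=i) for i in range(14)]

symptomatic = keys_to_upload(stored_days, symptom_onset=date(2020, 10, 25))
asymptomatic = keys_to_upload(stored_days, test_date=date(2020, 10, 27))
```

With symptom onset on 25 Oct only the keys from 23 Oct onwards remain, so the days on which the uploader was probably infected are never published.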

We can see that Switzerland is trying to do [digital] contact tracing only [forward], while Germany is doing some hybrid as it also could occur that the infector is notified via the infectee.

pdehaye on 7 Nov 2020

👍1

@pdehaye interesting, I didn't know that CH handles this differently.

> We have a specific law that says only keys from S-2 or T+0 have to be uploaded, where S is the date of symptoms for symptomatics and T the date of tests for asymptomatics.

Wow that sounds awfully restrictive even for pure forward tracing. Have they taken the He et al. correction into account when deciding on those cutoffs?

> We can see that Switzerland is trying to do forward contact tracing only, while Germany is doing some hybrid as it also could occur that the infector is notified via the infectee.

Tbh: I don't think the CWA team implemented this with the intention of "backward tracing" in mind. This was probably just the most straightforward way to implement diagnosis key (DK)/TEK upload and the (very crude) possibility of backward tracing is just an unintended byproduct 😉.
A good thing in this context is that CWA at least doesn't hide encounters which are below the minimum risk score (my understanding is that SwissCovid hides those); instead they are shown as green/low-risk encounters. This is quite important here b/c in a lot of cases the TEKs/DKs for the days on which the infection happened will have a TRL of 0, which means that no matter how intense the contact was, the encounter will always be below the minimum risk score.
So while better than SwissCovid, this behavior is o/c far from optimal: for one, green encounters don't trigger a push warning, and even the day of the encounter is hidden by CWA in such cases, which unfortunately makes the current implementation almost useless for proper backward tracing imho ^^.
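Why a TRL of 0 dominates everything else can be seen from a tiny worked example. It assumes the product-of-factors form of the ENF v1 total risk score; `MINIMUM_RISK_SCORE` is a placeholder threshold chosen for illustration.

```python
MINIMUM_RISK_SCORE = 11  # placeholder threshold, for illustration only


def total_risk_score(attenuation: int, days_since_exposure: int,
                     duration: int, transmission_risk: int) -> int:
    """Multiplicative score: any factor of 0 forces the whole product to 0."""
    return attenuation * days_since_exposure * duration * transmission_risk


# Most intense contact possible, but the key carries a transmission risk value of 0.
intense_but_trl_zero = total_risk_score(8, 8, 8, 0)
# Same contact with the lowest non-zero transmission risk value.
intense_with_trl_one = total_risk_score(8, 8, 8, 1)

shown_as_high_risk = intense_but_trl_zero >= MINIMUM_RISK_SCORE
```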

daimpi on 7 Nov 2020

ijansch on 8 Nov 2020

👍3

@ijansch thanks for sharing this document, this looks promising and I've left you some comments 🙂.
One thing which you're not currently mentioning there is whether something like RPI density (cf. this comment above) should also factor in when trying to determine the "source event risk". It's not entirely clear to me whether or not this would actually help with the detection of such source risks, probably some modeling would be required to figure this out.
Another thing which could have nice synergies with this approach is "venue registration" (e.g. #138). If CWA could communicate the timeframe of "venue presence" of the user to ENF, and ENF found a potential "exposed" event during the venue presence that could be taken as further evidence of a cluster event and should probably be communicated to the health authorities and the venue.

Btw: feel free to join us on our community Slack 🙂.

daimpi on 10 Nov 2020
