@henrikstranneheim @moonso
@AnnHam alerted me to a presumably causative frameshift variant in a case with a low-ish rank score (12). Not hopeless, but it could be better. In particular, she noticed that there was no loss-of-function (gene intolerance) contribution, although the pLI in ExAC is very close to 1.
Looking at the 1.20 model, I'm confused: it appears the scoring has been inverted, so that genes with low intolerance get the higher score and vice versa. Do you do some additional conversion in or after the VEP LoFtool annotation, or might this have been a slip of the keyboard?
You are right, we seem to have reversed the scales. I have updated production to:
[Gene_intolerance_score]
field = INFO
data_type = float
category = Gene_intolerance_prediction
record_rule = max
separators = None
info_key = CSQ
csq_key = LoFtool
description = Exac gene intolerance prediction
[[not_reported]]
score = 0
[[low]]
score = 0
lower = 0
upper = 0.0001
[[medium_pos]]
score = 1
lower = 0.0001
upper = 0.01
[[high_pos]]
score = 2
lower = 0.01
upper = 1
Great - much better. I wonder whether the limits for medium and high should also be shifted upwards. Perhaps >0.98-0.99 or so for high? E.g. the gene in question had 0.9999999999997 or so.
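To illustrate the concern, here is a minimal Python sketch of range-based scoring with the updated production bins. It is not the actual rank-model implementation: the `score_value` helper is hypothetical, and the boundary handling (lower bound inclusive, upper bound exclusive, top bin also including its upper bound) is an assumption made for illustration.

```python
# Hypothetical helper, not taken from the scoring tool itself.
# Assumption: lower bound inclusive, upper bound exclusive,
# except that the last bin also includes its upper bound.

PRODUCTION_BINS = [
    # (name, lower, upper, score), copied from the config above
    ("low", 0.0, 0.0001, 0),
    ("medium_pos", 0.0001, 0.01, 1),
    ("high_pos", 0.01, 1.0, 2),
]

NOT_REPORTED_SCORE = 0


def score_value(value, bins, not_reported=NOT_REPORTED_SCORE):
    """Return the score of the bin containing value, or not_reported."""
    if value is None:
        return not_reported
    last = len(bins) - 1
    for i, (_name, lower, upper, score) in enumerate(bins):
        in_bin = lower <= value < upper or (i == last and value == upper)
        if in_bin:
            return score
    return not_reported


# A barely constrained value and the gene in question land in the same bin.
for v in (0.02, 0.5, 0.9999999999997):
    print(v, score_value(v, PRODUCTION_BINS))
```

Under these bins every value from 0.01 upwards scores 2, so a gene at 0.02 gets the same contribution as the gene in question.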
The LoF score will not contribute much to the rank score result. Any other suggestions on how to push these kinds of variants even higher?
Let's do it for MIP7.0 and add more weight to the really intolerant ones
How about this:
[gene_intolerance_score]
field = INFO
data_type = float
category = gene_intolerance_prediction
record_rule = max
separators = None
info_key = CSQ
csq_key = LoFtool
description = Exac gene intolerance prediction
[[not_reported]]
score = 0
[[low_intolerance]]
score = 0
lower = 0
upper = 0.98
[[medium_intolerance]]
score = 2
lower = 0.98
upper = 0.99
[[high_intolerance]]
score = 4
lower = 0.99
upper = 1
Looks good to me, but the arbitrariness is a little scary. We can't increase the score too much, as this is a per-gene score that is dealt out regardless of whether the variant has any reasonable chance of causing a loss of function. But this should have shoved the variant in question up to 16. Many of these will also receive an additional new indel CADD contribution from MIP7 onwards.
Based on some rather general statements (e.g. ExAC FAQ "We consider pLI >= 0.9 as an extremely LoF intolerant set of genes.") I'd suggest making the medium category 0.9 <= pLI <= 0.99.
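As a rough sketch of what that change would mean, the Python below compares the bins proposed above (medium from 0.98) with the suggested ones (medium from 0.9). `make_scorer` is a hypothetical helper, not part of the scoring tool; inclusive lower bounds and input values in [0, 1] are assumptions, and the base score of 12 is the one reported at the top of this thread.

```python
# Hypothetical comparison of the two threshold sets discussed in this thread.
# Assumptions: lower bounds are inclusive and input values lie in [0, 1].

def make_scorer(medium_lower, high_lower=0.99):
    """Build a scorer: low -> 0, medium -> 2, high -> 4, missing -> 0."""
    def scorer(value):
        if value is None:  # corresponds to [[not_reported]]
            return 0
        if value >= high_lower:
            return 4
        if value >= medium_lower:
            return 2
        return 0
    return scorer


proposed = make_scorer(0.98)   # medium_intolerance from 0.98
suggested = make_scorer(0.9)   # medium_intolerance from 0.9

BASE_SCORE = 12  # rank score of the variant before any intolerance contribution

# Columns: value, score with proposed bins, score with suggested bins.
for v in (0.5, 0.95, 0.985, 0.9999999999997):
    print(v, proposed(v), suggested(v))

# The variant in question ends up in the high bin either way: 12 + 4.
print(BASE_SCORE + suggested(0.9999999999997))
```

Only genes between 0.9 and 0.98 change category, from a contribution of 0 to 2; the gene in question still receives 4 with either set of limits.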
Done
Ok; closing this for now then. Let's revisit frameshift ranking when we have validation cases from MIP7 at hand!