Scout: LoFtool / pLI score rank confusion?

Created on 18 Jan 2019 · 9 Comments · Source: Clinical-Genomics/scout

@henrikstranneheim @moonso
@AnnHam alerted me to a presumably causative frameshift variant in a case with a somewhat low rank score (12). Not hopeless, but it could be better. In particular, she noticed that there was no loss-of-function (gene intolerance) contribution, although the pLI in ExAC is very close to 1.

Looking at the 1.20 model, I'm confused: it appears the scoring has been inverted, so that genes with low intolerance get the higher score and vice versa. Do you do some additional conversion in or after the VEP LoFtool annotation, or might this have been a slip on the keyboard?

done

All 9 comments

You are right, we seem to have reversed the scales. I have updated production to:

  • rank_model_cmms_-v1.21-.in
  • svrank_model_cmms_-v1.4-.ini
    with
[Gene_intolerance_score]
  field = INFO
  data_type = float
  category = Gene_intolerance_prediction
  record_rule = max
  separators = None
  info_key = CSQ
  csq_key = LoFtool
  description = Exac gene intolerance prediction

  [[not_reported]]
    score = 0

  [[low]]
    score = 0
    lower = 0
    upper = 0.0001

  [[medium_pos]]
    score = 1
    lower = 0.0001
    upper = 0.01

  [[high_pos]]
    score = 2
    lower = 0.01
    upper = 1
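
To make the binning in the snippet above concrete, here is a minimal Python sketch of how a reported value could be mapped to a score. This is not genmod's actual implementation: the helper name `score_from_bins` is hypothetical, and the boundary handling (half-open intervals `[lower, upper)`, with the top bin also including its upper limit) is an assumption, since the model file does not spell it out.

```python
from typing import Optional

# (name, lower, upper, score), copied from the v1.21 snippet above.
BINS_V1_21 = [
    ("low", 0.0, 0.0001, 0),
    ("medium_pos", 0.0001, 0.01, 1),
    ("high_pos", 0.01, 1.0, 2),
]
NOT_REPORTED_SCORE = 0  # [[not_reported]] score


def score_from_bins(value: Optional[float], bins=BINS_V1_21) -> int:
    """Return the score of the bin that contains `value`.

    Assumption: bins are half-open [lower, upper), except that the
    highest bin also includes its upper limit.
    """
    if value is None:
        return NOT_REPORTED_SCORE
    top = max(upper for _name, _lower, upper, _score in bins)
    for _name, lower, upper, score in bins:
        if lower <= value < upper or (value == upper == top):
            return score
    # Value outside every bin: treated like a missing annotation here.
    return NOT_REPORTED_SCORE


if __name__ == "__main__":
    for example in (None, 0.00005, 0.005, 0.5, 1.0):
        print(example, score_from_bins(example))
```

Under these assumptions a value of 0.005 lands in `medium_pos` and anything from 0.01 up to and including 1 lands in `high_pos`.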

Great, much better. I wonder whether the limits for medium and high should also be shifted upwards, perhaps to >0.98-0.99 or so for high? E.g. the gene in question had 0.9999999999997 or so.

The LoF score will not contribute that much to the rank score. Any other suggestions on how to push these kinds of variants up even more?

Let's do it for MIP7.0 and add more weight to the really intolerant ones

How about this:

[gene_intolerance_score]
  field = INFO
  data_type = float
  category = gene_intolerance_prediction
  record_rule = max
  separators = None
  info_key = CSQ
  csq_key = LoFtool
  description = Exac gene intolerance prediction

  [[not_reported]]
    score = 0

  [[low_intolerance]]
    score = 0
    lower = 0
    upper = 0.98

  [[medium_intolerance]]
    score = 2
    lower = 0.98
    upper = 0.99

  [[high_intolerance]]
    score = 4
    lower = 0.99
    upper = 1
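
As a rough check of what this proposal would do for the gene in question, here is a small Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, bins are treated as half-open `[lower, upper)` with the top bin including 1, `record_rule = max` is read as "take the highest value when several are reported for a variant", and the total rank score is assumed to be a plain sum of category scores.

```python
from typing import Iterable, Optional

# (name, lower, upper, score), copied from the proposal above.
PROPOSED_BINS = [
    ("low_intolerance", 0.0, 0.98, 0),
    ("medium_intolerance", 0.98, 0.99, 2),
    ("high_intolerance", 0.99, 1.0, 4),
]
NOT_REPORTED_SCORE = 0


def bin_score(value: float, bins=PROPOSED_BINS) -> int:
    """Score of the bin containing `value` (assumed half-open bins)."""
    top = max(upper for _name, _lower, upper, _score in bins)
    for _name, lower, upper, score in bins:
        if lower <= value < upper or (value == upper == top):
            return score
    return NOT_REPORTED_SCORE


def gene_intolerance_score(values: Iterable[Optional[float]]) -> int:
    """Apply the assumed `record_rule = max`, then look up the bin."""
    reported = [v for v in values if v is not None]
    if not reported:
        return NOT_REPORTED_SCORE
    return bin_score(max(reported))


if __name__ == "__main__":
    contribution = gene_intolerance_score([0.9999999999997])
    # Previous rank score of 12 had no gene intolerance contribution.
    print(contribution, 12 + contribution)
```

With a value of 0.9999999999997 the variant falls in `high_intolerance` and gets 4 points, so a rank score of 12 would become 16 if the scores simply add up.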

Looks good to me, but the arbitrariness is a little scary. We can't increase the score too much, as this is a per-gene score that is dealt out regardless of whether the variant has any reasonable chance of causing a loss of function. But this should have shoved the variant in question up to 16. Many of these will also receive an additional new indel CADD contribution from MIP7 onwards.

Based on some rather general statements (e.g. ExAC FAQ "We consider pLI >= 0.9 as an extremely LoF intolerant set of genes.") I'd suggest making the medium category 0.9 <= pLI < 0.99.
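
To see which genes this suggestion would affect, here is a hedged Python sketch comparing a medium lower limit of 0.98 with one of 0.9. The helper names are hypothetical, the scores (0, 2, 4) and the 0.99 limit for high are assumed to stay unchanged, and bins are again assumed to be half-open with the top bin including 1.

```python
def make_bins(medium_lower: float):
    """Build (lower, upper, score) bins for a given medium lower limit."""
    return [
        (0.0, medium_lower, 0),   # low intolerance
        (medium_lower, 0.99, 2),  # medium intolerance
        (0.99, 1.0, 4),           # high intolerance
    ]


def bin_score(value: float, bins) -> int:
    """Score of the bin containing `value` (assumed half-open bins)."""
    top = bins[-1][1]
    for lower, upper, score in bins:
        if lower <= value < upper or (value == upper == top):
            return score
    return 0


def changed_scores(values, old_bins, new_bins):
    """Map each value whose score differs to (old_score, new_score)."""
    changes = {}
    for value in values:
        old, new = bin_score(value, old_bins), bin_score(value, new_bins)
        if old != new:
            changes[value] = (old, new)
    return changes


if __name__ == "__main__":
    samples = [0.5, 0.9, 0.95, 0.985, 0.9999999999997]
    print(changed_scores(samples, make_bins(0.98), make_bins(0.9)))
```

Only values from 0.9 up to (but not including) 0.98 change, moving from 0 to 2 points; the highly intolerant genes keep their 4 points.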

Done

Ok; closing this for now then. Let's revisit frameshift ranking when we have validation cases from MIP7 at hand!
