Scout: transcript listed twice with different cDNA change

Created on 3 Feb 2021 · 3 comments · Source: Clinical-Genomics/scout

This may be related to issue #2275, but in this case the same transcript ID is listed twice with different cDNA/protein changes for a variant:

(variant: https://scout.scilifelab.se/cust003/19256-2/15833b2ad523d6e21ffada388e0e7dc3)
For some reason, transcript NM_001288653 gets two entries in the transcript table, and the variant's cDNA annotation differs between the two entries. The upper entry is correct. I'm hoping there is an easy solution :) Thanks guys.

question

ielvers

All 3 comments

Thanks again! This is not elegant, and is being worked on right now (https://github.com/Clinical-Genomics/scout/pull/2279), but to be fair to the current version, it is not exactly the same transcript. There are two things to notice here.

The first row is actually an ENSEMBL transcript (see the ID like ENST0000* - that's ENSEMBL for you). For convenience, it lists the RefSeq transcripts that ENSEMBL thinks are similar to it. As this is their hg19 version, it is rather inexactly mapped, and we can't really complain about it since it is no longer the current version. 😿

The hg38 version is much cleaner, and in particular contains MANE status, which carefully curates the mapping between RefSeq and ENSEMBL. See how the hg19 row, for instance, would indicate that NM_004859 and NM_001288653 are the same, which they of course are not.

The second row is a real RefSeq transcript, including the version number, which will allow you to actually find it. Without a version, it could just be whatever aa change anyway, although this particular one seems to be .1. Now, the latest is actually NM_001288653.2, dated 16 Dec 2020, which then likely came out after the MIP references were last updated.
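To make the point about version numbers concrete, here is a minimal sketch of splitting a RefSeq ID into accession and version and checking whether an annotated version trails the latest one. The names `RefSeqId`, `parse_refseq_id` and `is_outdated` are made up for illustration; this is not Scout's or VEP's code.

```python
from typing import NamedTuple, Optional


class RefSeqId(NamedTuple):
    """Hypothetical container: accession plus optional version."""
    accession: str
    version: Optional[int]


def parse_refseq_id(transcript_id: str) -> RefSeqId:
    """Split e.g. 'NM_001288653.1' into ('NM_001288653', 1).

    An ID without a '.<number>' suffix gets version None.
    """
    accession, _, version = transcript_id.partition(".")
    return RefSeqId(accession, int(version) if version.isdigit() else None)


def is_outdated(annotated: str, latest: str) -> bool:
    """True if both IDs share an accession and the annotated version is older.

    An unversioned ID cannot be compared, so it is never reported as outdated.
    """
    a, b = parse_refseq_id(annotated), parse_refseq_id(latest)
    if a.accession != b.accession or a.version is None or b.version is None:
        return False
    return a.version < b.version


annotated = parse_refseq_id("NM_001288653.1")
unversioned = parse_refseq_id("NM_001288653")
outdated = is_outdated("NM_001288653.1", "NM_001288653.2")
```

With the IDs from this issue, the versioned entry parses to accession `NM_001288653` with version 1, the unversioned one has no version to compare, and `.1` is flagged as older than `.2`.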

tl;dr: sorry for the mess, but it will autocorrect in hg38. And we have another workaround in the works to try to make it one more step better, though with garbage in, we will not be 100% certain.

dnil on 3 Feb 2021

Hi @ielvers,

would it be ok to close this? I hope you enjoyed my meandering essay, because here goes another!

I can't see anything out of the ordinary here. There are more than a few things to keep in mind regarding transcripts, so I am more than sympathetic to some confusion. Let's take it step by step to figure out what is going on in this table:

1) The primary transcript for the gene from HGNC is NM_004859 - see genenames.org:
[Screenshot 2021-02-09 at 18 30 59]
The closest transcript that matches is ENST00000269122, which ENSEMBL loosely maps as its most similar to RefSeq NM_004859. Hence the color and badge for primary.

2) ENST00000269122 is also the closest match from ENSEMBL to a few other RefSeq transcripts (according to VEP), including NM_001288653 (no version is given here, but given that it is closest to so many different RefSeq transcripts, the version is probably not overly important here).

3) MIP/VEP also annotated the variant on NM_001288653.1 for good measure. This is a slightly outdated version compared to what is in GenBank (NM_001288653.2), but hey, that was released just recently, and it is versioned so no error really.
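The three steps above can be sketched in a few lines: one ENSEMBL row that lists its loosely mapped RefSeq IDs, plus one row for the versioned RefSeq transcript itself. The `rows` structure and its keys (`transcript_id`, `refseq_ids`) are assumptions for illustration, not Scout's actual schema. Grouping by accession with the version stripped shows why NM_001288653 appears to be listed twice.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical rows mirroring the transcript table discussed in this issue.
rows = [
    # ENSEMBL transcript with the RefSeq IDs it is loosely mapped to (hg19).
    {"transcript_id": "ENST00000269122",
     "refseq_ids": ["NM_004859", "NM_001288653"]},
    # The versioned RefSeq transcript annotated on its own.
    {"transcript_id": "NM_001288653.1",
     "refseq_ids": ["NM_001288653.1"]},
]


def rows_by_accession(table: List[dict]) -> Dict[str, List[str]]:
    """Map each RefSeq accession (version stripped) to the rows listing it."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for row in table:
        for refseq_id in row["refseq_ids"]:
            accession = refseq_id.split(".")[0]
            if row["transcript_id"] not in grouped[accession]:
                grouped[accession].append(row["transcript_id"])
    return dict(grouped)


grouped = rows_by_accession(rows)
# Accessions that show up in more than one row look like "duplicates".
repeated = {acc: ids for acc, ids in grouped.items() if len(ids) > 1}
```

Here `repeated` holds only NM_001288653, pointing at both the ENSEMBL row and the RefSeq-only row, so the two entries are distinct transcripts that happen to share a RefSeq label.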

This will still (mostly) autocorrect in hg38, for which there is an extensively manually curated mapping between RefSeq and ENSEMBL (MANE). We are indeed working on an update that will flag the best available match to the primary transcript, allowing that flag to go to the RefSeq-only entry if appropriate. It will not make a difference for this particular example, though, as the RefSeq transcript listed on its own above is not the HGNC primary.

Hope this helps and was not even more confusing!

dnil on 9 Feb 2021

Thanks for the explanations, and for all your work! Feel free to close the issue.

ielvers on 10 Feb 2021

😄1 👍1
