A large insertion (>25 bp) is likely to be reported as a structural variant by the Archer® pipeline and will not receive correct HGVS nomenclature. This web app, built with Python and Flask, automatically generates HGVS-format nomenclature for FLT3-ITD (≥18 bp). Its purpose is to make it easy to name FLT3-ITDs appropriately from sequencing results generated with the Archer® VariantPlex® Myeloid panel. It also reports more details of the ITD at the protein level. Assembled sequences from other methods, and RNA-seq results, should also be named correctly by this script, but I have not tested any.
Visit the HGVS website for current recommendations on sequence variant nomenclature.
1. The input is a sequencing result (the assembled read, consensus read, or “reference read”) from JBrowse (Archer pipeline). Assembled sequences obtained by other methods should work too.
2. For insertions ≥18 bp only. Smaller insertions should be named correctly by Archer or other analysis pipelines.
3. Mutations/read errors are allowed at the insertion site. Any sequence change makes the variant an insertion instead of a "duplication", and the variant sequence is added at the 5'-end, before the partial duplication (per the 3'-end rule).
4. If there is no sequence variant at the insertion site, the ITD is reported as a duplication, and the duplication start-end (c.DNA or p.AA number) is included in the nomenclature. Otherwise it is reported as an insertion, the partial duplication is also reported, and the full duplication/insertion sequences are reported on separate lines.
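The duplication-versus-insertion rule in items 3 and 4 can be sketched in a few lines of Python. This is only an illustration, not the app's actual code: the function name `describe_itd` and its parameters are hypothetical, the clean-duplication branch uses the standard HGVS `dup` syntax, and the mismatch branch simply mirrors the "insATC;dup1825_1836" example given in the update history.

```python
def describe_itd(dup_start: int, dup_end: int, variant_seq: str = "") -> str:
    """Return an illustrative c.-level label for an ITD.

    dup_start, dup_end: c.DNA positions of the duplicated segment (hypothetical inputs).
    variant_seq: mismatched bases found at the insertion site, if any.
    """
    if dup_start > dup_end:
        raise ValueError("dup_start must not exceed dup_end")

    variant_seq = variant_seq.upper()
    if not variant_seq:
        # No sequence variant at the insertion site: report a duplication
        # with its start-end positions.
        return f"c.{dup_start}_{dup_end}dup"

    # Any sequence change makes it an insertion; the variant bases come
    # first (5'-end), followed by the partial duplication.
    return f"ins{variant_seq};dup{dup_start}_{dup_end}"


if __name__ == "__main__":
    print(describe_itd(1825, 1836))
    print(describe_itd(1825, 1836, "ATC"))
```

The real script also reports protein-level and chromosome-level names, which this sketch does not attempt.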
Steps 1-3 describe copying the sequencing result from JBrowse. If you have another way to copy your sequencing result, go directly to step 4.
1. In the Archer pipeline, click Visualize plot, then click the “Visualize” button; JBrowse will open in another window/tab.
2. JBrowse gives you the assembled sequence.
3. Under Reference Sequence pull-down menu, choose Save track data -> View FASTQ sequence.
4. Copy the sequence read into the open box (do not copy any character other than A, T, G, C; otherwise the script will not work).
5. Click the “Get FLT3-ITD Nomenclature” button.
6. The nomenclature text will appear in another window/tab. You just need to paste it into your report.
7. If you know there is an insertion but the script generates no result, this suggests that there is some mutation in the ITD. You will need to generate a name manually.
8. Version 4.0 added a dynamic link for submitting the FLT3-ITD sequence: /nucleotideSequence/
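Because step 4 requires a sequence made up only of nucleotide letters, it can help to check the copied text before pasting it. The following is a minimal sketch of such a check, not the app's own validation: the function name `check_sequence` is hypothetical, stripping whitespace is a convenience assumed here, and accepting N with a warning follows the v3.3 note in the update history.

```python
from typing import List, Tuple

ALLOWED = set("ATGCN")  # N is accepted per the v3.3 update note, with a warning


def check_sequence(raw: str) -> Tuple[str, List[str]]:
    """Clean a pasted read and return (sequence, warnings).

    Raises ValueError if any character other than A, T, G, C, or N remains
    after whitespace is removed.
    """
    # Drop spaces and line breaks that often come along when copying text.
    seq = "".join(raw.split()).upper()

    bad = sorted(set(seq) - ALLOWED)
    if bad:
        raise ValueError(f"Unexpected characters in sequence: {''.join(bad)}")

    warnings = []
    if "N" in seq:
        warnings.append("Sequence contains N; review the resulting name carefully.")
    return seq, warnings


if __name__ == "__main__":
    print(check_sequence("atgc\nGGTA"))
```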
App update history:
V2.10 (web app) on 11/18/2021
1. Make the final nomenclature match HGVS format as much as possible.
2. Excise intron 14-15 before translation when 1) the full-length intron, with flanking splice signals at both exon/intron junctions, is present and 2) the intronic region translates to the correct AAs (i.e., any NT variants present away from the junctions do not change the translated AAs). However, if only a partial intronic sequence is present, or a splice signal is disrupted (at the intron-exon junctions), the intronic sequence is translated as inserted AA sequence.
3. Resolve the issue of an incorrect c.DNA number when the dup/ins extends into the intronic region. NTs in intron 14-15 are numbered c.1837+N or c.1838-N (N<46).
4. When mutation/mismatch nucleotides are present at the insertion site, the nomenclature still reflects a partial duplication at the NT, AA, and chr. levels (e.g., "insATC;dup1825_1836" indicates a 3 bp insertion followed by duplication c.1825_1836).
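The intronic numbering in item 3 can be illustrated with a short sketch. It is not taken from the app: the function name `intron_label` is hypothetical, and the default intron length of 90 nt is an assumption inferred from the "N<46" statement (45 positions counted from each flanking exon). Positions in the first half of the intron are counted forward from c.1837 and positions in the second half are counted backward from c.1838, following the general HGVS convention for intronic positions.

```python
LAST_EXON14_C = 1837   # last coding position before intron 14-15 (from the text)
FIRST_EXON15_C = 1838  # first coding position after intron 14-15 (from the text)


def intron_label(offset: int, intron_len: int = 90) -> str:
    """Label the nucleotide at 1-based `offset` within intron 14-15.

    intron_len=90 is an assumption inferred from "N<46"; adjust if needed.
    """
    if not 1 <= offset <= intron_len:
        raise ValueError("offset must lie within the intron")

    # First half (including the central base of an odd-length intron)
    # is numbered from the upstream exon with '+'.
    first_half = (intron_len + 1) // 2
    if offset <= first_half:
        return f"c.{LAST_EXON14_C}+{offset}"

    # Second half is numbered from the downstream exon with '-'.
    return f"c.{FIRST_EXON15_C}-{intron_len - offset + 1}"


if __name__ == "__main__":
    for pos in (1, 45, 46, 90):
        print(pos, intron_label(pos))
```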
Updated in v3.1 on 02/22/2022:
Completed validation on 114 assembled sequences (results from 84 cases) and 10 artificial sequences.
Added a reference (USCAP abstract; updated on 7/11/2022 to the published IJLH article, PMID: 35795913, DOI: 10.1111/ijlh.13930).
Moved full sequence to "Additional information" (not required per HGVS recommendations).
Added a statement about potential 3'-end variant(s).
Added a statement regarding protein (p.) nomenclature when intronic sequence is present in the ITD (the translation is presumptive, to the best of our knowledge).
Updated in v3.2 on 04/12/2022: Added an alert if the ITD is a triplicate (very rare situation, validated with the 11th artificial sequence).
Updated in v3.3 on 04/18/2022:
Added N to the letters recognized as nucleotide sequence; when N is present in the ITD, a warning is given.
Corrected wording errors in the text shown when no ITD is detected.
Updated in v4.0 on 02/28/2023:
Added dynamic link /nucleotideSequence/your_sequence to submit an FLT3-ITD sequence and return the nomenclature as a JSON file.
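A script could call the dynamic link roughly as follows. This is a sketch under stated assumptions: the host name is not given here, so `base_url` is a placeholder you must supply; the function names `build_url` and `fetch_nomenclature` are hypothetical; and the structure of the returned JSON is not documented in this text, so the parsed object is returned as-is.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def build_url(base_url: str, sequence: str) -> str:
    """Build the /nucleotideSequence/<sequence> URL for a given host."""
    # base_url is a placeholder; the app's real address is not stated here.
    return f"{base_url.rstrip('/')}/nucleotideSequence/{quote(sequence, safe='')}"


def fetch_nomenclature(base_url: str, sequence: str, timeout: float = 30.0):
    """Request the nomenclature and return the parsed JSON (structure unknown)."""
    with urlopen(build_url(base_url, sequence), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only prints the URL; no network request is made in this example.
    print(build_url("https://example.org/", "ATGC"))
```

Very long reads may run into URL length limits in some browsers or servers; in that case the web form remains the safer route.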