MutSpliceDB

MutSpliceDB: A Database of Splice Sites Variants

MutSpliceDB documents mutation effect(s) on splicing (such as exon inclusion/exclusion or intron retention) based on RNA-seq BAM files from sample(s) with particular splice site mutations.

The research community can propose additional splice site mutations for inclusion in this public resource when RNA-seq based evidence is available.

Access MutSpliceDB


Inquiries and Evidence Submission

Email Dr. Dmitriy Sonkin (dmitriy.sonkin@nih.gov).

Publication

Palmisano A, Vural S, Zhao Y, Sonkin D. MutSpliceDB: A database of splice sites variants with RNA-seq based evidence on effects on splicing. Hum Mutat. 2021;42(4):342-345. doi:10.1002/humu.24185 [PubMed Abstract]

Disclaimer

MutSpliceDB is a free resource developed by the Computational and Systems Biology Branch (Biometric Research Program, DCTD/NCI) and is intended for research purposes only. It should NOT be used for emergencies or for medical or professional advice.

About MutSpliceDB

Splice site mutations are a well-known class of genetic alterations that play an important role in biology. In cancer, splice site mutations are most frequently observed as inactivating alterations in tumor suppressor genes (for example, TP53 or RB1) and, to a lesser degree, as activating alterations in oncogenes (for example, MET). Splice site mutations may alter mRNA transcripts, causing, for example, exon inclusion/exclusion or intron retention. Interpreting the consequences of a specific splice site mutation is not straightforward, especially if the mutation is located outside of the canonical splice sites. Accurate interpretation of the impact of a splice site mutation can further our understanding of biology, influence patient treatment, and, in the case of germline splice site mutations, may also have relevance to familial disease predisposition.

To facilitate the interpretation of splice site mutation effects, we developed MutSpliceDB: a database of splice sites variants, documenting mutation effect(s) on splicing based on RNA-seq BAM files from sample(s) with particular splice site mutations.

For each splice site mutation, the resource contains the following information: 

  • gene symbol;
  • Entrez gene ID;
  • HGVS-compliant transcript-based variant notation;
  • allele registry ID;
  • description of the splicing effect;
  • sample name;
  • sample source;
  • name of RNA-seq BAM file;
  • splicing effect image snapshot;
  • mini BAM file with reads only for the relevant gene (if there are no restrictions on nucleotide-level data distribution);
  • if the RNA-seq BAM file does not contain reads with the splice site mutation (e.g., due to exon skipping), the name of the BAM file with DNA sequencing data.

All entries in MutSpliceDB are based on publicly available RNA-seq BAM files. The initial release of MutSpliceDB (2019) contained detailed information for a subset of splice site mutations derived from publicly available RNA-seq data from the Cancer Cell Line Encyclopedia (CCLE) and The Cancer Genome Atlas (TCGA). We add information for more splice site mutations as soon as the necessary evidence becomes available.
 

How to explore the MutSpliceDB database:

1. On the MutSpliceDB main web page, click the "Access MutSpliceDB" link to open the resource database (we recommend using the Google Chrome or Mozilla Firefox web browser to access MutSpliceDB);

2. The database table lists the genetic alterations included in the resource. The table has dynamic controls for sorting (click a column header) and for filtering by search terms. The table can be exported as CSV or Excel, and active web links connect each entry to the corresponding GeneCards, NCBI, and ClinGen Allele Registry entries.
To explore the details of a specific entry, click the "BAM file(s) page" link to open the page with supporting evidence.

3. Each entry contains the supporting evidence for the splice variant. The evidence table can contain multiple rows, each with links to external resources (for example, GDC or CCLE) and to files (image and mini BAM) that researchers can download. Users who do not want to download the data can view the mini BAM file in the web version of IGV by clicking the provided link.

How to Submit an Entry

MutSpliceDB is open for submissions from the molecular genetics community. Requests to add an entry to MutSpliceDB should be addressed to Dr. Dmitriy Sonkin (dmitriy.sonkin@nih.gov) and should contain the following:

  • All the splice site mutation information listed in the About MutSpliceDB section above,
  • Image snapshots, and
  • Mini BAM files (if there are no restrictions on nucleotide-level data distribution) obtained as explained below.

Image Snapshot Requirements

Image snapshot files should show the splicing effect of the mutations and contain the following information: 

  • Gene Symbol,
  • Relevant exon numbers and HGVS-compliant transcript-based variant notation, and
  • MANE Select/Plus transcript ID, if possible.

Image snapshot filenames should have the following structure: SampleName_GeneSymbol_AlleleRegistryID.jpeg.

For example, an image showing the splicing effects of the TP53 mutation NM_000546.5:c.375+5G>A (Allele Registry ID CA645589233) in cell line PK-45H should have the following name: PK-45H_TP53_CA645589233.jpeg. The Allele Registry ID for a variant can be found or generated using the ClinGen Allele Registry.
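The naming rule above can be sketched as a small shell helper. This is only an illustration: the function name snapshot_name is hypothetical and not part of MutSpliceDB, and it simply joins the three parts in the order the rule gives.

```shell
# Hypothetical helper (not part of MutSpliceDB): build an image snapshot
# filename of the form SampleName_GeneSymbol_AlleleRegistryID.jpeg.
# $1: sample name, $2: gene symbol, $3: ClinGen Allele Registry ID
snapshot_name() {
    printf '%s_%s_%s.jpeg\n' "$1" "$2" "$3"
}

# Example from the text above
snapshot_name "PK-45H" "TP53" "CA645589233"
# prints: PK-45H_TP53_CA645589233.jpeg
```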

Mini BAM File Requirements

Mini RNA-seq BAM filenames should have the following structure: RNAseqBamFileName_GeneSymbol_mini.bam, where RNAseqBamFileName is the name of the source RNA-seq BAM file without the .bam extension.

For example, the mini BAM file for cell line PK-45H with a TP53 mutation should have the following name: G27478.PK-45H.2_TP53_mini.bam. In this case, G27478.PK-45H.2 is taken from the CCLE RNA-seq BAM file name G27478.PK-45H.2.bam.
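As a rough sketch of this rule, the mini BAM name can be derived from the source BAM file name with basename. The function name mini_bam_name is hypothetical, and the sketch assumes the source file name ends in .bam.

```shell
# Hypothetical helper (not part of MutSpliceDB): derive the mini BAM filename
# from the source RNA-seq BAM filename and the gene symbol.
# $1: RNA-seq BAM path or filename (assumed to end in .bam), $2: gene symbol
mini_bam_name() {
    # basename strips any leading directory and the trailing .bam suffix
    base=$(basename "$1" .bam)
    printf '%s_%s_mini.bam\n' "$base" "$2"
}

# Example from the text above
mini_bam_name "G27478.PK-45H.2.bam" "TP53"
# prints: G27478.PK-45H.2_TP53_mini.bam
```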

To create the mini BAM files using Samtools, follow the steps below:

  • samtools view -b RNA-seq.bam chr:start-end > mini.bam
  • samtools index mini.bam

The RNA-seq.bam file must be coordinate-sorted and indexed. The commands above create a sorted mini BAM file (mini.bam) and the corresponding index file (mini.bam.bai). In the samtools view command, replace 'chr' with the chromosome name as it appears in the BAM header (for example, 17 or chr17), 'start' with the genomic position 100 bp before the start of the first coding exon, and 'end' with the genomic position 100 bp after the end of the last coding exon. Select the first and last coding exons so that the region covers all existing gene isoforms.
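The region arithmetic described above can be sketched in shell as follows. The function names padded_region and make_mini_bam are hypothetical, and the coordinates in the example call are placeholders rather than real exon positions; look up the actual coding exon boundaries for the gene and reference build in use.

```shell
# Hypothetical helper: build a samtools region string padded by 100 bp.
# $1: chromosome name as in the BAM header
# $2: start of the first coding exon, $3: end of the last coding exon
padded_region() {
    start=$(( $2 - 100 ))
    # samtools coordinates are 1-based, so do not go below position 1
    if [ "$start" -lt 1 ]; then
        start=1
    fi
    end=$(( $3 + 100 ))
    printf '%s:%s-%s\n' "$1" "$start" "$end"
}

# Hypothetical wrapper around the two samtools commands from the text.
# $1: sorted, indexed RNA-seq BAM, $2: region string, $3: output mini BAM
make_mini_bam() {
    samtools view -b "$1" "$2" > "$3"
    samtools index "$3"
}

# Placeholder coordinates, for illustration only
padded_region chr17 1000 5000
# prints: chr17:900-5100
```

With samtools installed, the two helpers could be combined as make_mini_bam RNA-seq.bam "$(padded_region chr17 1000 5000)" mini.bam, again with real coordinates in place of the placeholders.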

Selected Reference

  1. Palmisano A, Vural S, Zhao Y, Sonkin D. MutSpliceDB: A database of splice sites variants with RNA-seq based evidence on effects on splicing. Hum Mutat. 2021;42(4):342-345. doi:10.1002/humu.24185

    [PubMed Abstract]