Data and Software Resources
The Genomic Medicine Center has developed a variant database and several novel software tools for genome sequence analysis. The resources below are free for academic research use to help advance genomic research and medicine.
Children’s Mercy Variant Warehouse Database
Non-identifiable summary data on all variants identified at the Children’s Mercy Genomic Medicine Center are publicly available in the Children’s Mercy Variant Warehouse Database. The database can be searched and viewed with genomic annotations, cross-references to external databases such as ClinVar, gnomAD and dbSNP, ACMG curations and a local allele frequency. Variant data are available for bulk download as an annotated VCF. The database is updated quarterly.
VIKING software
VIKING (Variant Integration and Knowledge Interpretation in Genomes) is software for variant interpretation and reporting. VIKING allows medical geneticists to view variants for an individual patient or family along with the characterization information produced by the RUNES software. It enables users to review millions of variants detected in each individual by providing dynamic filters to search by variant category, gene, minor allele frequency and variant quality.
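The kind of filtering described above can be illustrated with a minimal Python sketch. This is not VIKING code: the `Variant` record, the `filter_variants` function, its parameter names, and the gene names and thresholds in the example are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Variant:
    gene: str
    category: int   # severity category, 1 (most damaging) to 5
    maf: float      # minor allele frequency, 0.0 to 1.0
    quality: float  # variant quality score reported by the caller


def filter_variants(
    variants: Iterable[Variant],
    categories: Optional[Set[int]] = None,
    genes: Optional[Set[str]] = None,
    max_maf: Optional[float] = None,
    min_quality: Optional[float] = None,
) -> List[Variant]:
    """Keep variants that pass every filter that is set; None disables a filter."""
    kept = []
    for v in variants:
        if categories is not None and v.category not in categories:
            continue
        if genes is not None and v.gene not in genes:
            continue
        if max_maf is not None and v.maf > max_maf:
            continue
        if min_quality is not None and v.quality < min_quality:
            continue
        kept.append(v)
    return kept


# Hypothetical example data.
variants = [
    Variant("GENE_A", 1, 0.001, 99.0),
    Variant("GENE_A", 4, 0.200, 80.0),
    Variant("GENE_B", 2, 0.004, 15.0),
    Variant("GENE_C", 2, 0.003, 60.0),
]

rare_damaging = filter_variants(
    variants, categories={1, 2}, max_maf=0.01, min_quality=30.0
)
print([v.gene for v in rare_damaging])  # ['GENE_A', 'GENE_C']
```

Treating each filter as optional mirrors the idea of dynamic filters: a reviewer can tighten or relax one criterion without touching the others.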
VIKING provides tools to analyze results from a patient and family members to identify potentially diagnostic variants using inheritance patterns. Once identified, variants of interest can be categorized and added to customized reports.
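As a rough illustration of inheritance-based analysis, the sketch below labels a variant in a proband-mother-father trio. It is a simplified assumption, not VIKING's method: it covers only two autosomal patterns, assumes both parents are unaffected, and uses a hypothetical `inheritance_pattern` function with genotypes encoded as alternate-allele counts.

```python
from typing import Dict, Optional, Tuple


def inheritance_pattern(proband: int, mother: int, father: int) -> Optional[str]:
    """Genotypes are alternate-allele counts: 0 = hom ref, 1 = het, 2 = hom alt."""
    if proband == 1 and mother == 0 and father == 0:
        return "de novo"
    if proband == 2 and mother == 1 and father == 1:
        return "homozygous recessive"
    return None  # no pattern from this simplified list applies


# Hypothetical trio calls: variant name -> (proband, mother, father).
trio_calls: Dict[str, Tuple[int, int, int]] = {
    "variant_1": (1, 0, 0),
    "variant_2": (2, 1, 1),
    "variant_3": (1, 1, 0),
}

candidates = {}
for name, genotypes in trio_calls.items():
    pattern = inheritance_pattern(*genotypes)
    if pattern is not None:
        candidates[name] = pattern

print(candidates)  # {'variant_1': 'de novo', 'variant_2': 'homozygous recessive'}
```

A real analysis would also need to handle compound heterozygosity, X-linked inheritance, affected relatives and genotype quality, all of which are omitted here.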
RUNES software
The Children's Mercy variant characterization pipeline (RUNES: Rapid Understanding of Nucleotide variant Effect Software) is a multi-stage analysis pipeline for annotating and classifying human nucleotide variation.
Characterization process
Characterization is divided into multiple independent stages that record zero or more annotations for each variant according to the type of characterization being performed by the stage. Variant effect is predicted using in silico prediction tools and cross-referencing with external databases. The stages include:
- ENSEMBL Variant Effect Predictor (VEP)
- CMH splice impact evaluator
- CMH transcript context characterizer
- Mitochondrial gene context characterizer
- Comparison with external databases:
  - dbSNP
  - Human Gene Mutation Database (HGMD)
  - ClinVar
  - Genome Aggregation Database (gnomAD)
  - Exome Aggregation Consortium (ExAC)
  - Catalog of Somatic Mutations in Cancer (COSMIC)
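The staged design described above, in which independent stages each record zero or more annotations per variant, can be sketched as follows. This is an illustrative assumption rather than RUNES code: the `Variant` and `Annotation` records, both toy stages, and the `TOY_DATABASE` lookup are hypothetical stand-ins for the real predictors and databases.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str


@dataclass(frozen=True)
class Annotation:
    stage: str
    finding: str


# A stage maps one variant to zero or more annotations.
Stage = Callable[[Variant], List[Annotation]]

# Hypothetical stand-in for an external database lookup.
TOY_DATABASE: Dict[Variant, str] = {
    Variant("1", 100, "A", "G"): "previously reported",
}


def database_stage(variant: Variant) -> List[Annotation]:
    hit = TOY_DATABASE.get(variant)
    return [Annotation("database", hit)] if hit is not None else []


def length_change_stage(variant: Variant) -> List[Annotation]:
    """Toy effect predictor: labels in/dels by whether the length change is a multiple of 3."""
    delta = abs(len(variant.ref) - len(variant.alt))
    if delta == 0:
        return []
    finding = "in-frame in/del" if delta % 3 == 0 else "frameshifting in/del"
    return [Annotation("length_change", finding)]


def characterize(variant: Variant, stages: List[Stage]) -> List[Annotation]:
    """Run every stage independently and pool the annotations."""
    annotations: List[Annotation] = []
    for stage in stages:
        annotations.extend(stage(variant))
    return annotations


STAGES = [database_stage, length_change_stage]
snv = characterize(Variant("1", 100, "A", "G"), STAGES)
indel = characterize(Variant("1", 200, "AT", "A"), STAGES)
unannotated = characterize(Variant("2", 300, "C", "T"), STAGES)
print(snv, indel, unannotated, sep="\n")
```

Because no stage reads another stage's output, stages can be added, removed or reordered without changing the rest of the pipeline.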
Variant classification
At the end of characterization, variant annotations are aggregated and submitted to a variant classifier. The classifier assigns a severity category to each variant based on the accumulated evidence, with the most damaging category kept as the final categorization. The categories and criteria include:
Category 1
Description: Previously reported, recognized cause of the disorder
Criteria:
- HGMD variant type of ‘Disease Mutant’
- dbSNP/ClinVar Snp Clinical Significance of ‘pathogenic’
Category 2
Description: Novel, of a type expected to cause the disorder
Criteria:
- loss of initiation
- premature stop codon
- disruption of stop codon
- whole transcript deletion
- frameshifting in/del
- disruption of splicing through deletion causing CDS/intron fusion
- overlap with splice donor or acceptor sites
Category 3
Description: Novel, may or may not be causative
Criteria:
- non-synonymous substitution
- in-frame in/del
- disruption of polypyrimidine tract
- overlap with 5’ exonic, 5’ flank, 3’ exonic, 5’ intronic or 3’ flank splice contexts
- overlap with mitochondrial gene
Category 4
Description: Novel, probably not causative of disease
Criteria:
- all variants not in categories 1 – 3
- synonymous AA changes
- pyrimidine substitutions in polypyrimidine tract
- other intronic variants
- dbSNP GMAF of greater than 0.02
- ExAC ethnicity specific allele frequency calculated from at least 2000 alleles greater than 0.02 and less than 0.05
Category 5
Description: Known neutral variant
Criteria:
- ClinVar Snp Clinical Significance of ‘non-pathogenic’ or ‘probably non-pathogenic’
- ExAC ethnicity specific allele frequency calculated from at least 2000 alleles greater than 0.05
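The rule that the most damaging category is kept can be sketched in Python as taking the lowest category number across all evidence. This is a simplified reading of the criteria above, not the RUNES classifier: the evidence labels, the function names and the handling of a frequency of exactly 0.05 (left unassigned, since the criteria do not specify it) are assumptions, and only a subset of the criteria is encoded.

```python
from typing import Iterable, List, Optional

DEFAULT_CATEGORY = 4  # "all variants not in categories 1 - 3"

# Hypothetical evidence labels mapped to the categories listed above.
EVIDENCE_CATEGORY = {
    "hgmd_disease_mutant": 1,
    "clinvar_pathogenic": 1,
    "premature_stop_codon": 2,
    "frameshifting_indel": 2,
    "non_synonymous_substitution": 3,
    "in_frame_indel": 3,
    "synonymous_change": 4,
    "clinvar_non_pathogenic": 5,
}


def frequency_category(exac_af: Optional[float], exac_alleles: int) -> Optional[int]:
    """Apply the ExAC thresholds; frequencies from fewer than 2000 alleles are ignored."""
    if exac_af is None or exac_alleles < 2000:
        return None
    if exac_af > 0.05:
        return 5
    if 0.02 < exac_af < 0.05:
        return 4
    return None


def classify(
    evidence: Iterable[str],
    exac_af: Optional[float] = None,
    exac_alleles: int = 0,
) -> int:
    """Return the most damaging (lowest-numbered) category supported by the evidence."""
    categories: List[int] = [EVIDENCE_CATEGORY[label] for label in evidence]
    from_frequency = frequency_category(exac_af, exac_alleles)
    if from_frequency is not None:
        categories.append(from_frequency)
    return min(categories) if categories else DEFAULT_CATEGORY


print(classify(["non_synonymous_substitution", "clinvar_pathogenic"]))  # 1
print(classify([], exac_af=0.10, exac_alleles=5000))                    # 5
print(classify([], exac_af=0.10, exac_alleles=100))                     # 4
```

The default category is applied only when no evidence is recorded; otherwise a variant whose sole evidence points to category 5 could never be classified as known neutral.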
Astrolabe software
Astrolabe is software for translating whole genome sequence data into pharmacogenetic information that can be used to guide medication selection, dosing and prescription. Clinicians and researchers can use this information to reduce adverse drug events and maximize medication efficacy.
Pharmacogenetic allele identification
Ongoing pharmacogenetic research defines drug/gene connections and maps drug response to specific alleles. Gene alleles are defined by specific genetic variants and are assigned identifiers. Together, these identifiers form an allele nomenclature that public and private resources use to connect variation to gene activity.
Astrolabe takes genetic variants and sequences generated by whole genome, whole exome and targeted panel sequencing and compares them to the catalog of pharmacogenetic alleles to determine which pair of alleles a patient carries.
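One simple way to picture this comparison is to enumerate every pair of catalog alleles and keep the pairs whose combined defining variants reproduce the observed genotype. The Python sketch below does exactly that under strong assumptions. It is not Astrolabe's algorithm: the catalog, the allele names (`*ref`, `*A`, `*B`, `*C`) and the variant identifiers are hypothetical, and real data would require tolerance for missing or novel variants rather than exact matching.

```python
from collections import Counter
from itertools import combinations_with_replacement
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical catalog: allele name -> defining variants ("*ref" has none).
CATALOG: Dict[str, FrozenSet[str]] = {
    "*ref": frozenset(),
    "*A": frozenset({"v1"}),
    "*B": frozenset({"v1", "v2"}),
    "*C": frozenset({"v3"}),
}


def expected_counts(first: str, second: str) -> Counter:
    """Alternate-allele count per variant if a patient carried this allele pair."""
    return Counter(CATALOG[first]) + Counter(CATALOG[second])


def call_diplotypes(observed: Dict[str, int]) -> List[Tuple[str, str]]:
    """Return every allele pair that exactly explains the observed unphased genotype.

    `observed` maps a variant identifier to its alternate-allele count (0, 1 or 2).
    """
    target = Counter({variant: n for variant, n in observed.items() if n > 0})
    names = sorted(CATALOG)
    return [
        (first, second)
        for first, second in combinations_with_replacement(names, 2)
        if expected_counts(first, second) == target
    ]


print(call_diplotypes({"v1": 2, "v2": 1}))  # [('*A', '*B')]
print(call_diplotypes({}))                  # [('*ref', '*ref')]
print(call_diplotypes({"v2": 1}))           # [] (no catalog pair explains v2 alone)
```

Returning a list rather than a single pair makes ambiguity explicit: unphased genotypes can be consistent with more than one pair of alleles, and an empty list signals a genotype that the catalog cannot explain.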
Astrolabe was initially developed for the CYP2D6 gene and then extended to CYP2C9 and CYP2C19; additional genes are in the process of being validated. Astrolabe is integrated with the PharmVar (Pharmacogene Variation Consortium) database, a National Institutes of Health-funded project produced as a collaboration between the Children’s Mercy Clinical Pharmacology program and the Center for Pediatric Genomic Medicine.
Access to software
The VIKING, RUNES, and Astrolabe software are freely available for academic research use through the Genomic Medicine Software Portal. For all other uses, please email bioinformatics@cmh.edu for license information.
For additional data resources available to approved researchers, visit our Genomic Answers for Kids Project Data page.