Part 4

Computational Toolkit

A comprehensive collection of computational tools used in STRC variant analysis and hearing loss genetics research. Most tools are freely accessible, which makes independent genetic variant research possible.

143
Total tools
45
Verified / Used
49
E1659A tested

AI & Foundation Models

1 tool

Structural Biology

22 tools

AlphaFold 3 Server

VERIFIED

Predicts 3D structures of protein complexes (protein-protein, protein-DNA, protein-ligand)

FREE

AlphaFold Database

VERIFIED

Provides predicted 3D protein structures for nearly every known protein

E1659A: pLDDT 68.75 — Moderate confidence at E1659
FREE

DDGun

TESTED

Untrained algorithm predicting the folding stability impact of amino acid substitutions.

API FREE

DUET

TESTED

Integrates mCSM and SDM into a consensus prediction of protein stability upon mutation.

API FREE

DynaMut

TESTED

Analyzes mutational impacts on protein dynamics and vibrational entropy.

E1659A: -0.913 kcal/mol — Destabilizing
API FREE

ELASPIC

DOWN

Evaluates the effect of mutations on protein folding and protein-protein interactions.

API FREE

ESMFold

VERIFIED

Predicts 3D protein structure from amino acid sequence alone

FREE

FoldX

TESTED

Calculates empirical energy terms for evaluating wild-type and mutant protein stability.

FREE

IFUM

TESTED

Jointly estimates absolute folding stability (ΔG) and equilibrium structural ensembles.

API FREE

IUPred3

VERIFIED

Predicts intrinsic disorder from protein sequence

FREE

Mol Star Viewer

VERIFIED

Interactive 3D protein structure visualization

FREE

NetGPI

VERIFIED

Predicts GPI-anchor signal presence

FREE

NetNGlyc

VERIFIED

Predicts N-linked glycosylation sites (NXS/NXT motifs)

FREE
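The NXS/NXT motif mentioned above can be located with a simple pattern scan. The sketch below is only a motif finder under the usual sequon definition (Asn, then any residue except Pro, then Ser or Thr); it is not NetNGlyc's prediction method, which additionally scores whether a motif is likely to be glycosylated. The function name `find_sequons` and the toy peptide are hypothetical.

```python
import re

# N-X-S/T sequon: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead keeps matches zero-width after the Asn, so overlapping
# motifs (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of the Asn in each N-X-S/T motif."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


# Toy peptide for illustration only (not STRC sequence).
toy_peptide = "MNGTANPSNAS"
print(find_sequons(toy_peptide))
```

A motif hit is a necessary condition for N-linked glycosylation, not evidence that the site is occupied, so the output is best read as a list of candidates.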

PoPMuSiC

DOWN

Predicts changes in thermodynamic stability caused by single site mutations.

FREE

PyMOL

TESTED

Publication-quality 3D protein structure rendering

RoseTTAFold

TESTED

Predicts 3D structures and models complex multi-protein biological assemblies.

API FREE

RosettaDDG

TESTED

Python wrapper automating high-throughput free energy calculations for variants.

API FREE

SDM

DOWN

Calculates the difference in stability based on environment-specific amino acid substitution tables.

API FREE

STRUM

DOWN

Predicts ΔΔG using quantitative structure-activity relationship (QSAR) models.

FREE

SWISS-MODEL

VERIFIED

Builds 3D protein models by homology to known experimental structures

FREE

SignalP

VERIFIED

Predicts signal peptide presence and type (Sec/SPI, Sec/SPII, Tat/SPI)

FREE

mCSM

TESTED

Predicts stability and binding affinity changes utilizing graph-based spatial signatures.

API FREE

Variant Effect Prediction

41 tools

Allen Brain Atlas

TESTED

High-resolution imaging and transcriptomics resource mapping gene expression to brain anatomy.

E1659A: N/A — STRC not highly expressed in brain
API FREE

AlphaMissense

VERIFIED

Predicts pathogenicity of missense variants (amino acid substitutions)

E1659A: 0.9016 — Likely Pathogenic
FREE

BayesDel

TESTED

Evaluates coding and non-coding variants utilizing a Bayesian modeling framework.

E1659A: 0.2255 — Damaging
FREE

CADD

VERIFIED

Scores all variant types: missense, synonymous, intronic, intergenic, UTR, splice

E1659A: PHRED 25.5 — Top 0.3% deleterious
FREE
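The "Top 0.3%" figure follows from how PHRED-scaled scores work: a scaled score of S corresponds to the top 10^(-S/10) fraction of ranked variants. The short sketch below reproduces that arithmetic; the function names are hypothetical and the code does not query CADD itself.

```python
import math


def phred_to_top_fraction(phred: float) -> float:
    """Fraction of variants ranked at or above a PHRED-scaled score."""
    return 10 ** (-phred / 10)


def top_fraction_to_phred(fraction: float) -> float:
    """Inverse conversion: rank fraction back to a PHRED-scaled score."""
    return -10 * math.log10(fraction)


# PHRED 25.5 as reported for E1659A above.
fraction = phred_to_top_fraction(25.5)
print(f"top {fraction * 100:.2f}% of scored variants")
```

By the same conversion, PHRED 10 marks the top 10%, PHRED 20 the top 1%, and PHRED 30 the top 0.1%.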

CIViC

VERIFIED

Open-source, crowd-sourced database for clinical interpretation of variants in cancer.

API FREE

Caduceus

TESTED

Genomic language model leveraging the Mamba architecture with reverse-complement equivariance.

API FREE

ClinPred

TESTED

Meta-predictor combining functional VEP scores with clinical allele frequencies.

E1659A: 0.9869 — Very strong pathogenic signal
FREE

DANN

TESTED

Uses deep neural networks to score the deleteriousness of genetic variants.

E1659A: 0.9946 — Highly deleterious
FREE

DNABERT-2

TESTED

Multi-species genomic language model using byte pair encoding tokenization and a refined transformer architecture.

API FREE

ESM1v

VERIFIED

Evaluates missense variants using zero-shot protein language modeling.

E1659A: -2.733 (ESM-1v) / -1.718 (ESM-2) — Damaging. E is most preferred residue at position 1659. A ranks 3rd worst of 20.
API FREE
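Zero-shot scores of this kind are commonly defined as a log-likelihood ratio: the log probability the model assigns to the mutant residue minus the log probability of the wild-type residue at the same position. The sketch below illustrates that definition and the "rank from worst" idea on a made-up probability table. The numbers are hypothetical, cover only four residues, and are not ESM output; running the real model requires its own weights and tooling.

```python
import math


def log_odds(probs: dict[str, float], wt: str, mut: str) -> float:
    """log P(mutant) - log P(wild type); negative means disfavoured."""
    return math.log(probs[mut]) - math.log(probs[wt])


def rank_from_worst(probs: dict[str, float], residue: str) -> int:
    """1 = least likely residue at this position under the model."""
    ordered = sorted(probs, key=probs.get)  # ascending probability
    return ordered.index(residue) + 1


# Hypothetical per-position probabilities (illustration only).
toy_probs = {"E": 0.60, "D": 0.20, "Q": 0.15, "A": 0.05}

print(round(log_odds(toy_probs, wt="E", mut="A"), 3))
print(rank_from_worst(toy_probs, "A"))
```

With a full 20-residue distribution, the same two functions give the score and the "Nth worst of 20" ranking quoted in the result line.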

EVE (Evolutionary model of Variant Effect)

TESTED

Maps viral and human fitness landscapes via deep generative models.

API FREE

Eigen

TESTED

Uses unsupervised spectral approaches to aggregate functional annotations into a single score.

E1659A: N/A — Scores available via dbNSFP, unsupervised genome-wide functional score
API FREE

Evo / Evo 2

TESTED

Genomic foundation model with up to 40 billion parameters for prediction and generation tasks across DNA and RNA.

API FREE

Expression Atlas

TESTED

Database of gene and protein expression across species and biological conditions.

E1659A: 4562 experiments — STRC expression data across tissues
API FREE

FATHMM-MKL

TESTED

Uses multiple kernel learning to predict the functional consequences of SNVs.

E1659A: 0.9748 — Strong damaging signal
API FREE

FAVOR

DOWN

Facilitates the annotation of variants based on their functional consequences.

API FREE

Franklin

VERIFIED

AI-assisted ACMG/AMP variant classification

Galaxy Project

TESTED

Open-source, web-based platform for highly accessible and reproducible genomic data analysis.

API FREE

HGMD Professional

AVAILABLE

Exhaustive, expert-curated database of germline mutations underlying inherited human diseases.

API

Human Protein Atlas

VERIFIED

Spatial omics atlas mapping all human proteins across tissues, blood, and single cells.

API FREE

HyenaDNA

TESTED

Genomic sequence model using implicit convolutions to reach context lengths of up to a million tokens.

API FREE

InterVar

VERIFIED

Automated application of ACMG/AMP criteria (PVS1, PS1-4, PM1-6, PP1-5, BA1, BS1-4, BP1-7)

E1659A: Deleterious — SIFT 0.0 + PolyPhen 0.991 via Ensembl VEP
FREE
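The SIFT and PolyPhen values in the result line can be turned into qualitative calls with the cutoffs that are commonly cited for these two predictors: SIFT below 0.05 is "deleterious", PolyPhen above 0.908 is "probably damaging", and PolyPhen above 0.446 is "possibly damaging". The sketch below assumes those cutoffs; it is not InterVar's or Ensembl VEP's code, and the function names are hypothetical.

```python
def sift_call(score: float) -> str:
    """SIFT: lower scores are worse; below 0.05 is called deleterious."""
    return "deleterious" if score < 0.05 else "tolerated"


def polyphen_call(score: float) -> str:
    """PolyPhen-2: higher scores are worse (assumed cutoffs 0.446 / 0.908)."""
    if score > 0.908:
        return "probably_damaging"
    if score > 0.446:
        return "possibly_damaging"
    return "benign"


# Values reported for E1659A above.
print(sift_call(0.0), polyphen_call(0.991))
```

Note that the two scales run in opposite directions, which is an easy source of mistakes when scores are compared side by side.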

LINSIGHT

TESTED

Estimates the fitness consequences of non-coding mutations.

E1659A: N/A — Non-coding focused, less relevant for coding variant
API FREE

MPC

TESTED

Combines missense badness, PolyPhen-2, and regional constraint metrics into a single missense score.

E1659A: N/A — Score available via dbNSFP for constrained regions
FREE

Mastermind

AVAILABLE

AI-driven genomic intelligence platform indexing variants from over 11 million full-text articles.

API FREE

MetaRNN

DOWN

Prioritizes rare non-synonymous SNVs via recurrent neural networks.

E1659A: 0.8552 — Damaging
FREE

MutScore

BROKEN

Assesses the specific fitness effects and loss-of-function potential of genetic variants.

API FREE

Nucleotide Transformer (NTv3)

TESTED

Unified foundation model pre-trained on 9 trillion base pairs for molecular phenotype prediction.

API FREE

Open Targets

VERIFIED

Platform supporting the systematic identification and prioritization of therapeutic drug targets.

E1659A: 0.731 — 73 STRC disease associations
API FREE

PharmGKB

TESTED

Comprehensive resource detailing how genetic variation directly affects drug response.

E1659A: PA38082 — STRC gene entry exists, no pharmacogenomic interactions
API FREE

PrimateAI-3D

TESTED

Evaluates missense variants by integrating 3D protein structures and primate genomics.

E1659A: N/A — Primate-specific pathogenicity, available via SpliceAI Lookup
API FREE

REVEL

VERIFIED

Ensemble score from 13 tools: MutPred, FATHMM, VEST, PolyPhen, SIFT, PROVEAN, MutationAssessor, MutationTaster, LRT, GERP, SiPhy, phyloP, phastCons

E1659A: 0.789 — Pathogenic range
FREE

SnpEff

TESTED

Annotates variants and predicts their effects on genes and transcripts using Sequence Ontology terms.

API FREE

SparkINFERNO

DOWN

Analyzes variants for broad phenotypic and functional impacts in scalable cloud environments.

FREE

Terra.bio

TESTED

Scalable platform enabling researchers to run bioinformatics tools securely on Google/Azure clouds.

API

VEST4

DOWN

Random forest classifier predicting the statistical probability of a variant being pathogenic.

E1659A: 0.5900 — Moderate pathogenic signal
FREE

VarSome

VERIFIED

Aggregates: ClinVar, gnomAD, REVEL, CADD, SpliceAI, conservation scores, literature

E1659A: VUS — PM2_Supporting + BP1

dbNSFP

VERIFIED

30+ predictor scores per variant: REVEL, CADD, SIFT, PolyPhen-2, MutationTaster, FATHMM, GERP, PhyloP, PhastCons, AlphaMissense, and more

E1659A: 40 scores — All extracted via myvariant.info
FREE
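A lookup of this kind can be sketched against the public MyVariant.info v1 REST endpoint, which returns a JSON document per variant and accepts a `fields` parameter to restrict the response to the `dbnsfp` block. The code below is a minimal sketch under those assumptions, using only the standard library. The variant identifier shown is a generic example in the service's genomic HGVS style and is not the STRC E1659A coordinate, which is not given here; `dbnsfp_url`, `fetch_dbnsfp`, and `count_predictors` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://myvariant.info/v1/variant/"


def dbnsfp_url(hgvs_id: str, fields: str = "dbnsfp") -> str:
    """Build the request URL for one variant, limited to the given fields."""
    path = urllib.parse.quote(hgvs_id, safe=":")
    query = urllib.parse.urlencode({"fields": fields})
    return f"{BASE_URL}{path}?{query}"


def fetch_dbnsfp(hgvs_id: str) -> dict:
    """Download the dbNSFP annotations for one variant (needs network)."""
    with urllib.request.urlopen(dbnsfp_url(hgvs_id), timeout=30) as resp:
        return json.load(resp).get("dbnsfp", {})


def count_predictors(dbnsfp: dict) -> int:
    """Count top-level entries that are predictor blocks (nested objects)."""
    return sum(1 for value in dbnsfp.values() if isinstance(value, dict))


# Example identifier format only; substitute the variant of interest.
print(dbnsfp_url("chr7:g.140453134T>C"))
```

Calling `fetch_dbnsfp` with the real genomic HGVS identifier and passing the result to `count_predictors` gives a rough tally of available scores; how many entries count as separate "scores" depends on how nested fields are treated.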

fitCons

TESTED

Clusters genomic positions by functional annotations to estimate fitness consequences.

E1659A: N/A — Evolutionary fitness score available via dbNSFP
API FREE

gMVP

TESTED

Predicts missense variant pathogenicity employing a sophisticated graph neural network.

E1659A: N/A — Graph-based structural score, requires local computation
API FREE

variant tools (vtools)

TESTED

Flexible command-line toolset for the storage, annotation, and dynamic filtering of sequence variants.

API FREE

Splicing Prediction

12 tools

AbSplice2

TESTED

Tissue-specific contextual filter estimating the probability of aberrant splicing events.

E1659A: N/A — Not splice-affecting (missense variant)
API FREE

FRASER2

TESTED

Intron-centric aberrant splicing caller utilizing empirical RNA-seq data.

API FREE

GeneSplicer

TESTED

Detects splice sites in genomic DNA sequences using maximal dependence decomposition.

E1659A: N/A — Not splice-affecting (missense variant)
API FREE

HAL

DOWN

High-throughput alternative splicing prediction platform.

FREE

MMSplice

TESTED

Modular modeling framework predicting the usage of cassette exons.

E1659A: N/A — Not splice-affecting (missense variant)
API FREE

MaxEntScan

TESTED

Evaluates 5' and 3' splice site strength employing the Maximum Entropy Principle.

E1659A: N/A — Not splice-affecting (missense variant)
API FREE

NNSplice

TESTED

Neural network approach to locating consensus splice sites in primary DNA sequence.

E1659A: N/A — Not splice-affecting (missense variant)
FREE

Pangolin

TESTED

Predicts splice site strength and aberrant usage across multiple mammalian tissues.

API FREE

SPANR

DOWN

Predicts the percentage of spliced-in (PSI) events across different tissues.

API FREE

SPiP

TESTED

Bioinformatics pipeline utilizing statistical thresholds for deep splicing analysis.

E1659A: N/A — Not splice-affecting (missense variant)
API FREE

SpliceAI

TESTED

Predicts splice site creation/disruption from DNA sequence

E1659A: Low — Missense, not splice-disrupting
FREE

SpliceRover

BROKEN

Deep convolutional neural network for splice site prediction in whole genomes.

FREE

Regulatory & Non-Coding

10 tools

Population Databases

13 tools

ALFA

TESTED

Allele Frequency Aggregator analyzing dbSNP data across diverse populations.

API FREE

BRAVO / TOPMed

VERIFIED

Variant browser providing allele frequencies for over 868 million variants from whole genomes.

E1659A: Not found — Absent from TOPMed
API FREE

Biobank Japan

TESTED

Prospective genome biobank offering summary statistics for ~260,000 Japanese individuals.

E1659A: Not found — Absent from Japanese population GWAS
API FREE

CMDB

DOWN

High-quality database containing 9.04 million SNVs from 141,431 healthy Chinese individuals.

API FREE

ClinGen

VERIFIED

Gene-disease validity curation (is STRC definitively linked to hearing loss?)

FREE

ExAC

TESTED

Historical exome aggregation consortium (largely superseded by gnomAD).

API FREE

KoB KDNA

TESTED

The National Project of Bio-Big Data in South Korea, which aims to sequence 1 million genomes.

E1659A: Not found — Absent from Korean population data
FREE

LOVD

VERIFIED

Leiden Open Variation Database providing locus-specific gene variant data.

E1659A: N/A — STRC variants present, E1659A absent
API FREE

SAGE

DOWN

Comprehensive repertoire integrating 154 million genetic variants from South Asians.

FREE

UK Biobank

TESTED

Massive biomedical database containing over 500,000 sequenced genomes.

API

dbSNP

TESTED

Foundational archive of single nucleotide polymorphisms and other small-scale variations.

E1659A: N/A — No rsID assigned
API FREE

gnomAD

VERIFIED

Population allele frequencies across diverse ancestries

E1659A: Not found — Absent from 251K controls (PM2)
FREE
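Absence from a large reference set puts an upper bound on how common a variant can be. Under a simple binomial model, if none of n sampled alleles carry the variant, the one-sided upper confidence limit on its frequency is 1 - alpha^(1/n). The sketch below applies that formula; it is an illustration, not part of the gnomAD tooling or the ACMG rules. Whether "251K" counts individuals or alleles is not stated above, so the allele count is left as a parameter (for an autosomal gene, alleles = 2 x individuals).

```python
def max_allele_frequency(n_alleles: int, confidence: float = 0.95) -> float:
    """Upper bound on allele frequency when 0 of n_alleles carry the variant.

    Solves (1 - p) ** n_alleles = 1 - confidence for p.
    """
    if n_alleles <= 0:
        raise ValueError("n_alleles must be positive")
    return 1 - (1 - confidence) ** (1 / n_alleles)


# Assumption for illustration: 251,000 alleles with zero carriers observed.
print(f"{max_allele_frequency(251_000):.2e}")
```

The bound tightens roughly in proportion to 1/n, which is the familiar "rule of three" approximation (about 3/n at 95% confidence).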

seqr

VERIFIED

Variant search across 70K+ rare disease cases

Clinical Databases

7 tools

Gene-Level Resources

4 tools

Hearing Loss & Inner Ear

9 tools

Conservation & Evolution

5 tools

Structural Variants & CNV

7 tools

Nomenclature & Validation

3 tools

Literature Mining

8 tools

Workflow Platforms

1 tool

Research Impact

143
Computational tools
16
AlphaFold 3 experiments
$50-100
Total AI cost for full analysis

All tools are designed for independent research. Most databases are free with academic access. The complete methodology is documented to enable reproduction by any family facing similar variant uncertainty.