I asked AI what that means. Came up with a variant pathogenicity analysis and a new hypothesis.
Michael is 4. He doesn't hear well. Two broken copies of the STRC gene. One confirmed pathogenic. The other: "Variant of Uncertain Significance." Three words that block him from gene therapy trials.
I'm not a geneticist. I build websites, shoot video, and do AI education. I have an AI agent (OpenClaw, powered by Claude Opus 4.6) running on my laptop. It searches databases, downloads protein structures, runs analysis. I ask questions from my phone while Michael plays next to me.
One question led to reclassification evidence. Then conservation analysis. Then a hypothesis about fitting the gene into a single therapy vector. Then six structural experiments. Then three emails to the scientists who pioneered this research. One responded overnight.
Science shouldn't be locked behind jargon. There's a podcast and a video below (both AI-generated) for anyone who'd rather listen than scroll through protein structures.
Egor and Michael, Hong Kong
Computational evidence supporting VUS to Likely Pathogenic reclassification for NM_153700.2:c.4976A>C p.(Glu1659Ala)
Every possible amino acid substitution at position 1659 is predicted Likely Pathogenic. This position is structurally invariant: any change breaks the protein.
E1659 is 100% conserved across all tested mammals, spanning ~80 million years of evolution. The surrounding motif PEIFTEIGTIAAG is identical in every species.
| Species | Position | Residue | Context |
|---|---|---|---|
| Human | 1659 | E | PEIFTEIGTIAAG |
| Mouse | 1693 | E | PEIFTEIGTIAAG |
| Rat | 1693 | E | PEIFTEIGTIAAG |
| Cow | 1647 | E | PEIFTEIGTIAAG |
| Green monkey | 1659 | E | PEIFTEIGTIAAG |
| Pig | 1650 | E | PEIFTEIGTIAAG |
| Dog | 1649 | E | PEIFTEIGTIAAG |
| Bat | 1646 | E | PEIFTEIGTIAAG |
| Bear | 1643 | E | PEIFTEIGTIAAG |
9/9 species conserve Glutamic acid (E) at this position. The surrounding 13-residue motif is identical across all tested mammals. This level of conservation strongly suggests functional importance and supports pathogenicity of any substitution (PP1 Supporting evidence per ACMG). Data source: UniProt ortholog sequences, motif-based alignment.
Figure: Stereocilin (Q7RTU9, 1775 aa), AlphaFold v6 model. Position E1659 is highlighted in magenta, with the glutamic acid side chain shown as sticks. Color: pLDDT confidence (blue = high, red = low).
STRC has a nearly identical pseudogene (STRCP1) located adjacent to it on chromosome 15q15.3. This causes most standard computational tools to fail or return unreliable results for STRC variants.
AlphaMissense is uniquely valuable for STRC because it predicts pathogenicity from protein structure, bypassing the sequence-alignment step where pseudogene STRCP1 causes other tools to fail. REVEL (0.65) also provides a concordant prediction, using an ensemble approach that partially mitigates this issue.
| Criterion | Strength | Evidence |
|---|---|---|
| PM3 | Moderate | Detected in trans with pathogenic whole-gene deletion (confirmed paternal) |
| PP3_Moderate | Moderate | AlphaMissense 0.9016 + REVEL 0.65 concordant (Pejaver 2022 threshold) |
| PM2_Supporting | Supporting | Absent from gnomAD (0 alleles in 251,000+ individuals) |
| PP1_Supporting | Supporting | E1659 100% conserved across 9 mammalian species (~80M years). Identical motif PEIFTEIGTIAAG. |
2 Moderate + 2 Supporting = Likely Pathogenic per ACMG/AMP 2015 combining rules
Computational hypotheses for accelerating STRC gene therapy. These require experimental validation.
Current STRC gene therapy requires two AAV vectors because the gene (5325 bp) exceeds the single-AAV packaging limit (~4400 bp usable). AlphaFold structural analysis suggests a single-vector approach may be possible.
AlphaFold predicts stereocilin's structure with varying confidence along the protein. The N-terminal region (residues 1-615) has very low confidence (pLDDT < 50), indicating it is likely intrinsically disordered with no stable 3D structure. The functional core starts around residue 616.
All regions have pLDDT < 50 (no stable structure predicted)
3984 bp fits in single AAV (<4400 bp limit)
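A minimal sketch of how low-confidence regions can be located from per-residue pLDDT scores. It assumes the scores are already loaded into a plain list (AlphaFold model files store pLDDT per residue); the function names and the toy scores are hypothetical and are not stereocilin data.

```python
def low_confidence_segments(plddt, threshold=50.0, min_len=1):
    """Return 1-based inclusive (start, end) runs where pLDDT < threshold."""
    segments = []
    start = None
    for i, score in enumerate(plddt, start=1):
        if score < threshold:
            if start is None:
                start = i  # a low-confidence run begins here
        elif start is not None:
            if i - start >= min_len:
                segments.append((start, i - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(plddt) - start + 1 >= min_len:
        segments.append((start, len(plddt)))
    return segments


def fraction_low(plddt, threshold=50.0):
    """Fraction of residues with pLDDT below the threshold."""
    return sum(1 for s in plddt if s < threshold) / len(plddt)


# Toy scores for illustration only (NOT real stereocilin values).
toy_plddt = [30.0, 42.5, 48.0, 71.0, 90.2, 95.7, 44.0, 88.0]
segments = low_confidence_segments(toy_plddt)
low_fraction = fraction_low(toy_plddt)
print(segments, low_fraction)
```

Raising `min_len` filters out short dips in confidence, so only longer stretches are reported as candidate regions for removal.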
This approach has proven precedent. The dystrophin gene (11,000 bp) was too large for any AAV. Researchers created "micro-dystrophin" by removing non-essential spectrin-like repeats, fitting it into a single AAV. This is now in Phase 3 clinical trials (Sarepta SRP-9001). The same principle: identify the structural core, remove disordered/redundant regions, preserve function. Nobody has tried this for STRC yet.
Important: This is a computational hypothesis based on AlphaFold structural predictions. It requires experimental validation: does mini-stereocilin fold correctly? Does it localize to stereocilia tips? Does it form horizontal top connectors and tectorial membrane attachments? These questions need wet-lab work. But the structural data strongly suggests the N-terminal region is dispensable, and a single-AAV mini-STRC approach deserves investigation.
Systematic computational testing of the mini-STRC hypothesis and variant impact. 3D models rendered from AlphaFold 3 CIF files.
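Job 1, full STRC + TMEM145: Low-confidence interaction. Best cross-chain PAE: 8.6 Å, at the N-terminal.
Job 2, mini-STRC + TMEM145: N-terminal removal barely affects predicted binding (ipTM 0.43 vs 0.47), consistent with the N-terminal being dispensable.
Job 3, STRC E1659A mutant (solo): No structural damage. The fold is intact. E1659A affects function (charge loss), not structure.
Job 4, STRC wildtype (solo): Full protein. N-terminal drags the score down (16% disordered).
Job 5, mini-STRC (solo): Truncated protein folds excellently. 7% disordered. Key result.
Job 6, N-terminal (solo, residues 1-615): Confirmed disordered. 38% unstructured. Safe to remove.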
Mini-STRC (without N-terminal) achieves pTM 0.81, significantly better than full-length wildtype (pTM 0.63). The removed N-terminal region scores only pTM 0.27 with 38% disorder. Removing the disordered N-terminal produces a better-folding protein that fits in a single AAV vector.
Instead of replacing the entire STRC gene (5325 bp), what if we could correct just the single mutated base? There are three types of gene editing tools. I checked each one against Michael's specific variant.
Prime editing requires a "landing pad" (PAM site, NGG sequence) near the target. I downloaded the genomic sequence from Ensembl REST API and searched for NGG motifs within 15 bp of the variant.
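The NGG search can be sketched as follows. This is an illustration under stated assumptions, not the exact script that was run: the function name, the distance definition (nearest base of the 3-mer to the variant), and the toy sequence are all hypothetical. An NGG on the reverse strand appears as CCN on the forward strand, so both patterns are scanned.

```python
def find_ngg_pams(seq, variant_idx, window=15):
    """Find NGG PAM sites on either strand near a variant.

    seq         -- forward-strand DNA sequence
    variant_idx -- 0-based index of the variant base in seq
    window      -- maximum distance (in bp) from the variant to the PAM

    Returns a list of (start_idx, strand, distance), nearest first.
    start_idx is the 0-based forward-strand index of the 3-mer.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2):
        triplet = seq[i:i + 3]
        if triplet[1:] == "GG":      # NGG on the forward strand
            strand = "+"
        elif triplet[:2] == "CC":    # CCN = NGG on the reverse strand
            strand = "-"
        else:
            continue
        # Distance from the variant to the closest base of the PAM.
        distance = min(abs(j - variant_idx) for j in range(i, i + 3))
        if distance <= window:
            hits.append((i, strand, distance))
    return sorted(hits, key=lambda h: (h[2], h[0]))


# Toy sequence for illustration only (NOT the STRC genomic sequence).
toy_seq = "AAAATGGAAAACAAAACCTAAAA"
toy_hits = find_ngg_pams(toy_seq, variant_idx=11, window=15)
print(toy_hits)
```

A real prime-editing design also depends on the nick position, the pegRNA template, and which strand is edited; this only checks the first requirement, that a PAM exists nearby.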
Reality check: Prime editing has not been tested in inner ear hair cells in vivo. Delivering the prime editor + guide RNA to outer hair cells deep in the cochlea is an unsolved challenge. But this analysis confirms that Michael's specific variant is technically targetable. If delivery is solved (an active area of research), this mutation can be corrected at the DNA level.
No genetics degree. No lab access. No budget. Just a laptop, a phone, and an AI agent (OpenClaw + Claude Opus 4.6) that can actually do things: download files, search databases, parse data, build sites. My job was asking questions. Good ones. Here's exactly what I asked and what came back.
Michael's WES report from Hong Kong Children's Hospital (Lab No: 23C7500174, December 2022) listed two STRC variants. One was labeled "Pathogenic" (a whole gene deletion from his father, confirmed by MLPA). The other was labeled "Variant of Uncertain Significance" (a single letter change from his mother, confirmed by Sanger sequencing): NM_153700.2:c.4976A>C p.(Glu1659Ala). I needed to know: is this second variant actually harmful?
I asked Claude to look up the STRC protein. It searched UniProt and found the ID: Q7RTU9. Claude then pointed me to AlphaFold, which has the predicted 3D structure. The confidence score (pLDDT) at position 1659 was 95.69 out of 100, meaning the structure prediction at this spot is very reliable.
AlphaMissense is a tool by Google DeepMind that predicts whether a protein mutation is harmful. Claude downloaded the AlphaMissense predictions file for stereocilin and searched for "E1659A" (E = Glutamic acid, the original amino acid; A = Alanine, Michael's variant).
The result: 0.9016 out of 1.0 (Likely Pathogenic). Anything above 0.564 is considered likely harmful. I then checked every other possible change at position 1659. All 19 substitutions, including Michael's, scored above 0.846. This suggests position 1659 is structurally critical: any change there is predicted to break the protein.
| protein_variant | am_pathogenicity | am_class |
|---|---|---|
| E1659A | 0.9016 | LPath |
| E1659D | 0.9483 | LPath |
| E1659G | 0.9191 | LPath |
| ... (all 19 substitutions) | 0.846-0.999 | LPath |
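A minimal sketch of the lookup step, assuming the predictions are available as tab-separated text with the column names shown in the table. The sample rows are the three values quoted here; the 0.34 "likely benign" cutoff is AlphaMissense's published lower threshold and is not mentioned in this write-up. Function names are hypothetical.

```python
import csv
import io

LIKELY_PATHOGENIC_ABOVE = 0.564  # cutoff quoted in the text
LIKELY_BENIGN_BELOW = 0.34       # assumption: AlphaMissense's published lower cutoff


def classify(score):
    """Map an AlphaMissense score to its class label."""
    if score > LIKELY_PATHOGENIC_ABOVE:
        return "likely_pathogenic"
    if score < LIKELY_BENIGN_BELOW:
        return "likely_benign"
    return "ambiguous"


def scores_at_position(tsv_text, ref, position):
    """Return {variant: score} for every substitution at one residue."""
    prefix = f"{ref}{position}"
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    found = {}
    for row in reader:
        variant = row["protein_variant"]
        # "E1659A" -> reference + position is everything but the last letter.
        if variant[:-1] == prefix:
            found[variant] = float(row["am_pathogenicity"])
    return found


# Only the three rows quoted above; the full file has all 19 substitutions.
SAMPLE_TSV = (
    "protein_variant\tam_pathogenicity\tam_class\n"
    "E1659A\t0.9016\tLPath\n"
    "E1659D\t0.9483\tLPath\n"
    "E1659G\t0.9191\tLPath\n"
)

e1659_scores = scores_at_position(SAMPLE_TSV, "E", 1659)
for variant, score in e1659_scores.items():
    print(variant, score, classify(score))
```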
If a position is important for the protein, it should be the same amino acid across different species. Claude pulled stereocilin sequences from 9 mammals on UniProt (human, mouse, rat, cow, monkey, pig, dog, bat, bear) and searched for the motif around position 1659 in each.
Result: 100% conserved. All 9 species have Glutamic acid (E) at this position. The surrounding 13-residue motif (PEIFTEIGTIAAG) is identical across ~80 million years of evolution. This is PP1 Supporting evidence under ACMG criteria.
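The motif-based check can be sketched like this: search each ortholog sequence for the exact 13-residue motif and report where it starts. The function names and the toy sequences are hypothetical placeholders, not real UniProt entries; a real run would load the nine ortholog sequences instead.

```python
MOTIF = "PEIFTEIGTIAAG"  # 13-residue motif quoted in the text


def motif_positions(sequences, motif=MOTIF):
    """Map each species to the 1-based start of the motif, or None if absent."""
    positions = {}
    for species, seq in sequences.items():
        idx = seq.find(motif)
        positions[species] = idx + 1 if idx >= 0 else None
    return positions


def is_fully_conserved(sequences, motif=MOTIF):
    """True only if every sequence contains the exact motif."""
    return all(pos is not None for pos in motif_positions(sequences, motif).values())


# Toy sequences for illustration only (NOT real stereocilin orthologs).
toy_sequences = {
    "toy_species_a": "MKL" + MOTIF + "RST",
    "toy_species_b": "MKLAA" + MOTIF + "R",
}
toy_with_change = dict(toy_sequences, toy_species_c="MKLPEIFTDIGTIAAGRST")

print(motif_positions(toy_sequences))
print(is_fully_conserved(toy_sequences), is_fully_conserved(toy_with_change))
```

An exact-match search is stricter than a full sequence alignment: a single substitution anywhere in the motif counts as "not found", which is why the different position numbers per species do not matter here.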
Normally, geneticists use SIFT, PolyPhen-2, and CADD to check variants. Claude tried all three through the Ensembl VEP API. They all returned nothing for this variant.
The reason: STRC has a nearly identical "twin" gene next to it on chromosome 15 (a pseudogene called STRCP1) that confuses sequence-alignment-based tools. This is why AlphaMissense is uniquely important for STRC: it works from the protein's 3D structure, not from the DNA sequence, so the pseudogene doesn't affect it.
ACMG/AMP guidelines (Richards et al., 2015) are the standard framework geneticists use to classify variants. Each piece of evidence gets a code and strength level. I learned the rules and applied them to the four lines of evidence above (PM3, PP3, PM2, PP1).
2 Moderate + 2 Supporting = Likely Pathogenic. Per ACMG combining rules (Table 5), this meets the threshold for Likely Pathogenic classification.
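The combining step can be written out as a small function. This is a sketch of the pathogenic-side rules from Table 5 of Richards et al. (2015) only: it ignores benign evidence and the later point-based refinements, and the function name is hypothetical.

```python
from collections import Counter


def combine_pathogenic(strengths):
    """Apply the ACMG/AMP 2015 pathogenic combining rules to evidence strengths.

    strengths -- iterable of "very_strong", "strong", "moderate", "supporting"
    Benign evidence is NOT handled in this sketch.
    """
    c = Counter(s.lower() for s in strengths)
    vs, st, mo, su = c["very_strong"], c["strong"], c["moderate"], c["supporting"]

    pathogenic = (
        (vs >= 1 and (st >= 1 or mo >= 2 or (mo >= 1 and su >= 1) or su >= 2))
        or st >= 2
        or (st >= 1 and (mo >= 3 or (mo >= 2 and su >= 2) or (mo >= 1 and su >= 4)))
    )
    if pathogenic:
        return "Pathogenic"

    likely_pathogenic = (
        (vs >= 1 and mo >= 1)
        or (st >= 1 and mo >= 1)
        or (st >= 1 and su >= 2)
        or mo >= 3
        or (mo >= 2 and su >= 2)   # the rule used for this variant
        or (mo >= 1 and su >= 4)
    )
    if likely_pathogenic:
        return "Likely Pathogenic"
    return "Uncertain Significance"


# Evidence strengths as listed in the criteria table: PM3, PP3, PM2, PP1.
evidence = ["moderate", "moderate", "supporting", "supporting"]
classification = combine_pathogenic(evidence)
print(classification)  # Likely Pathogenic
```

Dropping any one of the four criteria returns "Uncertain Significance", which shows how much the classification depends on each line of evidence being accepted by the reviewing lab.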
I compiled all evidence into a formal letter addressed to the Chemical Pathology Laboratory at Hong Kong Children's Hospital, requesting a reclassification review of the variant from VUS to Likely Pathogenic. I attached the AlphaMissense data, conservation analysis, and ACMG criteria breakdown. I also built this website so the evidence is transparent, reproducible, and accessible to anyone reviewing the case.
If the hospital accepts the reclassification, Michael's molecular diagnosis will be confirmed: biallelic pathogenic STRC (DFNB16). This is a prerequisite for future gene therapy clinical trials. Dual-AAV gene therapy has already restored hearing in STRC-deficient mice (Iranfar et al., January 2026). Human trials are expected within 2-3 years. Michael will be 7-8 years old.
Reclassification is the immediate goal. But once you start asking questions, you can't stop. Can we make the gene smaller? Fix just one letter? What if we test it computationally before anyone spends a dollar on a lab? These aren't genius insights. They're obvious questions. The difference is having an AI agent that can actually go look for the answers.
Instead of replacing the whole gene, what if we could fix just the one wrong letter? Claude downloaded the genomic sequence around Michael's variant from Ensembl and checked whether gene editing tools could target it.
Base editing (CBE/ABE): cannot fix this variant (C>A transversion is outside their range). Prime editing: feasible. Claude found a suitable PAM site just 4 base pairs from the mutation. A prime editor could theoretically correct the single base change, though this approach has not yet been tested in inner ear cells.
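The base-editing check comes down to whether the needed correction is a transition or a transversion. A minimal sketch, assuming the standard conversions for cytosine base editors (C>T, read as G>A on the opposite strand) and adenine base editors (A>G, read as T>C); the names are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Conversions each editor class can make, written for both strands.
BASE_EDITOR_CHANGES = {
    "CBE": {("C", "T"), ("G", "A")},
    "ABE": {("A", "G"), ("T", "C")},
}


def substitution_type(current, desired):
    """Classify a single-base change as a transition or a transversion."""
    current, desired = current.upper(), desired.upper()
    if current == desired or not {current, desired} <= PURINES | PYRIMIDINES:
        raise ValueError("expected two different bases from A, C, G, T")
    same_class = {current, desired} <= PURINES or {current, desired} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def base_editors_for(current, desired):
    """List base editor classes able to make this correction (may be empty)."""
    change = (current.upper(), desired.upper())
    return [name for name, changes in BASE_EDITOR_CHANGES.items() if change in changes]


# The variant is c.4976A>C, so the correction needed is C back to A.
correction_type = substitution_type("C", "A")
usable_editors = base_editors_for("C", "A")
print(correction_type, usable_editors)  # transversion []
```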
Current gene therapy for STRC requires two viruses (dual-AAV) because the gene is too long for one. Two viruses means lower efficiency: both must enter the same cell. Claude analyzed the AlphaFold structure and identified that the first ~600 amino acids have very low structural confidence (pLDDT below 50), suggesting they may not form a stable structure and might be dispensable.
If those regions are removed, the remaining "mini-stereocilin" (1328 aa, 3984 bp) fits in a single AAV vector. This is a computational hypothesis. It needs lab testing. But the precedent exists: micro-dystrophin (removing non-essential parts of dystrophin) is now in Phase 3 clinical trials for muscular dystrophy.
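The size arithmetic is simple enough to check directly: three bases per amino acid, compared against the ~4400 bp usable limit quoted earlier. A minimal sketch with hypothetical function names; it counts coding sequence only and does not model the stop codon or regulatory elements.

```python
AAV_USABLE_BP = 4400  # usable single-AAV capacity quoted in the text


def coding_length_bp(n_residues):
    """Coding sequence length: three bases per amino acid (stop codon excluded)."""
    return n_residues * 3


def fits_single_aav(n_residues, limit_bp=AAV_USABLE_BP):
    """True if the coding sequence alone is under the usable limit."""
    return coding_length_bp(n_residues) < limit_bp


full_bp = coding_length_bp(1775)   # full-length stereocilin
mini_bp = coding_length_bp(1328)   # proposed mini-stereocilin
print(full_bp, fits_single_aav(1775))  # 5325 False
print(mini_bp, fits_single_aav(1328))  # 3984 True
print(AAV_USABLE_BP - mini_bp)         # 416 bp of headroom
```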
To test the mini-STRC idea further, I submitted a job to the AlphaFold 3 Server to predict the 3D structure of stereocilin bound to its interaction partner TMEM145 (a protein recently discovered to be essential for stereocilin's function, Nature Communications 2025).
First results received (Job 1). ipTM = 0.47, pTM = 0.48. Low confidence in direct binding. PAE matrix analysis shows best cross-chain contacts at N-terminal residues 174-185 (but still poor at 8.6 Å).
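A minimal sketch of what "best cross-chain PAE" means: take the predicted aligned error matrix for the two-chain complex and find the lowest value among residue pairs that sit on different chains. It assumes the matrix is already loaded as a square list of lists with chain A first; the function name and the toy matrix are hypothetical, not the Job 1 output.

```python
def best_cross_chain_pae(pae, len_a):
    """Return (value, i, j) for the lowest PAE between residues on different chains.

    pae   -- square matrix (list of lists) of predicted aligned error, in Å
    len_a -- number of residues in the first chain; the rest belong to chain B
    Indices i and j are 0-based positions in the combined matrix.
    """
    best = None
    n = len(pae)
    for i in range(n):
        for j in range(n):
            # Keep only pairs where exactly one residue is in chain A.
            if (i < len_a) != (j < len_a):
                value = pae[i][j]
                if best is None or value < best[0]:
                    best = (value, i, j)
    return best


# Toy 4x4 matrix for illustration only: two residues per chain.
toy_pae = [
    [0.5, 2.0, 20.0, 25.0],
    [2.1, 0.5, 8.6, 22.0],
    [21.0, 9.4, 0.5, 3.0],
    [26.0, 23.0, 3.1, 0.5],
]
best = best_cross_chain_pae(toy_pae, len_a=2)
print(best)  # (8.6, 1, 2)
```

Within-chain values (the diagonal blocks) are skipped on purpose: they describe how well each protein folds on its own, not how confidently the two are placed relative to each other.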
I then submitted 5 more jobs to systematically test the mini-STRC hypothesis:
| # | Experiment | Status | Tests |
|---|---|---|---|
| 1 | Full STRC + TMEM145 | Done (ipTM 0.47) | Baseline interaction |
| 2 | Mini-STRC + TMEM145 | Done (ipTM 0.43) | N-terminal dispensable (0.43 vs 0.47 baseline) |
| 3 | STRC E1659A mutant (solo) | In progress | Does Michael's mutation break the fold? |
| 4 | STRC wildtype (solo) | Done (pTM 0.63) | Baseline: 16% disordered (N-term drags it down) |
| 5 | Mini-STRC solo | Done (pTM 0.81) | YES! Mini-STRC folds excellently (7% disordered) |
| 6 | N-terminal solo (1-615) | Done (pTM 0.27) | CONFIRMED: 38% disordered, pTM 0.27 |
I emailed the leading researchers working on STRC gene therapy at institutions in the US, France, and China. I shared the reclassification evidence, the mini-STRC hypothesis, and a link to this website.
I received encouraging responses confirming the computational approach is sound and that the analysis has been shared with research teams working on STRC gene therapy.
OpenClaw is open source (free). Claude Opus 4.6 is available via Anthropic API (pay per use). All scientific databases are free and open access. AlphaFold 3 Server requires a Google account.