Quick Start Guide: variant
This page summarizes how to use OpenSpliceAI's variant subcommand to assess the impact of genomic variants on splice sites.
Before You Begin
- Install OpenSpliceAI: Ensure you have installed OpenSpliceAI and its dependencies as described in the Installation page.
- Check Example Scripts: We provide an example script, examples/variant/variant.sh.
- Prepare your input files:
  - VCF File: A variant call format file containing SNPs or small INDELs.
  - Reference Genome (FASTA): Must match the reference used in the VCF.
  - Annotation File: Gene annotations to filter variants by genomic region.
  - Trained Model: One or more OpenSpliceAI model checkpoints (PyTorch or Keras).
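Because the FASTA must match the reference used in the VCF, it can help to sanity-check the two files before running the subcommand. The following is a minimal sketch using only the Python standard library, not part of OpenSpliceAI: the helper names `read_fasta` and `find_ref_mismatches` are hypothetical, and the toy FASTA and VCF text stand in for your real files. It reports records whose REF allele disagrees with the reference sequence, including contigs that are missing from the FASTA (for example `1` versus `chr1`).

```python
import io


def read_fasta(handle):
    """Parse FASTA text into a dict of {contig_name: uppercase sequence}."""
    seqs, name = {}, None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the contig name.
            name = line[1:].split()[0]
            seqs[name] = []
        elif name is not None:
            seqs[name].append(line.upper())
    return {k: "".join(v) for k, v in seqs.items()}


def find_ref_mismatches(vcf_handle, genome):
    """Return (chrom, pos, ref, found) for records whose REF differs from the FASTA.

    `found` is None when the contig is absent from the FASTA.
    """
    mismatches = []
    for line in vcf_handle:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, _id, ref = line.rstrip("\n").split("\t")[:4]
        pos = int(pos)
        seq = genome.get(chrom)
        # VCF positions are 1-based; Python slices are 0-based.
        found = None if seq is None else seq[pos - 1:pos - 1 + len(ref)]
        if found != ref.upper():
            mismatches.append((chrom, pos, ref, found))
    return mismatches


# Toy inputs standing in for hg19.fa and input.vcf.
fasta_text = ">chr1 toy contig\nACGTACGTAC\nGGTAAGTCCA\n"
vcf_text = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t3\t.\tG\tA\t.\t.\t.\n"      # REF matches
    "chr1\t11\t.\tGGT\tG\t.\t.\t.\n"   # deletion, REF matches
    "chr1\t5\t.\tC\tT\t.\t.\t.\n"      # REF does not match (FASTA has A)
    "1\t3\t.\tG\tA\t.\t.\t.\n"         # contig name not in the FASTA
)

genome = read_fasta(io.StringIO(fasta_text))
mismatches = find_ref_mismatches(io.StringIO(vcf_text), genome)
for record in mismatches:
    print(record)
```

For a full genome, an indexed FASTA reader would be more practical than loading every contig into memory; the in-memory dict here only keeps the sketch self-contained.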
One-liner Start
- Variants: input.vcf
- Reference FASTA: hg19.fa
- Annotation File: grch37.txt
- A pre-trained OpenSpliceAI model or directory of models
Run:
openspliceai variant \
-R data/ref_genome/homo_sapiens/GRCh37/hg19.fa \
-A examples/data/grch37.txt \
-m models/spliceai-mane/400nt/ \
-f 400 \
-t pytorch \
-I data/vcf/input.vcf \
-O examples/variant/output.vcf
This command:
- Loads the VCF variants and checks them against the reference genome.
- Predicts donor/acceptor scores for both wild-type and mutant sequences within ±50 nt of each variant.
- Outputs an annotated VCF (output.vcf) with delta scores and positions for donor/acceptor gain or loss.
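To make "delta scores and positions" concrete, here is a minimal sketch of how a gain or loss score could be derived from wild-type and mutant predictions for one site type. It follows the convention of the original SpliceAI tool (largest increase is the gain, largest decrease is the loss, each reported with its offset from the variant); this is an assumption for illustration, and OpenSpliceAI's own implementation may differ in detail. The function name `gain_and_loss` and the toy probabilities are hypothetical, and the window is shrunk to ±2 nt so the numbers are easy to follow.

```python
def gain_and_loss(ref_probs, alt_probs):
    """Compute gain/loss delta scores for one site type (donor or acceptor).

    Both inputs are per-position probabilities covering offsets -d..+d
    around the variant, so each list has 2*d + 1 entries.
    Returns {"gain": (score, offset), "loss": (score, offset)}.
    """
    if len(ref_probs) != len(alt_probs):
        raise ValueError("wild-type and mutant windows must have equal length")
    half = len(ref_probs) // 2
    diffs = [alt - ref for ref, alt in zip(ref_probs, alt_probs)]
    # Gain: position where the mutant score rises the most.
    gain_idx = max(range(len(diffs)), key=lambda i: diffs[i])
    # Loss: position where the mutant score drops the most.
    loss_idx = min(range(len(diffs)), key=lambda i: diffs[i])
    return {
        "gain": (round(diffs[gain_idx], 2), gain_idx - half),
        "loss": (round(-diffs[loss_idx], 2), loss_idx - half),
    }


# Toy donor probabilities for offsets -2..+2 (made-up numbers).
ref_donor = [0.01, 0.90, 0.02, 0.01, 0.03]  # strong donor 1 nt upstream
alt_donor = [0.01, 0.15, 0.02, 0.60, 0.03]  # donor weakened, new one downstream

result = gain_and_loss(ref_donor, alt_donor)
print(result)
```

In this toy case the variant weakens the existing donor at offset -1 and creates a candidate donor at offset +1, which is the kind of event the donor gain and donor loss fields are meant to capture.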
Next Steps
Review: Inspect the appended INFO fields in the VCF for delta scores and their positions.
Further Analysis: Filter or rank variants by largest delta scores to prioritize functional splicing impacts.
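As a rough sketch of the ranking step, the snippet below sorts variants by their largest delta score. It makes two assumptions that you should verify against the `##INFO` header of your own output.vcf: that the annotation is stored under an INFO key named `OpenSpliceAI`, and that its fields follow the layout used by the original SpliceAI tool (`ALLELE|SYMBOL|DS_AG|DS_AL|DS_DG|DS_DL|DP_AG|DP_AL|DP_DG|DP_DL`). The helper names `max_delta` and `rank_variants`, the gene names, and the toy records are hypothetical.

```python
INFO_KEY = "OpenSpliceAI"  # assumption: check the ##INFO header of your VCF


def max_delta(info, key=INFO_KEY):
    """Return the largest delta score in an INFO string, or None if absent."""
    for entry in info.split(";"):
        if not entry.startswith(key + "="):
            continue
        best = None
        # Several comma-separated annotations may be attached to one record.
        for annotation in entry[len(key) + 1:].split(","):
            fields = annotation.split("|")
            # Assumed layout: fields 2..5 hold the four delta scores.
            scores = [float(x) for x in fields[2:6] if x not in ("", ".")]
            if scores:
                top = max(scores)
                best = top if best is None else max(best, top)
        return best
    return None


def rank_variants(vcf_lines, key=INFO_KEY):
    """Return (score, chrom, pos, ref, alt) tuples sorted by descending score."""
    ranked = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        score = max_delta(cols[7], key)
        if score is not None:
            ranked.append((score, cols[0], int(cols[1]), cols[3], cols[4]))
    ranked.sort(key=lambda r: r[0], reverse=True)
    return ranked


# Toy records standing in for lines read from output.vcf.
vcf_lines = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tG\tA\t.\t.\tOpenSpliceAI=A|GENE1|0.00|0.00|0.59|0.75|-3|12|1|-1",
    "chr1\t200\t.\tC\tT\t.\t.\tOpenSpliceAI=T|GENE1|0.02|0.01|0.00|0.05|5|-8|20|3",
    "chr2\t300\t.\tA\tG\t.\t.\tDP=10",  # not annotated, skipped
    "chr2\t400\t.\tT\tC\t.\t.\tDP=7;OpenSpliceAI=C|GENE2|0.91|0.00|0.00|0.10|2|7|-40|1",
]

ranked = rank_variants(vcf_lines)
for row in ranked:
    print(row)
```

To apply this to a real file, pass an open file handle (for example `open("examples/variant/output.vcf")`) in place of `vcf_lines`, and add a score threshold if you only want to keep high-impact variants.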
Congratulations! You have gone through all subcommands of OpenSpliceAI.
Check out the released models at Released OpenSpliceAI models, or follow the steps in Train your own OpenSpliceAI Models to train your own.

