Quick Start Guide

OpenSpliceAI offers three primary workflows:

Predict: Use pre-trained models to directly predict splice sites from DNA sequences. No training is required.
Train from Scratch: Build your own model by creating datasets and training from scratch.
Transfer Learning: Adapt an existing (e.g., human-trained) model to a new species or dataset.

The following sections provide a concise, step-by-step guide for each workflow.
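As a rough orientation, the stages each workflow passes through can be written down as a small lookup table. This is only a sketch for planning purposes: the stage names below are descriptive labels taken from this guide, not OpenSpliceAI command names or flags, and the `Stage`, `WORKFLOWS`, and `plan` names are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step of a workflow; `optional` marks steps the guide calls optional."""
    name: str
    optional: bool = False


# Assumption: these labels paraphrase the workflow descriptions in this guide.
# They are NOT the tool's actual subcommand names.
WORKFLOWS = {
    "predict": [
        Stage("predict"),
    ],
    "train-from-scratch": [
        Stage("create datasets"),
        Stage("train"),
        Stage("calibrate", optional=True),
        Stage("predict"),
        Stage("variant analysis"),
    ],
    "transfer-learning": [
        Stage("create datasets"),
        Stage("fine-tune pre-trained model"),
        Stage("predict"),
        Stage("variant annotation"),
    ],
}


def plan(workflow: str, include_optional: bool = True) -> list[str]:
    """Return the ordered stage names for a workflow, optionally skipping optional ones."""
    stages = WORKFLOWS[workflow]
    return [s.name for s in stages if include_optional or not s.optional]


if __name__ == "__main__":
    for name in WORKFLOWS:
        print(f"{name}: {' -> '.join(plan(name))}")
```

Laid out this way, the difference between the workflows is easy to see: prediction alone needs no dataset or training stage, while the other two differ mainly in whether the model is trained from scratch or fine-tuned from a pre-trained one.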

Usage 1 – Predict

Quick Start: Predict with Pre-trained Models

In this workflow, you use a pre-trained OpenSpliceAI model to generate splice site predictions from a FASTA file and, optionally, a GFF annotation file. It is ideal if you want splice site annotations quickly and do not need to train a model.

Usage 2 – Train from Scratch

Quick Start: Train Your Own Model

This workflow guides you through creating datasets from genomic sequences and annotations, training a SpliceAI model from scratch, optionally calibrating the model, and finally running predictions and variant analyses. It is best suited for users who want to build a custom model tailored to their data.

Usage 3 – Transfer Learning

Quick Start: Transfer Learning Across Species

This workflow enables you to adapt a pre-trained model (such as a human-trained model) to your target species using transfer learning. It involves generating datasets, fine-tuning the model, and then performing predictions and variant annotation. This approach is recommended when working with species for which limited training data is available.