Quick Start Guide: transfer
This guide walks you through the essential steps for using the transfer subcommand to fine-tune your own OpenSpliceAI model. Starting from a pre-trained model (for example, a human-trained model), you can use HDF5 datasets generated with the create-data subcommand to train a tailored deep learning model for splice site prediction.
Before You Begin
Installation: Follow the instructions on the Installation page to install OpenSpliceAI along with all necessary dependencies.
Pre-trained Model: Obtain a pre-trained OpenSpliceAI model in .pt format (e.g., model_10000nt_rs10.pt).
Dataset Preparation: Generate the training and testing datasets for your species of interest using the create-data subcommand. See the Quick Start Guide: create-data for details. You will need dataset_train.h5 and dataset_test.h5.
Check Example Scripts: We provide an example script, examples/transfer/transfer_example.sh.
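Before launching a run, it can help to confirm that the inputs listed above are actually in place. The following is a minimal sketch and not part of OpenSpliceAI: check_inputs is a hypothetical helper name, and the file names are the ones used in this guide, assumed to sit in the current directory.

```shell
#!/bin/sh
# check_inputs: hypothetical helper (not an OpenSpliceAI command).
# Returns 0 if every given path is an existing, non-empty regular file,
# and 1 otherwise, reporting each problem path on stderr.
check_inputs() {
    status=0
    for path in "$@"; do
        if [ ! -f "$path" ] || [ ! -s "$path" ]; then
            echo "missing or empty: $path" >&2
            status=1
        fi
    done
    return "$status"
}

# File names follow this guide; adjust the paths to your own layout.
if check_inputs dataset_train.h5 dataset_test.h5 model_10000nt_rs10.pt; then
    echo "All inputs found."
else
    echo "Some inputs are missing; see the messages above."
fi
```

The check only looks at file presence and size. It does not validate the HDF5 contents or the model checkpoint.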
One-liner Start
If you have already generated the necessary files with the create-data subcommand (or if you prefer to download them directly from GitHub), make sure you have the following files:
Training Dataset: dataset_train.h5
Testing Dataset: dataset_test.h5
Pre-trained Model: Download a pre-trained model from the OpenSpliceAI GitHub repository. For example: model_10000nt_rs10.pt
Execute the following command to initiate transfer learning:
openspliceai transfer \
--train-dataset dataset_train.h5 \
--test-dataset dataset_test.h5 \
--pretrained-model model_10000nt_rs10.pt \
--flanking-size 10000 \
--unfreeze-all \
--epochs 10 \
--early-stopping \
--project-name new_species_transfer \
--output-dir ./transfer_out/
This command will:
Load the pre-trained model (model_10000nt_rs10.pt).
Unfreeze all layers (using --unfreeze-all).
Fine-tune the model on your custom dataset for up to 10 epochs (--early-stopping may end training sooner), saving logs and checkpoints in the transfer_out/ directory.
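If you expect to rerun the command with different inputs, a small wrapper keeps the paths in one place. The sketch below is one possible arrangement: the variable names and the DRY_RUN switch are assumptions made for illustration, while every openspliceai flag is copied unchanged from the command in this guide. By default it only prints the command.

```shell
#!/bin/sh
# Illustrative wrapper around the transfer command from this guide.
# TRAIN, TEST, MODEL, OUT and DRY_RUN are names invented for this sketch.
TRAIN="dataset_train.h5"
TEST="dataset_test.h5"
MODEL="model_10000nt_rs10.pt"
OUT="./transfer_out/"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command, 0 = execute it

run_transfer() {
    # Build the argument list once so the printed and executed commands match.
    set -- openspliceai transfer \
        --train-dataset "$TRAIN" \
        --test-dataset "$TEST" \
        --pretrained-model "$MODEL" \
        --flanking-size 10000 \
        --unfreeze-all \
        --epochs 10 \
        --early-stopping \
        --project-name new_species_transfer \
        --output-dir "$OUT"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

run_transfer
```

Setting DRY_RUN=0 in the environment before running the script executes the command instead of printing it.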
Note
The model produced by this example is not optimized for splice site prediction, as it was fine-tuned only on a small subset of the data. This example is intended solely to demonstrate the transfer-learning process. For a fully optimized, pre-trained model, please refer to the Released OpenSpliceAI models guide.
Next Steps
After completing transfer learning, consider the following actions:
Explore transfer Options: Review the transfer documentation to discover additional customization options for your transfer-learning process.
Calibration (Optional): Enhance the reliability of your model’s probability outputs by following the guidelines in the Quick Start Guide: calibrate.
Prediction: To deploy your newly trained model for splice site prediction, see the Quick Start Guide: predict.
Advanced Options: Experiment with further training parameters (such as adjusting the number of epochs or the patience value) to optimize model performance.
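One way to experiment with the number of epochs is to generate one command per setting and keep each run in its own output directory. The following is a sketch under stated assumptions: print_sweep and the per-run naming scheme are hypothetical, and only flags that appear in this guide are used. The flag for the patience value is not shown in this guide, so it is left out here; check the transfer documentation for its exact name.

```shell
#!/bin/sh
# Sketch of a small sweep over --epochs.
# print_sweep and the "_e<N>" naming are assumptions for illustration;
# it prints commands rather than running them.
print_sweep() {
    for epochs in "$@"; do
        echo "openspliceai transfer" \
            "--train-dataset dataset_train.h5" \
            "--test-dataset dataset_test.h5" \
            "--pretrained-model model_10000nt_rs10.pt" \
            "--flanking-size 10000" \
            "--unfreeze-all" \
            "--epochs $epochs" \
            "--early-stopping" \
            "--project-name new_species_transfer_e$epochs" \
            "--output-dir ./transfer_out_e$epochs/"
    done
}

# Print one command per setting; review them, then run the ones you want.
print_sweep 5 10 20
```

Printing the commands first makes it easy to inspect them or save them to a file before committing compute time to the runs.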

