Quick Start Guide: calibrate#
This guide provides a brief walkthrough for using the calibrate subcommand in OpenSpliceAI to adjust your trained model’s probability outputs via temperature scaling.
Note
Calibration is an optional step that can further enhance the reliability of model predictions. Our research demonstrates that OpenSpliceAI’s output probabilities generally reflect real-world likelihoods. Nevertheless, running this calibration step can serve as a useful double-check, generating reliability curves and score distribution plots for your review.
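To make the idea of a reliability curve concrete, here is a minimal pure-Python sketch of how such a curve and a summary calibration error can be computed for one class. The function names `reliability_bins` and `expected_calibration_error`, the equal-width binning, and the toy data are illustrative assumptions, not OpenSpliceAI's actual plotting code.

```python
def reliability_bins(probs, labels, n_bins=10):
    """Group predictions into equal-width probability bins.

    probs  : predicted probabilities for one class, each in [0, 1]
    labels : 1 if the site truly belongs to that class, else 0
    Returns (mean_predicted_prob, observed_frequency, count) per non-empty bin.
    """
    sums = [0.0] * n_bins
    hits = [0] * n_bins
    counts = [0] * n_bins
    for p, y in zip(probs, labels):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        sums[b] += p
        hits[b] += y
        counts[b] += 1
    return [
        (sums[b] / counts[b], hits[b] / counts[b], counts[b])
        for b in range(n_bins)
        if counts[b]
    ]


def expected_calibration_error(probs, labels, n_bins=10):
    """Count-weighted average gap between predicted and observed frequency."""
    total = len(probs)
    return sum(
        count / total * abs(mean_p - freq)
        for mean_p, freq, count in reliability_bins(probs, labels, n_bins)
    )


# Toy data (hypothetical): four predictions for a single class.
toy_probs = [0.2, 0.2, 0.8, 0.8]
toy_labels = [0, 1, 1, 1]
for mean_p, freq, count in reliability_bins(toy_probs, toy_labels):
    print(f"predicted={mean_p:.2f} observed={freq:.2f} n={count}")
print(f"ECE={expected_calibration_error(toy_probs, toy_labels):.3f}")
```

A perfectly calibrated model would give points on the diagonal (predicted equals observed) in every bin; the further the points drift from it, the more a calibration step has to correct.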
Before You Begin#
Ensure you have the following prerequisites:
Pre-trained Model: Obtain a pre-trained OpenSpliceAI model in .pt format (for example, model_10000nt_rs10.pt) from either the train or transfer subcommand.
Test/Validation Dataset: Prepare an HDF5 file generated from a test set (e.g., dataset_test.h5) containing sequences and labels that were not used during training.
Check Example Scripts: We provide an example script, examples/calibrate/calibrate_example.sh.
Quick Start#
Testing Dataset: Download the test dataset: dataset_test.h5
Pre-trained Model: Download the pre-trained model from the OpenSpliceAI GitHub repository. For example: model_10000nt_rs10.pt
Run the following command to start the calibration process:
openspliceai calibrate \
--pretrained-model model_10000nt_rs10.pt \
--test-dataset dataset_test.h5 \
--flanking-size 10000 \
--output-dir ./calibration_results/
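If you prefer to launch calibration from a Python script (for example, inside a larger pipeline), the command above can be wrapped as in the sketch below. It uses only the four flags shown in this guide. The helper names `build_calibrate_command` and `run_calibration` are hypothetical, and creating the output directory in advance is an assumption made for safety, not a documented requirement.

```python
import shlex
import subprocess
from pathlib import Path


def build_calibrate_command(model, dataset, flanking_size, output_dir):
    """Assemble the calibrate invocation from this guide as an argument list."""
    return [
        "openspliceai", "calibrate",
        "--pretrained-model", str(model),
        "--test-dataset", str(dataset),
        "--flanking-size", str(flanking_size),
        "--output-dir", str(output_dir),
    ]


def run_calibration(model, dataset, flanking_size, output_dir):
    """Check that the inputs exist, then run the command.

    Raises FileNotFoundError for a missing input and
    subprocess.CalledProcessError if the command exits with an error.
    """
    for path in (model, dataset):
        if not Path(path).is_file():
            raise FileNotFoundError(f"Input file not found: {path}")
    # Assumption: pre-creating the output directory is harmless.
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    cmd = build_calibrate_command(model, dataset, flanking_size, output_dir)
    subprocess.run(cmd, check=True)


# Build (but do not run) the command with the file names used in this guide.
cmd = build_calibrate_command(
    "model_10000nt_rs10.pt", "dataset_test.h5", 10000, "./calibration_results/"
)
print(shlex.join(cmd))
```

Calling `run_calibration(...)` with the same arguments executes the command; keeping the builder separate makes it easy to log or inspect the exact invocation first.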
Key Steps in Calibration#
Model Loading: The pre-trained model is loaded and a temperature parameter (\(T\)) is introduced.
Temperature Optimization: The parameter \(T\) is optimized to better align the predicted probabilities with observed outcomes, thus improving calibration.
Output Generation: An optimized temperature parameter is saved to a temperature.pt file, and calibration plots (e.g., reliability curves) are generated in the calibration_results/ directory.
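The steps above can be illustrated with a small, self-contained sketch of temperature scaling in general. The model's logits \(z\) are divided by a single scalar \(T\) before the softmax, and \(T\) is chosen to minimize the negative log-likelihood on held-out data. The code below is not OpenSpliceAI's implementation: it is pure Python, it swaps in a golden-section search over \(\log T\) where a PyTorch-based tool would more typically use a gradient-based optimizer, and the function names, search bounds, and toy logits are all assumptions.

```python
import math


def log_softmax(logits, temperature):
    """Log-probabilities of softmax(logits / temperature), computed stably."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_norm = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_norm for s in scaled]


def mean_nll(logit_rows, labels, temperature):
    """Average negative log-likelihood of the true labels at a given T."""
    total = 0.0
    for logits, y in zip(logit_rows, labels):
        total -= log_softmax(logits, temperature)[y]
    return total / len(labels)


def fit_temperature(logit_rows, labels, t_min=0.05, t_max=20.0, iters=60):
    """Golden-section search for the T in [t_min, t_max] that minimizes NLL.

    The search runs over log(T); the NLL has a single minimum in T,
    so narrowing a bracket is sufficient.
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    lo, hi = math.log(t_min), math.log(t_max)

    def f(u):
        return mean_nll(logit_rows, labels, math.exp(u))

    a = hi - inv_phi * (hi - lo)
    b = lo + inv_phi * (hi - lo)
    fa, fb = f(a), f(b)
    for _ in range(iters):
        if fa < fb:  # minimum lies in [lo, b]
            hi, b, fb = b, a, fa
            a = hi - inv_phi * (hi - lo)
            fa = f(a)
        else:        # minimum lies in [a, hi]
            lo, a, fa = a, b, fb
            b = lo + inv_phi * (hi - lo)
            fb = f(b)
    return math.exp((lo + hi) / 2)


# Toy, overconfident 3-class logits (hypothetical): 4 of 6 rows are correct,
# yet every row assigns about 0.96 probability to its top class at T = 1.
toy_logits = [[4, 0, 0], [4, 0, 0], [4, 0, 0], [0, 4, 0], [0, 4, 0], [0, 0, 4]]
toy_labels = [0, 0, 1, 1, 2, 2]

T = fit_temperature(toy_logits, toy_labels)
print(f"fitted T = {T:.3f}")
print(f"NLL at T=1: {mean_nll(toy_logits, toy_labels, 1.0):.3f}")
print(f"NLL at fitted T: {mean_nll(toy_logits, toy_labels, T):.3f}")
```

Because dividing by a positive \(T\) never changes which class has the largest logit, temperature scaling leaves the predicted class untouched and only adjusts how confident the probabilities are. A fitted \(T > 1\) softens overconfident outputs, while \(T < 1\) sharpens underconfident ones.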
Next Steps#
Explore Calibration Options: For more details on available arguments and further customization, refer to the calibrate documentation.
Prediction: Apply your newly calibrated model to generate more reliable probability estimates by following the predict guide.

