Calibration Toolbox Documentation
Calibration Toolbox is a Python library for evaluating machine learning model calibration using binning-based metrics.
The package provides a comprehensive collection of calibration metrics and visualization tools for research in deep learning and uncertainty quantification.
Overview
Calibration Toolbox focuses on binning-based calibration metrics, which are widely used in machine learning research to measure how closely a model's predicted probabilities match the observed frequency of correct predictions.
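The core binning recipe behind these metrics is short enough to sketch in plain NumPy. The function below is illustrative only, not the toolbox's implementation; the function name, the equal-width binning, and the default bin count are assumptions made for this sketch.

import numpy as np

def binned_calibration_gaps(probabilities, labels, n_bins=10):
    # Illustrative sketch, not the library's code: per-bin calibration gaps.
    confidences = probabilities.max(axis=1)                # top-class confidence
    correct = (probabilities.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)              # equal-width bins on [0, 1]
    gaps, weights = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            # |empirical accuracy - mean confidence| within the bin
            gaps.append(abs(correct[in_bin].mean() - confidences[in_bin].mean()))
            weights.append(in_bin.mean())                  # bin's share of samples
    return np.array(gaps), np.array(weights)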
Key Features
Comprehensive Metrics: Expected Calibration Error (ECE), Maximum Calibration Error (MCE), Root Mean Squared Calibration Error (RMSCE), Adaptive Calibration Error (ACE), Static Calibration Error (SCE), and more
General Calibration Error (GCE): Flexible framework for computing various calibration metrics (see the sketch after this feature list)
Framework-Agnostic: Works with plain NumPy arrays; no PyTorch or TensorFlow required
Visualization Tools: Reliability diagrams, confidence histograms, and class-wise calibration curves
Well-Tested: Extensive test coverage with edge case handling
Research-Oriented: Implementations based on recent research papers
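To illustrate the framework idea behind GCE (a sketch of the general technique, not the library's actual API), note that the classic metrics differ mainly in how they aggregate the per-bin gaps from the sketch in the Overview; ACE, for instance, additionally swaps equal-width bins for equal-mass bins. Reusing binned_calibration_gaps with probability and label arrays like the ones in the Quick Example below:

gaps, weights = binned_calibration_gaps(probabilities, labels)
ece = np.sum(weights * gaps)                   # ECE: sample-weighted mean gap
mce = np.max(gaps)                             # MCE: largest single-bin gap
rmsce = np.sqrt(np.sum(weights * gaps ** 2))   # RMSCE: root-mean-square gap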
Quick Example
import numpy as np
from calibration_toolbox import expected_calibration_error, reliability_diagram
# Your model's predicted class probabilities (each row sums to 1)
probabilities = np.array([[0.8, 0.2], [0.6, 0.4], [0.9, 0.1]])
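# Ground-truth class labels for the three samples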
labels = np.array([0, 1, 0])
# Compute Expected Calibration Error
ece = expected_calibration_error(probabilities, labels)
print(f"ECE: {ece:.4f}")
# Visualize calibration
reliability_diagram(probabilities, labels)
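For intuition about this example: all three predictions name class 0 with confidences 0.8, 0.6, and 0.9, and two of the three are correct. If every prediction landed in a single bin, the calibration gap would be |(0.8 + 0.6 + 0.9)/3 - 2/3| ≈ 0.10; the exact value printed depends on the toolbox's default binning. An ECE of 0 means perfect calibration, so lower is better.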
Installation
Install from PyPI (once published):
pip install calibration-toolbox
Or install from source:
git clone https://github.com/Jonathan-Pearce/calibration-toolbox.git
cd calibration-toolbox
pip install -e .
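After either install, a quick smoke test (using the package name from the Quick Example) is:
python -c "import calibration_toolbox"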