Visualization Module

Visualization tools for calibration analysis.

This module provides plotting functions for visualizing model calibration, including reliability diagrams, confidence histograms, class-wise calibration curves, and calibration error comparisons.

calibration_toolbox.visualization.reliability_diagram(probabilities, labels, n_bins=15, logits=False, title=None, figsize=(6, 6), return_fig=False)

Plot a reliability diagram (calibration curve).

A reliability diagram visualizes the relationship between predicted confidence and actual accuracy across bins. Well-calibrated models should have points close to the diagonal identity line.

Args:
    probabilities: Array of shape (n_samples, n_classes) containing predicted probabilities for each class.
    labels: Array of shape (n_samples,) containing true class labels.
    n_bins: Number of bins for grouping predictions. Default: 15.
    logits: If True, the input is logits and will be converted to probabilities. Default: False.
    title: Plot title. If None, a default title is used. Default: None.
    figsize: Figure size as (width, height). Default: (6, 6).
    return_fig: If True, return the figure and axis objects. Default: False.

Returns:
    If return_fig is True, returns a (fig, ax) tuple. Otherwise, displays the plot.

Example:
>>> import numpy as np
>>> probs = np.array([[0.8, 0.2], [0.6, 0.4], [0.9, 0.1]])
>>> labels = np.array([0, 1, 0])
>>> reliability_diagram(probs, labels)

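As a sketch of what such a diagram plots, the per-bin statistics can be computed as below. This is an illustrative reimplementation under assumed conventions (equal-width bins over top-class confidence), not the toolbox's actual code, and the helper name `reliability_curve` is hypothetical:

```python
import numpy as np

def reliability_curve(probabilities, labels, n_bins=15):
    """Per-bin mean confidence and accuracy -- the points a reliability
    diagram plots against the diagonal identity line."""
    confidences = probabilities.max(axis=1)        # top-class probability
    predictions = probabilities.argmax(axis=1)
    correct = (predictions == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)      # equal-width bins
    bin_conf, bin_acc = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():                           # skip empty bins
            bin_conf.append(confidences[in_bin].mean())
            bin_acc.append(correct[in_bin].mean())
    return np.array(bin_conf), np.array(bin_acc)

probs = np.array([[0.8, 0.2], [0.6, 0.4], [0.9, 0.1]])
labels = np.array([0, 1, 0])
conf, acc = reliability_curve(probs, labels, n_bins=5)
```

A perfectly calibrated model would give `conf[i] == acc[i]` in every bin; the plotted points would then lie on the diagonal.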
calibration_toolbox.visualization.confidence_histogram(probabilities, labels, n_bins=15, logits=False, title=None, figsize=(6, 6), return_fig=False)

Plot a confidence histogram showing the distribution of model confidences.

The histogram shows how confident the model is across predictions, with vertical lines indicating average accuracy and average confidence.

Args:
    probabilities: Array of shape (n_samples, n_classes) containing predicted probabilities for each class.
    labels: Array of shape (n_samples,) containing true class labels.
    n_bins: Number of bins for the histogram. Default: 15.
    logits: If True, the input is logits and will be converted to probabilities. Default: False.
    title: Plot title. If None, a default title is used. Default: None.
    figsize: Figure size as (width, height). Default: (6, 6).
    return_fig: If True, return the figure and axis objects. Default: False.

Returns:
    If return_fig is True, returns a (fig, ax) tuple. Otherwise, displays the plot.

Example:
>>> import numpy as np
>>> probs = np.array([[0.8, 0.2], [0.6, 0.4], [0.9, 0.1]])
>>> labels = np.array([0, 1, 0])
>>> confidence_histogram(probs, labels)

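The quantities behind the histogram and its two vertical lines can be sketched as follows, assuming the histogram is taken over top-class confidences (an illustrative example, not the toolbox's internals):

```python
import numpy as np

probs = np.array([[0.8, 0.2], [0.6, 0.4], [0.9, 0.1]])
labels = np.array([0, 1, 0])

confidences = probs.max(axis=1)                       # histogram input
accuracy = (probs.argmax(axis=1) == labels).mean()    # one vertical line
avg_confidence = confidences.mean()                   # the other vertical line
counts, edges = np.histogram(confidences, bins=15, range=(0.0, 1.0))
```

A gap between `avg_confidence` and `accuracy` (here roughly 0.77 vs. 0.67) is a quick visual signal of overconfidence.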
calibration_toolbox.visualization.class_wise_calibration_curve(probabilities, labels, n_bins=15, logits=False, title=None, figsize=(8, 6), max_classes=10, return_fig=False)

Plot class-wise calibration curves.

Shows calibration curves for each class separately, useful for understanding per-class calibration behavior.

Args:
    probabilities: Array of shape (n_samples, n_classes) containing predicted probabilities for each class.
    labels: Array of shape (n_samples,) containing true class labels.
    n_bins: Number of bins for grouping predictions. Default: 15.
    logits: If True, the input is logits and will be converted to probabilities. Default: False.
    title: Plot title. If None, a default title is used. Default: None.
    figsize: Figure size as (width, height). Default: (8, 6).
    max_classes: Maximum number of classes to plot. If None, plot all classes. Default: 10.
    return_fig: If True, return the figure and axis objects. Default: False.

Returns:
    If return_fig is True, returns a (fig, ax) tuple. Otherwise, displays the plot.

Example:
>>> import numpy as np
>>> probs = np.array([[0.8, 0.15, 0.05], [0.6, 0.3, 0.1]])
>>> labels = np.array([0, 1])
>>> class_wise_calibration_curve(probs, labels)

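Each per-class curve can be understood as a one-vs-rest calibration plot: the predicted probability for class k against the empirical frequency of class k, per bin. A minimal sketch for a single class, assuming equal-width binning (the helper name `class_calibration_points` is hypothetical):

```python
import numpy as np

def class_calibration_points(probabilities, labels, class_idx, n_bins=15):
    """One-vs-rest calibration points for one class: per-bin mean
    predicted probability vs. empirical frequency of that class."""
    p = probabilities[:, class_idx]                  # this class's probabilities
    y = (labels == class_idx).astype(float)          # one-vs-rest targets
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    mean_p, freq = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (p > lo) & (p <= hi)
        if in_bin.any():
            mean_p.append(p[in_bin].mean())
            freq.append(y[in_bin].mean())
    return np.array(mean_p), np.array(freq)

probs = np.array([[0.8, 0.15, 0.05], [0.6, 0.3, 0.1]])
labels = np.array([0, 1])
mp, fr = class_calibration_points(probs, labels, class_idx=0, n_bins=4)
```

Repeating this for every class (up to `max_classes`) and drawing one curve per class gives the class-wise plot.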
calibration_toolbox.visualization.calibration_error_decomposition(probabilities, labels, n_bins=15, logits=False, figsize=(10, 6), return_fig=False)

Plot a comparison of different calibration error metrics.

Creates a bar chart comparing ECE, MCE, RMSCE, ACE, and SCE for the given predictions.

Args:
    probabilities: Array of shape (n_samples, n_classes) containing predicted probabilities for each class.
    labels: Array of shape (n_samples,) containing true class labels.
    n_bins: Number of bins for computing metrics. Default: 15.
    logits: If True, the input is logits and will be converted to probabilities. Default: False.
    figsize: Figure size as (width, height). Default: (10, 6).
    return_fig: If True, return the figure and axis objects. Default: False.

Returns:
    If return_fig is True, returns a (fig, ax) tuple. Otherwise, displays the plot.

Example:
>>> import numpy as np
>>> probs = np.array([[0.8, 0.2], [0.6, 0.4], [0.9, 0.1]])
>>> labels = np.array([0, 1, 0])
>>> calibration_error_decomposition(probs, labels)

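As a sketch of two of the compared metrics, ECE and MCE can be derived from the same per-bin confidence/accuracy gaps that drive the reliability diagram. This assumes the standard equal-width-bin definitions; the toolbox's exact conventions may differ:

```python
import numpy as np

def ece_mce(probabilities, labels, n_bins=15):
    """ECE: population-weighted mean of per-bin |confidence - accuracy| gaps.
    MCE: the single worst bin's gap."""
    conf = probabilities.max(axis=1)
    correct = (probabilities.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    n = len(conf)
    ece, mce = 0.0, 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            gap = abs(conf[in_bin].mean() - correct[in_bin].mean())
            ece += in_bin.sum() / n * gap   # weighted by bin population
            mce = max(mce, gap)             # worst single bin
    return ece, mce

probs = np.array([[0.8, 0.2], [0.6, 0.4], [0.9, 0.1]])
labels = np.array([0, 1, 0])
ece, mce = ece_mce(probs, labels, n_bins=5)
```

MCE is always at least as large as ECE, which is why the bar chart typically shows MCE towering over the averaged metrics.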