cellmap_segmentation_challenge.utils.eval_utils package#
Submodules#
cellmap_segmentation_challenge.utils.eval_utils.aggregation module#
Score aggregation and result processing utilities.
- cellmap_segmentation_challenge.utils.eval_utils.aggregation.combine_scores(scores, include_missing=True, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], cast_to_none=[nan, inf, -inf, inf, -inf])[source]#
Combine scores across volumes, normalizing by the number of voxels.
- Parameters:
scores (dict) – A dictionary of scores for each volume, as returned by score_volume.
include_missing (bool) – Whether to include missing volumes in the combined scores.
instance_classes (list) – A list of instance classes.
cast_to_none (list) – A list of values to cast to None in the combined scores.
- Returns:
A dictionary of combined scores across all volumes.
- Return type:
dict
- Example usage:
combined_scores = combine_scores(scores)
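As a rough illustration of "normalizing by the number of voxels", the sketch below computes a voxel-weighted mean of one metric across volumes. The helper name `voxel_weighted_mean` and the input layout are assumptions made for illustration; the actual combine_scores implementation may differ.

```python
def voxel_weighted_mean(per_volume, metric):
    """Average `metric` over volumes, weighting each by its num_voxels.

    `per_volume` maps a volume name to a score dict holding `metric` and
    "num_voxels". This layout is a hypothetical stand-in, not the real input.
    """
    total_voxels = sum(s["num_voxels"] for s in per_volume.values())
    if total_voxels == 0:
        return None  # nothing to average
    weighted = sum(s[metric] * s["num_voxels"] for s in per_volume.values())
    return weighted / total_voxels


# Made-up values: the larger volume dominates the average.
scores = {
    "crop1": {"iou": 0.5, "num_voxels": 100},
    "crop2": {"iou": 1.0, "num_voxels": 300},
}
mean_iou = voxel_weighted_mean(scores, "iou")
```

Weighting by voxel count keeps a small crop from influencing the combined score as much as a large one.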
cellmap_segmentation_challenge.utils.eval_utils.array_utils module#
Array manipulation utilities for evaluation.
- cellmap_segmentation_challenge.utils.eval_utils.array_utils.resize_array(arr, target_shape, pad_value=0)[source]#
Resize an array to a target shape by padding or cropping as needed.
- Parameters:
arr (np.ndarray) – Input array to resize.
target_shape (tuple) – Desired shape for the output array.
pad_value (int, float, etc.) – Value to use for padding if the array is smaller than the target shape.
- Returns:
Resized array with the specified target shape.
- Return type:
np.ndarray
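A minimal pad-or-crop sketch with NumPy is shown below. It assumes that both padding and cropping are anchored at index 0 of every axis; the real resize_array may centre the data instead, so treat `resize_array_sketch` as a hypothetical stand-in.

```python
import numpy as np


def resize_array_sketch(arr, target_shape, pad_value=0):
    """Pad or crop `arr` at the trailing end of each axis to `target_shape`.

    Assumption: padding and cropping are anchored at index 0. The real
    resize_array may centre the data instead.
    """
    arr = np.asarray(arr)
    if arr.ndim != len(target_shape):
        raise ValueError("target_shape must have one entry per array axis")
    # Crop first: keep at most target_shape[d] entries along every axis.
    cropped = arr[tuple(slice(0, t) for t in target_shape)]
    # Then pad whatever is still missing with pad_value.
    pad_width = [(0, t - s) for s, t in zip(cropped.shape, target_shape)]
    return np.pad(cropped, pad_width, mode="constant", constant_values=pad_value)


small = np.ones((2, 5), dtype=np.uint8)
resized = resize_array_sketch(small, (3, 4), pad_value=7)  # pads axis 0, crops axis 1
```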
cellmap_segmentation_challenge.utils.eval_utils.config module#
Configuration for evaluation pipeline.
- class cellmap_segmentation_challenge.utils.eval_utils.config.EvaluationConfig(*args: Any, max_instance_threads: Any = None, max_semantic_threads: Any = None, **kwargs: Any)[source]#
Bases: object
Configuration for evaluation pipeline.
All parameters can be set via environment variables or passed directly. Environment variables take precedence over defaults but not over explicitly passed values.
The legacy keyword arguments max_instance_threads and max_semantic_threads are accepted for backward constructor compatibility and map to max_workers. When both max_workers and a legacy argument are provided, max_workers takes precedence.
- Parameters:
args (Any)
max_instance_threads (Any)
max_semantic_threads (Any)
kwargs (Any)
- max_workers: int = 4#
- per_instance_threads: int = 25#
- max_distance_cap_eps: float = 0.0001#
- final_instance_ratio_cutoff: float = 10.0#
- initial_instance_ratio_cutoff: float = 50.0#
- instance_ratio_factor: float = 5.0#
- max_overlap_edges: int = 5000000#
- mcmf_cost_scale: int = 1000000#
- truth_path: UPath#
- instance_classes: list[str]#
- cast_to_none: list[Any]#
- property max_instance_threads: int#
Deprecated: use max_workers instead.
- property max_semantic_threads: int#
Deprecated: use max_workers instead.
- classmethod from_env() EvaluationConfig[source]#
Load configuration from environment variables with defaults.
MAX_WORKERS takes precedence. When it is unset, the legacy variables MAX_INSTANCE_THREADS and MAX_SEMANTIC_THREADS are consulted in that order as a fallback (each triggers a DeprecationWarning). If none of the three is set, the computed default is used.
- Returns:
EvaluationConfig with values from environment or defaults.
- Return type:
EvaluationConfig
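The precedence rule for the worker count can be sketched as follows. `resolve_max_workers` is a hypothetical helper, and the fallback of 4 is only borrowed from the max_workers attribute default listed above; the "computed default" used by from_env may be derived differently.

```python
import warnings


def resolve_max_workers(env, default=4):
    """Pick a worker count from an environment-like mapping.

    MAX_WORKERS wins; otherwise the legacy variables are tried in order,
    emitting a DeprecationWarning for the one that is used.
    """
    if "MAX_WORKERS" in env:
        return int(env["MAX_WORKERS"])
    for legacy in ("MAX_INSTANCE_THREADS", "MAX_SEMANTIC_THREADS"):
        if legacy in env:
            warnings.warn(
                f"{legacy} is deprecated; use MAX_WORKERS instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            return int(env[legacy])
    return default  # assumption: stands in for the computed default
```

In practice the mapping would be `os.environ`, for example `resolve_max_workers(os.environ)`.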
- cellmap_segmentation_challenge.utils.eval_utils.config.ratio_cutoff(nG: int, R_base: float = 10.0, R_extra: float = 50.0, k: float = 5.0) float[source]#
Calculate the acceptable ratio cutoff for instance matching.
The ratio cutoff decreases exponentially as the number of ground truth instances increases, allowing for more tolerance with fewer instances.
- Parameters:
nG (int) – Number of ground truth instances
R_base (float) – Base ratio cutoff (minimum)
R_extra (float) – Extra ratio tolerance for small nG
k (float) – Exponential decay factor
- Returns:
Maximum acceptable ratio of predicted to ground truth instances
- Return type:
float
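The exact formula is not given here. One form that is consistent with the description (exponential decay from R_base + R_extra towards R_base as nG grows) is sketched below; the expression itself is an assumption, not taken from the library source.

```python
import math


def ratio_cutoff_sketch(nG, R_base=10.0, R_extra=50.0, k=5.0):
    """Assumed form: R_base + R_extra * exp(-nG / k).

    This matches the documented behaviour (exponential decay towards
    R_base) but is not taken from the library source.
    """
    return R_base + R_extra * math.exp(-nG / k)


# The cutoff shrinks as the number of ground truth instances grows.
for n in (0, 5, 50):
    print(n, round(ratio_cutoff_sketch(n), 3))
```

Under this assumed form, a crop with very few ground truth instances tolerates a much larger prediction-to-truth ratio than a crop with many instances.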
cellmap_segmentation_challenge.utils.eval_utils.distance module#
Distance metrics including Hausdorff distance computation.
- cellmap_segmentation_challenge.utils.eval_utils.distance.compute_max_distance(voxel_size, shape) float[source]#
Compute the maximum distance used for distance-based metrics, based on the maximum distance to a volume boundary.
- Return type:
float
- cellmap_segmentation_challenge.utils.eval_utils.distance.normalize_distance(distance: float, voxel_size) float[source]#
Normalize a distance value to [0, 1] using the maximum distance represented by a voxel.
- Parameters:
distance (float)
- Return type:
float
- cellmap_segmentation_challenge.utils.eval_utils.distance.optimized_hausdorff_distances(truth_label, pred_label, voxel_size, hausdorff_distance_max, method='standard', percentile: float | None = None)[source]#
Compute per-truth-instance Hausdorff-like distances against the (already remapped) prediction using multithreading. Returns a 1D float32 numpy array whose i-th entry corresponds to truth_ids[i].
- Parameters:
truth_label (np.ndarray) – Ground-truth instance label volume (0 == background).
pred_label (np.ndarray) – Prediction instance label volume that has already been remapped to align with the GT ids (0 == background).
voxel_size (Sequence[float]) – Physical voxel sizes in Z, Y, X (or Y, X) order.
hausdorff_distance_max (float) – Cap for distances (use np.inf for uncapped).
method ({"standard", "modified", "percentile"}) –
"standard" -> classic Hausdorff (max of directed maxima)
"modified" -> mean of directed distances, then max of the two means
"percentile" -> use the given percentile of directed distances (requires percentile to be provided).
percentile (float | None) – Percentile (0-100) used when method == "percentile".
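The three method options can be illustrated on small point sets. The sketch below is pure Python and assumes the points are already in physical units; the function names are hypothetical, and taking the maximum of the two directed percentiles for "percentile" is an assumption about how the directions are combined.

```python
import math


def directed_distances(src, dst):
    """For every point in src, the distance to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def percentile_of(values, pct):
    """Linear-interpolation percentile for pct in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * pct / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def hausdorff_sketch(a, b, method="standard", percentile=None, cap=math.inf):
    """Symmetric Hausdorff-like distance between two non-empty point sets."""
    ab, ba = directed_distances(a, b), directed_distances(b, a)
    if method == "standard":
        d = max(max(ab), max(ba))
    elif method == "modified":
        d = max(sum(ab) / len(ab), sum(ba) / len(ba))
    elif method == "percentile":
        if percentile is None:
            raise ValueError("percentile is required for method='percentile'")
        # Assumption: combine the two directed percentiles with max.
        d = max(percentile_of(ab, percentile), percentile_of(ba, percentile))
    else:
        raise ValueError(f"unknown method: {method}")
    return min(d, cap)  # cap plays the role of hausdorff_distance_max


truth = [(0.0, 0.0), (0.0, 1.0)]
pred = [(0.0, 0.0), (0.0, 4.0)]
standard = hausdorff_sketch(truth, pred)
modified = hausdorff_sketch(truth, pred, method="modified")
```

The brute-force nearest-neighbour search is quadratic and only suitable for illustration; the function name above suggests the library uses a faster strategy.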
- cellmap_segmentation_challenge.utils.eval_utils.distance.bbox_for_label(stats: StatisticsDict | StatisticsSlicesDict, ndim: int, label_id: int)[source]#
Try to get bbox without allocating a full boolean mask using cc3d statistics. Falls back to mask-based bbox if cc3d doesn’t provide expected fields. Returns (mins, maxs) inclusive-exclusive in voxel indices, or None if missing.
- Parameters:
stats (StatisticsDict | StatisticsSlicesDict)
ndim (int)
label_id (int)
- cellmap_segmentation_challenge.utils.eval_utils.distance.roi_slices_for_pair(truth_stats: StatisticsDict | StatisticsSlicesDict, pred_stats: StatisticsDict | StatisticsSlicesDict, tid: int, voxel_size, ndim: int, shape: tuple[int, ...], max_distance: float)[source]#
ROI = union(bbox(truth==tid), bbox(pred==tid)) padded by P derived from max_distance. Returns tuple of slices suitable for numpy indexing.
- Parameters:
truth_stats (StatisticsDict | StatisticsSlicesDict)
pred_stats (StatisticsDict | StatisticsSlicesDict)
tid (int)
ndim (int)
shape (tuple[int, ...])
max_distance (float)
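The ROI construction can be sketched as a union of two bounding boxes followed by padding and clipping. How the padding P is derived from max_distance is not stated, so the per-axis rule ceil(max_distance / voxel_size[d]) used below is an assumption, as is the helper name. A finite max_distance is also assumed.

```python
import math


def padded_roi_slices(truth_bbox, pred_bbox, voxel_size, shape, max_distance):
    """Union two (mins, maxs) boxes, pad them, and clip to the volume.

    Boxes use inclusive mins and exclusive maxs in voxel indices. The
    per-axis padding ceil(max_distance / voxel_size[d]) is an assumption,
    and max_distance must be finite.
    """
    boxes = [b for b in (truth_bbox, pred_bbox) if b is not None]
    if not boxes:
        return None
    slices = []
    for d, (size, vox) in enumerate(zip(shape, voxel_size)):
        pad = math.ceil(max_distance / vox)
        lo = min(b[0][d] for b in boxes) - pad
        hi = max(b[1][d] for b in boxes) + pad
        slices.append(slice(max(lo, 0), min(hi, size)))
    return tuple(slices)


roi = padded_roi_slices(
    truth_bbox=((2, 10), (5, 20)),
    pred_bbox=((4, 8), (9, 15)),
    voxel_size=(8.0, 4.0),
    shape=(10, 64),
    max_distance=16.0,
)
```

The resulting tuple can index a NumPy array directly, for example `truth_label[roi]`.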
- cellmap_segmentation_challenge.utils.eval_utils.distance.compute_hausdorff_distance_roi(truth_label: ndarray, truth_stats: StatisticsDict | StatisticsSlicesDict, pred_label: ndarray, pred_stats: StatisticsDict | StatisticsSlicesDict, tid: int, voxel_size, max_distance: float, method: str = 'standard', percentile: float | None = None)[source]#
Same metric as compute_hausdorff_distance(), but operates on an ROI slice and builds masks only inside ROI.
- Parameters:
truth_label (ndarray)
truth_stats (StatisticsDict | StatisticsSlicesDict)
pred_label (ndarray)
pred_stats (StatisticsDict | StatisticsSlicesDict)
tid (int)
max_distance (float)
method (str)
percentile (float | None)
cellmap_segmentation_challenge.utils.eval_utils.exceptions module#
Custom exceptions for evaluation pipeline.
- exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.EvaluationError[source]#
Bases: Exception
Base exception for evaluation errors.
- exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.TooManyInstancesError(n_pred: int, n_gt: int, ratio: float, cutoff: float)[source]#
Bases: EvaluationError
Raised when a submission has too many instances relative to ground truth.
This is a pathological case where the ratio of predicted to ground truth instances exceeds acceptable thresholds, likely indicating poor segmentation.
- Parameters:
n_pred (int)
n_gt (int)
ratio (float)
cutoff (float)
- exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.TooManyOverlapEdgesError(n_edges: int, max_edges: int)[source]#
Bases: EvaluationError
Raised when instance matching produces too many overlap edges.
This indicates computational infeasibility for the matching algorithm.
- Parameters:
n_edges (int)
max_edges (int)
- exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.MatchingFailedError(status: int)[source]#
Bases: EvaluationError
Raised when instance matching optimization fails.
- Parameters:
status (int)
- exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.ValidationError[source]#
Bases: EvaluationError
Raised when input validation fails.
cellmap_segmentation_challenge.utils.eval_utils.instance_matching module#
Instance matching using min-cost flow optimization.
- class cellmap_segmentation_challenge.utils.eval_utils.instance_matching.InstanceOverlapData(nG: int, nP: int, rows: ndarray, cols: ndarray, iou_vals: ndarray)[source]#
Bases: object
Data structure for instance overlap computation.
- Parameters:
nG (int)
nP (int)
rows (ndarray)
cols (ndarray)
iou_vals (ndarray)
- nG: int#
- nP: int#
- rows: ndarray#
- cols: ndarray#
- iou_vals: ndarray#
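The field names suggest a sparse (COO-style) list of overlapping instance pairs. The sketch below computes such a list from two flattened label sequences in pure Python. The interpretation of rows and cols as indices into the sorted non-zero GT and prediction ids is an assumption based on the attribute names, and `sparse_iou` is a hypothetical helper.

```python
from collections import Counter


def sparse_iou(gt_flat, pred_flat):
    """Return (nG, nP, rows, cols, iou_vals) for overlapping instance pairs.

    rows/cols index into the sorted non-zero GT / prediction ids. This
    field interpretation is an assumption based on the attribute names.
    """
    if len(gt_flat) != len(pred_flat):
        raise ValueError("label sequences must have the same length")
    gt_ids = sorted({g for g in gt_flat if g != 0})
    pred_ids = sorted({p for p in pred_flat if p != 0})
    gt_index = {g: i for i, g in enumerate(gt_ids)}
    pred_index = {p: j for j, p in enumerate(pred_ids)}
    gt_size = Counter(g for g in gt_flat if g != 0)
    pred_size = Counter(p for p in pred_flat if p != 0)
    # Count voxels where both labels are foreground.
    overlap = Counter((g, p) for g, p in zip(gt_flat, pred_flat) if g and p)
    rows, cols, iou_vals = [], [], []
    for (g, p), inter in sorted(overlap.items()):
        union = gt_size[g] + pred_size[p] - inter
        rows.append(gt_index[g])
        cols.append(pred_index[p])
        iou_vals.append(inter / union)
    return len(gt_ids), len(pred_ids), rows, cols, iou_vals


gt = [0, 1, 1, 1, 2, 2]
pred = [0, 7, 7, 0, 9, 9]
nG, nP, rows, cols, iou_vals = sparse_iou(gt, pred)
```

Storing only the overlapping pairs keeps the number of edges proportional to actual overlaps rather than to nG * nP.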
- cellmap_segmentation_challenge.utils.eval_utils.instance_matching.match_instances(gt: ndarray, pred: ndarray, config: EvaluationConfig | None = None) dict[int, int][source]#
Match instances between GT and prediction based on IoU.
Uses min-cost flow optimization to find optimal 1:1 matching between predicted and ground truth instances based on IoU overlap.
- Parameters:
gt (ndarray) – Ground truth instance labels (0 = background)
pred (ndarray) – Predicted instance labels (0 = background)
config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)
- Returns:
Dictionary mapping predicted instance ID to ground truth instance ID. Returns {0: 0} if only background present. Returns {} if no matches found or one side has no instances.
- Raises:
ValidationError – If array shapes don’t match
TooManyInstancesError – If pred/GT ratio exceeds threshold
TooManyOverlapEdgesError – If overlap computation is too large
MatchingFailedError – If optimization fails
- Return type:
dict[int, int]
Example
>>> mapping = match_instances(gt, pred)
>>> # Remap predictions to match GT IDs
>>> pred_aligned = remap(pred, mapping, preserve_missing_labels=True)
cellmap_segmentation_challenge.utils.eval_utils.scoring module#
Core scoring functions for segmentation evaluation.
- cellmap_segmentation_challenge.utils.eval_utils.scoring.score_instance(pred_label, truth_label, voxel_size, hausdorff_distance_max=None, config: EvaluationConfig | None = None) InstanceScoreDict[source]#
Score instance segmentation against ground truth.
Computes instance F1 score, Hausdorff distance, and combined metrics after optimal instance matching.
- Parameters:
pred_label – Predicted instance labels (0 = background)
truth_label – Ground truth instance labels (0 = background)
voxel_size – Physical voxel size in (Z, Y, X) order
hausdorff_distance_max – Maximum Hausdorff distance cap (None = auto)
config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)
- Returns:
Dictionary containing all instance segmentation metrics
- Return type:
InstanceScoreDict
Example
>>> scores = score_instance(pred, truth, voxel_size=(4.0, 4.0, 4.0))
>>> print(f"Combined score: {scores['combined_score']:.3f}")
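The tp, fp, and fn counts reported in the instance score dictionary relate to F1 through the standard definition, sketched below. How the library decides that a matched pair counts as a true positive is not stated here, and returning 0.0 when all counts are zero is an assumption.

```python
def instance_f1(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN).

    Assumption: returns 0.0 when all three counts are zero.
    """
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


f1 = instance_f1(tp=8, fp=2, fn=2)
```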
- cellmap_segmentation_challenge.utils.eval_utils.scoring.score_semantic(pred_label, truth_label) dict[str, float][source]#
Score a single semantic label volume against the ground truth semantic label volume.
- Parameters:
pred_label (np.ndarray) – The predicted semantic label volume.
truth_label (np.ndarray) – The ground truth semantic label volume.
- Returns:
A dictionary of scores for the semantic label volume.
- Return type:
dict
- Example usage:
scores = score_semantic(pred_label, truth_label)
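For reference, the standard IoU and Dice definitions on binary masks can be sketched in pure Python as below. `iou_and_dice` is a hypothetical helper working on flattened sequences, and scoring two empty masks as a perfect match is an assumption; score_semantic may handle that case differently.

```python
def iou_and_dice(pred_flat, truth_flat):
    """Compute IoU and Dice for two equal-length binary sequences."""
    if len(pred_flat) != len(truth_flat):
        raise ValueError("inputs must have the same length")
    pred = [bool(v) for v in pred_flat]
    truth = [bool(v) for v in truth_flat]
    inter = sum(p and t for p, t in zip(pred, truth))
    union = sum(p or t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Assumption: two empty masks count as a perfect match.
    iou = inter / union if union else 1.0
    dice = 2 * inter / total if total else 1.0
    return {"iou": iou, "dice_score": dice}


example_scores = iou_and_dice([1, 1, 0, 0], [1, 0, 1, 0])
```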
- cellmap_segmentation_challenge.utils.eval_utils.scoring.score_label(pred_label_path, label_name, crop_name, truth_path='/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'])[source]#
Score a single label volume against the ground truth label volume.
- Parameters:
pred_label_path (str) – The path to the predicted label volume.
truth_path (str) – The path to the ground truth label volume.
instance_classes (list) – A list of instance classes.
- Returns:
A dictionary of scores for the label volume.
- Return type:
dict
- Example usage:
scores = score_label('pred.zarr/test_volume/label1')
- cellmap_segmentation_challenge.utils.eval_utils.scoring.empty_label_score(label, crop_name, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], truth_path='/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/data/ground_truth.zarr')[source]#
cellmap_segmentation_challenge.utils.eval_utils.submission module#
Submission processing and main evaluation entry point.
- cellmap_segmentation_challenge.utils.eval_utils.submission.ensure_zgroup(path: UPath) Group | Array[source]#
Ensure that the given path can be opened as a zarr Group. If a .zgroup is not present, add it.
- Parameters:
path (UPath)
- Return type:
Group | Array
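In Zarr v2, a directory is recognised as a group when it contains a `.zgroup` file holding `{"zarr_format": 2}`. The sketch below covers only the "add it if absent" step, using pathlib.Path in place of UPath and without opening the group; `ensure_zgroup_sketch` is a hypothetical stand-in.

```python
import json
import tempfile
from pathlib import Path


def ensure_zgroup_sketch(path):
    """Create a Zarr v2 `.zgroup` marker in `path` if it is missing.

    Uses pathlib.Path instead of UPath and does not open the group, so it
    only mirrors the "add it if absent" part of the description.
    """
    marker = Path(path) / ".zgroup"
    if not marker.exists():
        marker.parent.mkdir(parents=True, exist_ok=True)
        marker.write_text(json.dumps({"zarr_format": 2}))
    return marker


# Try it in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    marker = ensure_zgroup_sketch(Path(tmp) / "submission.zarr")
    content = json.loads(marker.read_text())
```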
- cellmap_segmentation_challenge.utils.eval_utils.submission.get_evaluation_args(volumes, submission_path, truth_path='/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance']) list[tuple][source]#
Get the arguments for scoring each label in the submission.
- Parameters:
volumes (list) – A list of volumes to score.
submission_path (str) – The path to the submission volume.
truth_path (str) – The path to the ground truth volume.
instance_classes (list) – A list of instance classes.
- Returns:
A list of tuples containing the arguments for each label to be scored.
- Return type:
list[tuple]
- cellmap_segmentation_challenge.utils.eval_utils.submission.missing_volume_score(truth_path, volume, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance']) dict[source]#
Score a missing volume as all zeros, consistent with the score_volume function.
- Parameters:
truth_path (str) – The path to the ground truth volume.
volume (str) – The name of the volume.
instance_classes (list) – A list of instance classes.
- Returns:
A dictionary of scores for the volume.
- Return type:
dict
- Example usage:
scores = missing_volume_score('truth.zarr/test_volume')
- cellmap_segmentation_challenge.utils.eval_utils.submission.ensure_valid_submission(submission_path: UPath)[source]#
Ensure that the unzipped submission path is a valid Zarr-2 file.
- Parameters:
submission_path (str) – The path to the unzipped submission Zarr-2 file.
- Raises:
ValueError – If the submission is not a valid unzipped Zarr-2 file.
- cellmap_segmentation_challenge.utils.eval_utils.submission.score_submission(submission_path='/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/data/submission.zip', result_file=None, truth_path='/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], config: EvaluationConfig | None = None)[source]#
Score a submission against the ground truth data.
This is the main entry point for evaluating a submission. It unzips, validates, scores, and aggregates results for all volumes.
- Parameters:
submission_path – Path to the zipped submission Zarr-2 file
result_file – Path to save the scores (None to skip saving)
truth_path – Path to the ground truth Zarr-2 file
instance_classes – List of instance segmentation classes
config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)
- Returns:
Dictionary of aggregated scores across all volumes
- Raises:
ValueError – If submission format is invalid
RuntimeError – If volume renaming fails
Example
>>> scores = score_submission('submission.zip', 'results.json')
>>> print(f"Overall score: {scores['overall_score']:.4f}")
- Results structure:
{
    "cropN": {  # Per-volume scores
        "label_name": {
            # Instance segmentation
            "f1": float,
            "hausdorff_distance": float,
            "combined_score": float,
            # OR semantic segmentation
            "iou": float,
            "dice_score": float,
        }
    },
    "label_scores": {  # Aggregated per-label
        "label_name": {...}
    },
    "overall_instance_score": float,
    "overall_semantic_score": float,
    "overall_score": float,
}
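A returned dictionary with the results structure described above can be split into per-volume entries and overall scores as sketched below. The helper name and all numeric values are made up for illustration; the sketch assumes every top-level key other than "label_scores" and the "overall_*" entries names a volume.

```python
def summarize_results(results):
    """Split a results dict into per-volume entries and overall scores.

    Assumption: every top-level key other than "label_scores" and the
    "overall_*" entries names a volume.
    """
    overall = {k: v for k, v in results.items() if k.startswith("overall_")}
    per_volume = {
        k: v
        for k, v in results.items()
        if k != "label_scores" and not k.startswith("overall_")
    }
    return per_volume, overall


# Made-up example values in the documented shape.
example = {
    "crop1": {"mito": {"f1": 0.8, "hausdorff_distance": 12.0, "combined_score": 0.7}},
    "crop2": {"er": {"iou": 0.6, "dice_score": 0.75}},
    "label_scores": {"mito": {"combined_score": 0.7}, "er": {"iou": 0.6}},
    "overall_instance_score": 0.7,
    "overall_semantic_score": 0.6,
    "overall_score": 0.65,
}
per_volume, overall = summarize_results(example)
```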
cellmap_segmentation_challenge.utils.eval_utils.types module#
Type definitions for evaluation metrics.
- class cellmap_segmentation_challenge.utils.eval_utils.types.InstanceScoreDict[source]#
Bases: TypedDict
Type definition for instance segmentation scores.
- f1: float#
- tp: int#
- fp: int#
- fn: int#
- binary_accuracy: float#
- hausdorff_distance: float#
- normalized_hausdorff_distance: float#
- combined_score: float#
- iou: float#
- dice_score: float#
- num_voxels: int#
- voxel_size: tuple[float, ...]#
- is_missing: bool#
- status: Literal['scored', 'skipped_too_many_instances', 'missing', 'matching_failed']#
- class cellmap_segmentation_challenge.utils.eval_utils.types.SemanticScoreDict[source]#
Bases: TypedDict
Type definition for semantic segmentation scores.
- iou: float#
- dice_score: float#
- binary_accuracy: float#
- num_voxels: int#
- voxel_size: tuple[float, ...]#
- is_missing: bool#
- status: Literal['scored', 'missing']#
cellmap_segmentation_challenge.utils.eval_utils.zip_utils module#
Zip file handling utilities for submissions.
- cellmap_segmentation_challenge.utils.eval_utils.zip_utils.unzip_file(zip_path, max_uncompressed_size: int = 53687091200)[source]#
Unzip a zip file to a specified directory.
Validates against path traversal (zip slip), symlink attacks, and decompression bombs before extracting.
- Parameters:
zip_path (str) – The path to the zip file.
max_uncompressed_size (int) – Maximum total uncompressed size in bytes.
- Raises:
ValueError – If zip_path is a directory rather than a zip file.
ValidationError – If any member fails security checks or total size exceeds limit.
- Example usage:
unzip_file('submission.zip')
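The three checks named above (path traversal, symlinks, decompression bombs) can be sketched with the standard zipfile module. This is an illustrative pre-extraction check under stated assumptions, not the library's implementation: `check_zip_members` is hypothetical, it raises ValueError as a stand-in for ValidationError, and it trusts the sizes declared in the archive headers.

```python
import io
import stat
import zipfile
from pathlib import PurePosixPath


def check_zip_members(zf, max_uncompressed_size):
    """Raise ValueError if any member looks unsafe or the total is too big."""
    total = 0
    for info in zf.infolist():
        name = PurePosixPath(info.filename)
        # Zip slip: reject absolute paths and any ".." component.
        if name.is_absolute() or ".." in name.parts:
            raise ValueError(f"unsafe path in archive: {info.filename}")
        # Symlinks: the Unix mode lives in the high 16 bits of external_attr.
        if stat.S_ISLNK(info.external_attr >> 16):
            raise ValueError(f"symlink in archive: {info.filename}")
        # Decompression bomb: sum the declared uncompressed sizes.
        total += info.file_size
        if total > max_uncompressed_size:
            raise ValueError("archive exceeds the uncompressed size limit")
    return total


# Build an in-memory archive containing a path-traversal entry.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("submission.zarr/.zgroup", '{"zarr_format": 2}')
    zf.writestr("../escape.txt", "oops")

with zipfile.ZipFile(buffer) as zf:
    try:
        check_zip_members(zf, max_uncompressed_size=1024)
        rejected = False
    except ValueError:
        rejected = True
```

Running all checks before extracting anything means a malicious archive never writes a file to disk.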
Module contents#
Evaluation utilities for cellmap segmentation challenge.
This module provides all evaluation-related functionality including:
- Scoring for instance and semantic segmentation
- Instance matching using min-cost flow optimization
- Hausdorff distance and other metrics
- Submission processing and validation
- class cellmap_segmentation_challenge.utils.eval_utils.InstanceScoreDict[source]#
Bases: TypedDict
Type definition for instance segmentation scores.
- f1: float#
- tp: int#
- fp: int#
- fn: int#
- binary_accuracy: float#
- hausdorff_distance: float#
- normalized_hausdorff_distance: float#
- combined_score: float#
- iou: float#
- dice_score: float#
- num_voxels: int#
- voxel_size: tuple[float, ...]#
- is_missing: bool#
- status: Literal['scored', 'skipped_too_many_instances', 'missing', 'matching_failed']#
- class cellmap_segmentation_challenge.utils.eval_utils.SemanticScoreDict[source]#
Bases: TypedDict
Type definition for semantic segmentation scores.
- iou: float#
- dice_score: float#
- binary_accuracy: float#
- num_voxels: int#
- voxel_size: tuple[float, ...]#
- is_missing: bool#
- status: Literal['scored', 'missing']#
- exception cellmap_segmentation_challenge.utils.eval_utils.EvaluationError[source]#
Bases: Exception
Base exception for evaluation errors.
- exception cellmap_segmentation_challenge.utils.eval_utils.TooManyInstancesError(n_pred: int, n_gt: int, ratio: float, cutoff: float)[source]#
Bases: EvaluationError
Raised when a submission has too many instances relative to ground truth.
This is a pathological case where the ratio of predicted to ground truth instances exceeds acceptable thresholds, likely indicating poor segmentation.
- Parameters:
n_pred (int)
n_gt (int)
ratio (float)
cutoff (float)
- exception cellmap_segmentation_challenge.utils.eval_utils.TooManyOverlapEdgesError(n_edges: int, max_edges: int)[source]#
Bases: EvaluationError
Raised when instance matching produces too many overlap edges.
This indicates computational infeasibility for the matching algorithm.
- Parameters:
n_edges (int)
max_edges (int)
- exception cellmap_segmentation_challenge.utils.eval_utils.MatchingFailedError(status: int)[source]#
Bases: EvaluationError
Raised when instance matching optimization fails.
- Parameters:
status (int)
- exception cellmap_segmentation_challenge.utils.eval_utils.ValidationError[source]#
Bases: EvaluationError
Raised when input validation fails.
- class cellmap_segmentation_challenge.utils.eval_utils.EvaluationConfig(*args: Any, max_instance_threads: Any = None, max_semantic_threads: Any = None, **kwargs: Any)[source]#
Bases: object
Configuration for evaluation pipeline.
All parameters can be set via environment variables or passed directly. Environment variables take precedence over defaults but not over explicitly passed values.
The legacy keyword arguments max_instance_threads and max_semantic_threads are accepted for backward constructor compatibility and map to max_workers. When both max_workers and a legacy argument are provided, max_workers takes precedence.
- Parameters:
args (Any)
max_instance_threads (Any)
max_semantic_threads (Any)
kwargs (Any)
- max_workers: int = 4#
- per_instance_threads: int = 25#
- max_distance_cap_eps: float = 0.0001#
- final_instance_ratio_cutoff: float = 10.0#
- initial_instance_ratio_cutoff: float = 50.0#
- instance_ratio_factor: float = 5.0#
- max_overlap_edges: int = 5000000#
- mcmf_cost_scale: int = 1000000#
- truth_path: UPath#
- instance_classes: list[str]#
- cast_to_none: list[Any]#
- property max_instance_threads: int#
Deprecated: use max_workers instead.
- property max_semantic_threads: int#
Deprecated: use max_workers instead.
- classmethod from_env() EvaluationConfig[source]#
Load configuration from environment variables with defaults.
MAX_WORKERS takes precedence. When it is unset, the legacy variables MAX_INSTANCE_THREADS and MAX_SEMANTIC_THREADS are consulted in that order as a fallback (each triggers a DeprecationWarning). If none of the three is set, the computed default is used.
- Returns:
EvaluationConfig with values from environment or defaults.
- Return type:
EvaluationConfig
- cellmap_segmentation_challenge.utils.eval_utils.ratio_cutoff(nG: int, R_base: float = 10.0, R_extra: float = 50.0, k: float = 5.0) float[source]#
Calculate the acceptable ratio cutoff for instance matching.
The ratio cutoff decreases exponentially as the number of ground truth instances increases, allowing for more tolerance with fewer instances.
- Parameters:
nG (int) – Number of ground truth instances
R_base (float) – Base ratio cutoff (minimum)
R_extra (float) – Extra ratio tolerance for small nG
k (float) – Exponential decay factor
- Returns:
Maximum acceptable ratio of predicted to ground truth instances
- Return type:
float
- class cellmap_segmentation_challenge.utils.eval_utils.InstanceOverlapData(nG: int, nP: int, rows: ndarray, cols: ndarray, iou_vals: ndarray)[source]#
Bases: object
Data structure for instance overlap computation.
- Parameters:
nG (int)
nP (int)
rows (ndarray)
cols (ndarray)
iou_vals (ndarray)
- nG: int#
- nP: int#
- rows: ndarray#
- cols: ndarray#
- iou_vals: ndarray#
- cellmap_segmentation_challenge.utils.eval_utils.match_instances(gt: ndarray, pred: ndarray, config: EvaluationConfig | None = None) dict[int, int][source]#
Match instances between GT and prediction based on IoU.
Uses min-cost flow optimization to find optimal 1:1 matching between predicted and ground truth instances based on IoU overlap.
- Parameters:
gt (ndarray) – Ground truth instance labels (0 = background)
pred (ndarray) – Predicted instance labels (0 = background)
config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)
- Returns:
Dictionary mapping predicted instance ID to ground truth instance ID. Returns {0: 0} if only background present. Returns {} if no matches found or one side has no instances.
- Raises:
ValidationError – If array shapes don’t match
TooManyInstancesError – If pred/GT ratio exceeds threshold
TooManyOverlapEdgesError – If overlap computation is too large
MatchingFailedError – If optimization fails
- Return type:
dict[int, int]
Example
>>> mapping = match_instances(gt, pred)
>>> # Remap predictions to match GT IDs
>>> pred_aligned = remap(pred, mapping, preserve_missing_labels=True)
- cellmap_segmentation_challenge.utils.eval_utils.compute_max_distance(voxel_size, shape) float[source]#
Compute the maximum distance used for distance-based metrics, based on the maximum distance to a volume boundary.
- Return type:
float
- cellmap_segmentation_challenge.utils.eval_utils.normalize_distance(distance: float, voxel_size) float[source]#
Normalize a distance value to [0, 1] using the maximum distance represented by a voxel.
- Parameters:
distance (float)
- Return type:
float
- cellmap_segmentation_challenge.utils.eval_utils.optimized_hausdorff_distances(truth_label, pred_label, voxel_size, hausdorff_distance_max, method='standard', percentile: float | None = None)[source]#
Compute per-truth-instance Hausdorff-like distances against the (already remapped) prediction using multithreading. Returns a 1D float32 numpy array whose i-th entry corresponds to truth_ids[i].
- Parameters:
truth_label (np.ndarray) – Ground-truth instance label volume (0 == background).
pred_label (np.ndarray) – Prediction instance label volume that has already been remapped to align with the GT ids (0 == background).
voxel_size (Sequence[float]) – Physical voxel sizes in Z, Y, X (or Y, X) order.
hausdorff_distance_max (float) – Cap for distances (use np.inf for uncapped).
method ({"standard", "modified", "percentile"}) –
"standard" -> classic Hausdorff (max of directed maxima)
"modified" -> mean of directed distances, then max of the two means
"percentile" -> use the given percentile of directed distances (requires percentile to be provided).
percentile (float | None) – Percentile (0-100) used when method == "percentile".
- cellmap_segmentation_challenge.utils.eval_utils.bbox_for_label(stats: StatisticsDict | StatisticsSlicesDict, ndim: int, label_id: int)[source]#
Try to get bbox without allocating a full boolean mask using cc3d statistics. Falls back to mask-based bbox if cc3d doesn’t provide expected fields. Returns (mins, maxs) inclusive-exclusive in voxel indices, or None if missing.
- Parameters:
stats (StatisticsDict | StatisticsSlicesDict)
ndim (int)
label_id (int)
- cellmap_segmentation_challenge.utils.eval_utils.roi_slices_for_pair(truth_stats: StatisticsDict | StatisticsSlicesDict, pred_stats: StatisticsDict | StatisticsSlicesDict, tid: int, voxel_size, ndim: int, shape: tuple[int, ...], max_distance: float)[source]#
ROI = union(bbox(truth==tid), bbox(pred==tid)) padded by P derived from max_distance. Returns tuple of slices suitable for numpy indexing.
- Parameters:
truth_stats (StatisticsDict | StatisticsSlicesDict)
pred_stats (StatisticsDict | StatisticsSlicesDict)
tid (int)
ndim (int)
shape (tuple[int, ...])
max_distance (float)
- cellmap_segmentation_challenge.utils.eval_utils.compute_hausdorff_distance_roi(truth_label: ndarray, truth_stats: StatisticsDict | StatisticsSlicesDict, pred_label: ndarray, pred_stats: StatisticsDict | StatisticsSlicesDict, tid: int, voxel_size, max_distance: float, method: str = 'standard', percentile: float | None = None)[source]#
Same metric as compute_hausdorff_distance(), but operates on an ROI slice and builds masks only inside ROI.
- Parameters:
truth_label (ndarray)
truth_stats (StatisticsDict | StatisticsSlicesDict)
pred_label (ndarray)
pred_stats (StatisticsDict | StatisticsSlicesDict)
tid (int)
max_distance (float)
method (str)
percentile (float | None)
- cellmap_segmentation_challenge.utils.eval_utils.score_instance(pred_label, truth_label, voxel_size, hausdorff_distance_max=None, config: EvaluationConfig | None = None) InstanceScoreDict[source]#
Score instance segmentation against ground truth.
Computes instance F1 score, Hausdorff distance, and combined metrics after optimal instance matching.
- Parameters:
pred_label – Predicted instance labels (0 = background)
truth_label – Ground truth instance labels (0 = background)
voxel_size – Physical voxel size in (Z, Y, X) order
hausdorff_distance_max – Maximum Hausdorff distance cap (None = auto)
config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)
- Returns:
Dictionary containing all instance segmentation metrics
- Return type:
InstanceScoreDict
Example
>>> scores = score_instance(pred, truth, voxel_size=(4.0, 4.0, 4.0))
>>> print(f"Combined score: {scores['combined_score']:.3f}")
- cellmap_segmentation_challenge.utils.eval_utils.score_semantic(pred_label, truth_label) dict[str, float][source]#
Score a single semantic label volume against the ground truth semantic label volume.
- Parameters:
pred_label (np.ndarray) – The predicted semantic label volume.
truth_label (np.ndarray) – The ground truth semantic label volume.
- Returns:
A dictionary of scores for the semantic label volume.
- Return type:
dict
- Example usage:
scores = score_semantic(pred_label, truth_label)
- cellmap_segmentation_challenge.utils.eval_utils.score_label(pred_label_path, label_name, crop_name, truth_path='/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'])[source]#
Score a single label volume against the ground truth label volume.
- Parameters:
pred_label_path (str) – The path to the predicted label volume.
truth_path (str) – The path to the ground truth label volume.
instance_classes (list) – A list of instance classes.
- Returns:
A dictionary of scores for the label volume.
- Return type:
dict
- Example usage:
scores = score_label('pred.zarr/test_volume/label1')
- cellmap_segmentation_challenge.utils.eval_utils.empty_label_score(label, crop_name, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], truth_path='/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/data/ground_truth.zarr')[source]#
- cellmap_segmentation_challenge.utils.eval_utils.match_crop_space(path, class_label, voxel_size, shape, translation) ndarray[source]#
- Return type:
ndarray
- cellmap_segmentation_challenge.utils.eval_utils.combine_scores(scores, include_missing=True, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], cast_to_none=[nan, inf, -inf, inf, -inf])[source]#
Combine scores across volumes, normalizing by the number of voxels.
- Parameters:
scores (dict) – A dictionary of scores for each volume, as returned by score_volume.
include_missing (bool) – Whether to include missing volumes in the combined scores.
instance_classes (list) – A list of instance classes.
cast_to_none (list) – A list of values to cast to None in the combined scores.
- Returns:
A dictionary of combined scores across all volumes.
- Return type:
dict
- Example usage:
combined_scores = combine_scores(scores)
- cellmap_segmentation_challenge.utils.eval_utils.sanitize_scores(scores)[source]#
Sanitize scores by converting NaN values to None.
- Parameters:
scores (dict) – A dictionary of scores.
- Returns:
A sanitized dictionary of scores.
- Return type:
dict
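A recursive NaN-to-None conversion, which makes a score dictionary safe to serialise as strict JSON, can be sketched as follows. `sanitize_scores_sketch` is a hypothetical stand-in; whether the real function also descends into lists is not stated here.

```python
import math


def sanitize_scores_sketch(value):
    """Recursively replace float NaN with None in nested dicts and lists.

    Tuples are returned as lists, mirroring what JSON would store.
    """
    if isinstance(value, dict):
        return {k: sanitize_scores_sketch(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize_scores_sketch(v) for v in value]
    if isinstance(value, float) and math.isnan(value):
        return None
    return value


raw = {"crop1": {"mito": {"f1": float("nan"), "iou": 0.5}}, "overall_score": 0.5}
clean = sanitize_scores_sketch(raw)
```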
- cellmap_segmentation_challenge.utils.eval_utils.update_scores(scores, results, result_file, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'])[source]#
- cellmap_segmentation_challenge.utils.eval_utils.ensure_zgroup(path: UPath) Group | Array[source]#
Ensure that the given path can be opened as a zarr Group. If a .zgroup is not present, add it.
- Parameters:
path (UPath)
- Return type:
Group | Array
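In the Zarr v2 layout a group is marked by a .zgroup file containing the JSON object {"zarr_format": 2}. The sketch below shows that step alone for a local directory. `add_zgroup_if_missing` is a hypothetical helper; the real function takes a UPath and returns the opened zarr object, which this sketch does not attempt.

```python
import json
import tempfile
from pathlib import Path


def add_zgroup_if_missing(path):
    """Write a Zarr v2 group marker if absent (hypothetical helper)."""
    marker = Path(path) / ".zgroup"
    if not marker.exists():
        marker.write_text(json.dumps({"zarr_format": 2}))
        return True  # marker was created
    return False  # marker already existed


with tempfile.TemporaryDirectory() as tmp:
    created_first = add_zgroup_if_missing(tmp)
    created_second = add_zgroup_if_missing(tmp)
    metadata = json.loads((Path(tmp) / ".zgroup").read_text())
```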
- cellmap_segmentation_challenge.utils.eval_utils.ensure_valid_submission(submission_path: UPath)[source]#
Ensure that the unzipped submission path is a valid Zarr-2 file.
- Parameters:
submission_path (str) – The path to the unzipped submission Zarr-2 file.
- Raises:
ValueError – If the submission is not a valid unzipped Zarr-2 file.
- cellmap_segmentation_challenge.utils.eval_utils.get_evaluation_args(volumes, submission_path, truth_path='/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance']) list[tuple][source]#
Get the arguments for scoring each label in the submission.
- Parameters:
volumes (list) – A list of volumes to score.
submission_path (str) – The path to the submission volume.
truth_path (str) – The path to the ground truth volume.
instance_classes (list) – A list of instance classes.
- Returns:
A list of tuples containing the arguments for each label to be scored.
- Return type:
list[tuple]
- cellmap_segmentation_challenge.utils.eval_utils.missing_volume_score(truth_path, volume, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance']) dict[source]#
Score a missing volume as all zeros, consistent with the output of the score_volume function.
- Parameters:
truth_path (str) – The path to the ground truth volume.
volume (str) – The name of the volume.
instance_classes (list) – A list of instance classes.
- Returns:
A dictionary of scores for the volume.
- Return type:
dict
- Example usage:
scores = missing_volume_score('truth.zarr', 'test_volume')
- cellmap_segmentation_challenge.utils.eval_utils.score_submission(submission_path='/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/data/submission.zip', result_file=None, truth_path='/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], config: EvaluationConfig | None = None)[source]#
Score a submission against the ground truth data.
This is the main entry point for evaluating a submission. It unzips, validates, scores, and aggregates results for all volumes.
- Parameters:
submission_path – Path to the zipped submission Zarr-2 file
result_file – Path to save the scores (None to skip saving)
truth_path – Path to the ground truth Zarr-2 file
instance_classes – List of instance segmentation classes
config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)
- Returns:
Dictionary of aggregated scores across all volumes
- Raises:
ValueError – If submission format is invalid
RuntimeError – If volume renaming fails
Example
>>> scores = score_submission('submission.zip', 'results.json')
>>> print(f"Overall score: {scores['overall_score']:.4f}")
- Results structure:
{
    "cropN": {  # Per-volume scores
        "label_name": {
            # Instance segmentation
            "f1": float,
            "hausdorff_distance": float,
            "combined_score": float,
            # OR semantic segmentation
            "iou": float,
            "dice_score": float,
        }
    },
    "label_scores": {  # Aggregated per-label
        "label_name": {...}
    },
    "overall_instance_score": float,
    "overall_semantic_score": float,
    "overall_score": float,
}
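As an illustration of reading the returned dictionary, the sketch below separates per-volume entries from the aggregate keys and groups labels by the metric family they report. The `results` dict is hand-written to match the structure documented above; its label names and numbers are made up.

```python
# Hypothetical results dict shaped like the documented structure.
results = {
    "crop1": {
        "mito": {"f1": 0.7, "hausdorff_distance": 12.0, "combined_score": 0.6},
        "er": {"iou": 0.5, "dice_score": 0.66},
    },
    "label_scores": {
        "mito": {"combined_score": 0.6},
        "er": {"iou": 0.5},
    },
    "overall_instance_score": 0.6,
    "overall_semantic_score": 0.5,
    "overall_score": 0.55,
}

# Every top-level key that is not an aggregate is a per-volume entry.
summary_keys = {
    "label_scores",
    "overall_instance_score",
    "overall_semantic_score",
    "overall_score",
}
per_volume = {k: v for k, v in results.items() if k not in summary_keys}

# Instance labels carry "combined_score"; semantic labels carry "iou".
instance_labels = sorted(
    label
    for crop in per_volume.values()
    for label, label_scores in crop.items()
    if "combined_score" in label_scores
)
semantic_labels = sorted(
    label
    for crop in per_volume.values()
    for label, label_scores in crop.items()
    if "iou" in label_scores
)
```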
- cellmap_segmentation_challenge.utils.eval_utils.resize_array(arr, target_shape, pad_value=0)[source]#
Resize an array to a target shape by padding or cropping as needed.
- Parameters:
arr (np.ndarray) – Input array to resize.
target_shape (tuple) – Desired shape for the output array.
pad_value (int, float, etc.) – Value to use for padding if the array is smaller than the target shape.
- Returns:
Resized array with the specified target shape.
- Return type:
np.ndarray
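A minimal sketch of pad-or-crop resizing with NumPy. It assumes cropping and padding are both anchored at index 0 of each axis; the package's resize_array may instead center the data, so treat `resize_sketch` as a hypothetical illustration only.

```python
import numpy as np


def resize_sketch(arr, target_shape, pad_value=0):
    """Crop or pad `arr` to `target_shape` (hypothetical; anchored at index 0)."""
    # Crop any axis that is longer than the target.
    cropped = arr[tuple(slice(0, t) for t in target_shape)]
    # Pad any axis that is still shorter than the target, at its trailing edge.
    pad_width = [(0, t - s) for s, t in zip(cropped.shape, target_shape)]
    return np.pad(cropped, pad_width, mode="constant", constant_values=pad_value)


# Axis 0 is too short (padded), axis 1 is too long (cropped).
small = np.ones((2, 5), dtype=np.uint8)
resized = resize_sketch(small, (4, 3))
```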
- cellmap_segmentation_challenge.utils.eval_utils.unzip_file(zip_path, max_uncompressed_size: int = 53687091200)[source]#
Unzip a zip file to a specified directory.
Validates against path traversal (zip slip), symlink attacks, and decompression bombs before extracting.
- Parameters:
zip_path (str) – The path to the zip file.
max_uncompressed_size (int) – Maximum total uncompressed size in bytes.
- Raises:
ValueError – If zip_path is a directory rather than a zip file.
ValidationError – If any member fails security checks or total size exceeds limit.
- Example usage:
unzip_file('submission.zip')
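The three checks named above can be sketched with the standard zipfile module. This is not the package's implementation: `check_zip_members` is a hypothetical pre-check, it raises ValueError rather than the ValidationError documented here, and it relies on the sizes declared in the archive headers.

```python
import stat
import tempfile
import zipfile
from pathlib import Path


def check_zip_members(zip_path, dest_dir, max_uncompressed_size):
    """Raise ValueError if any member looks unsafe (hypothetical pre-check)."""
    dest = Path(dest_dir).resolve()
    total = 0
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # Zip slip: the resolved target must stay inside the destination.
            target = (dest / info.filename).resolve()
            if not target.is_relative_to(dest):
                raise ValueError(f"path traversal: {info.filename}")
            # Symlinks: Unix mode bits live in the high 16 bits of external_attr.
            if stat.S_ISLNK(info.external_attr >> 16):
                raise ValueError(f"symlink member: {info.filename}")
            # Decompression bomb: sum the declared uncompressed sizes.
            total += info.file_size
            if total > max_uncompressed_size:
                raise ValueError("archive exceeds uncompressed size limit")
    return total


# Demo: one well-behaved archive and one containing a traversal path.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.zip"
    with zipfile.ZipFile(good, "w") as archive:
        archive.writestr("crop1/mito/data.txt", "hello")
    good_total = check_zip_members(good, Path(tmp) / "out", 1024)

    bad = Path(tmp) / "bad.zip"
    with zipfile.ZipFile(bad, "w") as archive:
        archive.writestr("../evil.txt", "x")
    try:
        check_zip_members(bad, Path(tmp) / "out", 1024)
        traversal_rejected = False
    except ValueError:
        traversal_rejected = True
```

Running all checks before extracting anything means a rejected archive leaves no partial output on disk.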