cellmap_segmentation_challenge.utils.eval_utils package#
Submodules#
cellmap_segmentation_challenge.utils.eval_utils.aggregation module#
Score aggregation and result processing utilities.
- cellmap_segmentation_challenge.utils.eval_utils.aggregation.combine_scores(scores, include_missing=True, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], cast_to_none=[nan, inf, -inf, inf, -inf])[source]#
Combine scores across volumes, normalizing by the number of voxels.
- Parameters:
scores (dict) – A dictionary of scores for each volume, as returned by score_volume.
include_missing (bool) – Whether to include missing volumes in the combined scores.
instance_classes (list) – A list of instance classes.
cast_to_none (list) – A list of values to cast to None in the combined scores.
- Returns:
A dictionary of combined scores across all volumes.
- Return type:
dict
- Example usage:
combined_scores = combine_scores(scores)
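To illustrate the voxel-count normalization described above, here is a minimal sketch of voxel-weighted aggregation for one label. The score dictionaries and key names are hypothetical, not the package's exact schema:

```python
# Hypothetical per-volume scores keyed by volume name; num_voxels weights the average.
scores = {
    "crop1": {"mito": {"iou": 0.80, "num_voxels": 1000}},
    "crop2": {"mito": {"iou": 0.60, "num_voxels": 3000}},
}
total = sum(v["mito"]["num_voxels"] for v in scores.values())
weighted_iou = sum(
    v["mito"]["iou"] * v["mito"]["num_voxels"] for v in scores.values()
) / total
print(round(weighted_iou, 3))  # 0.65
```

Weighting by voxel count keeps a large volume with many voxels from being averaged on equal footing with a tiny crop.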
cellmap_segmentation_challenge.utils.eval_utils.array_utils module#
Array manipulation utilities for evaluation.
- cellmap_segmentation_challenge.utils.eval_utils.array_utils.resize_array(arr, target_shape, pad_value=0)[source]#
Resize an array to a target shape by padding or cropping as needed.
- Parameters:
arr (np.ndarray) – Input array to resize.
target_shape (tuple) – Desired shape for the output array.
pad_value (int, float, etc.) – Value to use for padding if the array is smaller than the target shape.
- Returns:
Resized array with the specified target shape.
- Return type:
np.ndarray
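The pad-or-crop behavior can be sketched as follows. This is a hypothetical re-implementation (the function name `resize_array_sketch` and end-of-axis padding are assumptions about the real `resize_array`):

```python
import numpy as np

def resize_array_sketch(arr, target_shape, pad_value=0):
    """Pad or crop each axis of arr to match target_shape (illustrative sketch)."""
    # Crop axes that are larger than the target.
    cropped = arr[tuple(slice(0, t) for t in target_shape)]
    # Pad axes that are smaller than the target (padding appended at the end).
    pad_width = [(0, max(0, t - s)) for s, t in zip(cropped.shape, target_shape)]
    return np.pad(cropped, pad_width, mode="constant", constant_values=pad_value)

a = np.ones((3, 5))
out = resize_array_sketch(a, (4, 4))
print(out.shape)  # (4, 4)
```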
cellmap_segmentation_challenge.utils.eval_utils.config module#
Configuration for evaluation pipeline.
- class cellmap_segmentation_challenge.utils.eval_utils.config.EvaluationConfig(max_instance_threads: int = 3, max_semantic_threads: int = 25, per_instance_threads: int = 25, max_distance_cap_eps: float = 0.0001, final_instance_ratio_cutoff: float = 10.0, initial_instance_ratio_cutoff: float = 50.0, instance_ratio_factor: float = 5.0, max_overlap_edges: int = 5000000, mcmf_cost_scale: int = 1000000, truth_path: UPath = <factory>, instance_classes: list[str] = <factory>, cast_to_none: list[Any] = <factory>)[source]#
Bases: object
Configuration for evaluation pipeline.
All parameters can be set via environment variables or passed directly. Environment variables take precedence over defaults but not over explicitly passed values.
- Parameters:
max_instance_threads (int)
max_semantic_threads (int)
per_instance_threads (int)
max_distance_cap_eps (float)
final_instance_ratio_cutoff (float)
initial_instance_ratio_cutoff (float)
instance_ratio_factor (float)
max_overlap_edges (int)
mcmf_cost_scale (int)
truth_path (UPath)
instance_classes (list[str])
cast_to_none (list[Any])
- max_instance_threads: int = 3#
- max_semantic_threads: int = 25#
- per_instance_threads: int = 25#
- max_distance_cap_eps: float = 0.0001#
- final_instance_ratio_cutoff: float = 10.0#
- initial_instance_ratio_cutoff: float = 50.0#
- instance_ratio_factor: float = 5.0#
- max_overlap_edges: int = 5000000#
- mcmf_cost_scale: int = 1000000#
- truth_path: UPath#
- instance_classes: list[str]#
- cast_to_none: list[Any]#
- classmethod from_env() EvaluationConfig[source]#
Load configuration from environment variables with defaults.
- Returns:
EvaluationConfig with values from environment or defaults.
- Return type:
EvaluationConfig
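The stated precedence (explicit values over environment variables over defaults) can be sketched with a simplified stand-in for the config class. `MiniConfig` and the environment-variable name `MAX_INSTANCE_THREADS` are assumptions for illustration only:

```python
import os
from dataclasses import dataclass

@dataclass
class MiniConfig:
    max_instance_threads: int = 3

    @classmethod
    def from_env(cls, **overrides):
        kwargs = {}
        env_val = os.environ.get("MAX_INSTANCE_THREADS")  # assumed variable name
        if env_val is not None:
            kwargs["max_instance_threads"] = int(env_val)  # env beats default
        kwargs.update(overrides)  # explicitly passed values beat the env
        return cls(**kwargs)

os.environ["MAX_INSTANCE_THREADS"] = "8"
print(MiniConfig.from_env().max_instance_threads)  # 8
print(MiniConfig.from_env(max_instance_threads=2).max_instance_threads)  # 2
```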
- cellmap_segmentation_challenge.utils.eval_utils.config.ratio_cutoff(nG: int, R_base: float = 10.0, R_extra: float = 50.0, k: float = 5.0) float[source]#
Calculate the acceptable ratio cutoff for instance matching.
The ratio cutoff decreases exponentially as the number of ground truth instances increases, allowing for more tolerance with fewer instances.
- Parameters:
nG (int) – Number of ground truth instances
R_base (float) – Base ratio cutoff (minimum)
R_extra (float) – Extra ratio tolerance for small nG
k (float) – Exponential decay factor
- Returns:
Maximum acceptable ratio of predicted to ground truth instances
- Return type:
float
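One plausible realization of the exponential decay described above is a cutoff that starts at `R_base + R_extra` for very few ground truth instances and decays toward the `R_base` floor. The exact formula below is an assumption consistent with the parameter descriptions, not the package's verified implementation:

```python
import math

def ratio_cutoff_sketch(nG, R_base=10.0, R_extra=50.0, k=5.0):
    # Assumed form: decays exponentially from R_base + R_extra toward R_base.
    return R_base + R_extra * math.exp(-nG / k)

print(ratio_cutoff_sketch(0))             # 60.0 (most lenient: very few GT instances)
print(round(ratio_cutoff_sketch(50), 3))  # 10.002 (approaches the R_base floor)
```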
cellmap_segmentation_challenge.utils.eval_utils.distance module#
Distance metrics including Hausdorff distance computation.
- cellmap_segmentation_challenge.utils.eval_utils.distance.compute_max_distance(voxel_size, shape) float[source]#
Compute the maximum distance used for distance-based metrics, based on the maximum distance to a volume boundary.
- Return type:
float
- cellmap_segmentation_challenge.utils.eval_utils.distance.normalize_distance(distance: float, voxel_size) float[source]#
Normalize a distance value to [0, 1] using the maximum distance represented by a voxel.
- Parameters:
distance (float)
- Return type:
float
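A hedged sketch of how these two functions might relate: take the physical diagonal of the volume as the distance cap and clip-and-scale into [0, 1]. Both the diagonal choice and the extra `shape` argument to the normalizer are illustrative assumptions; the real `normalize_distance` takes only `(distance, voxel_size)`:

```python
import math

def compute_max_distance_sketch(voxel_size, shape):
    # Assumed cap: the physical-space diagonal of the volume.
    return math.sqrt(sum((v * s) ** 2 for v, s in zip(voxel_size, shape)))

def normalize_distance_sketch(distance, voxel_size, shape):
    max_d = compute_max_distance_sketch(voxel_size, shape)
    return min(distance, max_d) / max_d  # clip at the cap, then scale to [0, 1]

print(normalize_distance_sketch(0.0, (4.0, 4.0), (100, 100)))  # 0.0
print(normalize_distance_sketch(1e9, (4.0, 4.0), (100, 100)))  # 1.0
```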
- cellmap_segmentation_challenge.utils.eval_utils.distance.optimized_hausdorff_distances(truth_label, pred_label, voxel_size, hausdorff_distance_max, method='standard', percentile: float | None = None)[source]#
Compute per-truth-instance Hausdorff-like distances against the (already remapped) prediction using multithreading. Returns a 1D float32 numpy array whose i-th entry corresponds to truth_ids[i].
- Parameters:
truth_label (np.ndarray) – Ground-truth instance label volume (0 == background).
pred_label (np.ndarray) – Prediction instance label volume that has already been remapped to align with the GT ids (0 == background).
voxel_size (Sequence[float]) – Physical voxel sizes in Z, Y, X (or Y, X) order.
hausdorff_distance_max (float) – Cap for distances (use np.inf for uncapped).
method ({"standard", "modified", "percentile"}) –
"standard" -> classic Hausdorff (max of directed maxima)
"modified" -> mean of directed distances in each direction, then max of the two means
"percentile" -> the given percentile of directed distances (requires percentile to be provided)
percentile (float | None) – Percentile (0-100) used when method == "percentile".
- cellmap_segmentation_challenge.utils.eval_utils.distance.bbox_for_label(stats: StatisticsDict | StatisticsSlicesDict, ndim: int, label_id: int)[source]#
Try to get bbox without allocating a full boolean mask using cc3d statistics. Falls back to mask-based bbox if cc3d doesn’t provide expected fields. Returns (mins, maxs) inclusive-exclusive in voxel indices, or None if missing.
- Parameters:
stats (StatisticsDict | StatisticsSlicesDict)
ndim (int)
label_id (int)
- cellmap_segmentation_challenge.utils.eval_utils.distance.roi_slices_for_pair(truth_stats: StatisticsDict | StatisticsSlicesDict, pred_stats: StatisticsDict | StatisticsSlicesDict, tid: int, voxel_size, ndim: int, shape: tuple[int, ...], max_distance: float)[source]#
ROI = union(bbox(truth==tid), bbox(pred==tid)) padded by P derived from max_distance. Returns tuple of slices suitable for numpy indexing.
- Parameters:
truth_stats (StatisticsDict | StatisticsSlicesDict)
pred_stats (StatisticsDict | StatisticsSlicesDict)
tid (int)
ndim (int)
shape (tuple[int, ...])
max_distance (float)
- cellmap_segmentation_challenge.utils.eval_utils.distance.compute_hausdorff_distance_roi(truth_label: ndarray, truth_stats: StatisticsDict | StatisticsSlicesDict, pred_label: ndarray, pred_stats: StatisticsDict | StatisticsSlicesDict, tid: int, voxel_size, max_distance: float, method: str = 'standard', percentile: float | None = None)[source]#
Same metric as compute_hausdorff_distance(), but operates on an ROI slice and builds masks only inside ROI.
- Parameters:
truth_label (ndarray)
truth_stats (StatisticsDict | StatisticsSlicesDict)
pred_label (ndarray)
pred_stats (StatisticsDict | StatisticsSlicesDict)
tid (int)
max_distance (float)
method (str)
percentile (float | None)
cellmap_segmentation_challenge.utils.eval_utils.exceptions module#
Custom exceptions for evaluation pipeline.
- exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.EvaluationError[source]#
Bases: Exception
Base exception for evaluation errors.
- exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.TooManyInstancesError(n_pred: int, n_gt: int, ratio: float, cutoff: float)[source]#
Bases: EvaluationError
Raised when submission has too many instances relative to ground truth.
This is a pathological case where the ratio of predicted to ground truth instances exceeds acceptable thresholds, likely indicating poor segmentation.
- Parameters:
n_pred (int)
n_gt (int)
ratio (float)
cutoff (float)
- exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.TooManyOverlapEdgesError(n_edges: int, max_edges: int)[source]#
Bases: EvaluationError
Raised when instance matching produces too many overlap edges.
This indicates computational infeasibility for the matching algorithm.
- Parameters:
n_edges (int)
max_edges (int)
- exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.MatchingFailedError(status: int)[source]#
Bases: EvaluationError
Raised when instance matching optimization fails.
- Parameters:
status (int)
- exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.ValidationError[source]#
Bases: EvaluationError
Raised when input validation fails.
cellmap_segmentation_challenge.utils.eval_utils.instance_matching module#
Instance matching using min-cost flow optimization.
- class cellmap_segmentation_challenge.utils.eval_utils.instance_matching.InstanceOverlapData(nG: int, nP: int, rows: ndarray, cols: ndarray, iou_vals: ndarray)[source]#
Bases: object
Data structure for instance overlap computation.
- Parameters:
nG (int)
nP (int)
rows (ndarray)
cols (ndarray)
iou_vals (ndarray)
- nG: int#
- nP: int#
- rows: ndarray#
- cols: ndarray#
- iou_vals: ndarray#
- cellmap_segmentation_challenge.utils.eval_utils.instance_matching.match_instances(gt: ndarray, pred: ndarray, config: EvaluationConfig | None = None) dict[int, int][source]#
Match instances between GT and prediction based on IoU.
Uses min-cost flow optimization to find optimal 1:1 matching between predicted and ground truth instances based on IoU overlap.
- Parameters:
gt (ndarray) – Ground truth instance labels (0 = background)
pred (ndarray) – Predicted instance labels (0 = background)
config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)
- Returns:
Dictionary mapping predicted instance ID to ground truth instance ID. Returns {0: 0} if only background present. Returns {} if no matches found or one side has no instances.
- Raises:
ValidationError – If array shapes don’t match
TooManyInstancesError – If pred/GT ratio exceeds threshold
TooManyOverlapEdgesError – If overlap computation is too large
MatchingFailedError – If optimization fails
- Return type:
dict[int, int]
Example
>>> mapping = match_instances(gt, pred)
>>> # Remap predictions to match GT IDs
>>> pred_aligned = remap(pred, mapping, preserve_missing_labels=True)
cellmap_segmentation_challenge.utils.eval_utils.scoring module#
Core scoring functions for segmentation evaluation.
- cellmap_segmentation_challenge.utils.eval_utils.scoring.score_instance(pred_label, truth_label, voxel_size, hausdorff_distance_max=None, config: EvaluationConfig | None = None) InstanceScoreDict[source]#
Score instance segmentation against ground truth.
Computes pixel-wise accuracy, Hausdorff distance, and combined metrics after optimal instance matching.
- Parameters:
pred_label – Predicted instance labels (0 = background)
truth_label – Ground truth instance labels (0 = background)
voxel_size – Physical voxel size in (Z, Y, X) order
hausdorff_distance_max – Maximum Hausdorff distance cap (None = auto)
config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)
- Returns:
Dictionary containing all instance segmentation metrics
- Return type:
InstanceScoreDict
Example
>>> scores = score_instance(pred, truth, voxel_size=(4.0, 4.0, 4.0))
>>> print(f"Combined score: {scores['combined_score']:.3f}")
- cellmap_segmentation_challenge.utils.eval_utils.scoring.score_semantic(pred_label, truth_label) dict[str, float][source]#
Score a single semantic label volume against the ground truth semantic label volume.
- Parameters:
pred_label (np.ndarray) – The predicted semantic label volume.
truth_label (np.ndarray) – The ground truth semantic label volume.
- Returns:
A dictionary of scores for the semantic label volume.
- Return type:
dict
- Example usage:
scores = score_semantic(pred_label, truth_label)
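A minimal sketch of the semantic metrics such a function typically reports (IoU and Dice on binarized volumes). The function name, exact metric set, and key names are assumptions, not the package's verified output:

```python
import numpy as np

def semantic_scores_sketch(pred, truth):
    p, t = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(p, t).sum()
    union = np.logical_or(p, t).sum()
    iou = intersection / union if union else 1.0
    denom = p.sum() + t.sum()
    dice = 2 * intersection / denom if denom else 1.0
    return {"iou": float(iou), "dice_score": float(dice)}

pred = np.array([[1, 1], [0, 0]])
truth = np.array([[1, 0], [0, 0]])
s = semantic_scores_sketch(pred, truth)
print(round(s["iou"], 3), round(s["dice_score"], 3))  # 0.5 0.667
```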
- cellmap_segmentation_challenge.utils.eval_utils.scoring.score_label(pred_label_path, label_name, crop_name, truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'])[source]#
Score a single label volume against the ground truth label volume.
- Parameters:
pred_label_path (str) – The path to the predicted label volume.
truth_path (str) – The path to the ground truth label volume.
instance_classes (list) – A list of instance classes.
- Returns:
A dictionary of scores for the label volume.
- Return type:
dict
- Example usage:
scores = score_label('pred.zarr/test_volume/label1')
- cellmap_segmentation_challenge.utils.eval_utils.scoring.empty_label_score(label, crop_name, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr')[source]#
cellmap_segmentation_challenge.utils.eval_utils.submission module#
Submission processing and main evaluation entry point.
- cellmap_segmentation_challenge.utils.eval_utils.submission.ensure_zgroup(path: UPath) Group[source]#
Ensure that the given path can be opened as a zarr Group. If a .zgroup is not present, add it.
- Parameters:
path (UPath)
- Return type:
Group
- cellmap_segmentation_challenge.utils.eval_utils.submission.get_evaluation_args(volumes, submission_path, truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance']) list[tuple][source]#
Get the arguments for scoring each label in the submission.
- Parameters:
volumes (list) – A list of volumes to score.
submission_path (str) – The path to the submission volume.
truth_path (str) – The path to the ground truth volume.
instance_classes (list) – A list of instance classes.
- Returns:
A list of tuples containing the arguments for each label to be scored.
- Return type:
list[tuple]
- cellmap_segmentation_challenge.utils.eval_utils.submission.missing_volume_score(truth_path, volume, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance']) list[tuple][source]#
Score a missing volume as 0’s, congruent with the score_volume function.
- Parameters:
truth_path (str) – The path to the ground truth volume.
volume (str) – The name of the volume.
instance_classes (list) – A list of instance classes.
- Returns:
A dictionary of scores for the volume.
- Return type:
dict
- Example usage:
scores = missing_volume_score('truth.zarr/test_volume')
- cellmap_segmentation_challenge.utils.eval_utils.submission.ensure_valid_submission(submission_path: UPath)[source]#
Ensure that the unzipped submission path is a valid Zarr-2 file.
- Parameters:
submission_path (str) – The path to the unzipped submission Zarr-2 file.
- Raises:
ValueError – If the submission is not a valid unzipped Zarr-2 file.
- cellmap_segmentation_challenge.utils.eval_utils.submission.score_submission(submission_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/submission.zip', result_file=None, truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], config: EvaluationConfig | None = None)[source]#
Score a submission against the ground truth data.
This is the main entry point for evaluating a submission. It unzips, validates, scores, and aggregates results for all volumes.
- Parameters:
submission_path – Path to the zipped submission Zarr-2 file
result_file – Path to save the scores (None to skip saving)
truth_path – Path to the ground truth Zarr-2 file
instance_classes – List of instance segmentation classes
config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)
- Returns:
Dictionary of aggregated scores across all volumes
- Raises:
ValueError – If submission format is invalid
RuntimeError – If volume renaming fails
Example
>>> scores = score_submission('submission.zip', 'results.json')
>>> print(f"Overall score: {scores['overall_score']:.4f}")
- Results structure:
{
    "cropN": {  # Per-volume scores
        "label_name": {
            # Instance segmentation
            "mean_accuracy": float,
            "hausdorff_distance": float,
            "combined_score": float,
            # OR semantic segmentation
            "iou": float,
            "dice_score": float,
        }
    },
    "label_scores": {  # Aggregated per-label
        "label_name": {...}
    },
    "overall_instance_score": float,
    "overall_semantic_score": float,
    "overall_score": float,
}
cellmap_segmentation_challenge.utils.eval_utils.types module#
Type definitions for evaluation metrics.
- class cellmap_segmentation_challenge.utils.eval_utils.types.InstanceScoreDict[source]#
Bases: TypedDict
Type definition for instance segmentation scores.
- mean_accuracy: float#
- binary_accuracy: float#
- hausdorff_distance: float#
- normalized_hausdorff_distance: float#
- combined_score: float#
- iou: float#
- dice_score: float#
- num_voxels: int#
- voxel_size: tuple[float, ...]#
- is_missing: bool#
- status: Literal['scored', 'skipped_too_many_instances', 'missing']#
- voi_split: float#
- voi_merge: float#
- class cellmap_segmentation_challenge.utils.eval_utils.types.SemanticScoreDict[source]#
Bases: TypedDict
Type definition for semantic segmentation scores.
- iou: float#
- dice_score: float#
- binary_accuracy: float#
- num_voxels: int#
- voxel_size: tuple[float, ...]#
- is_missing: bool#
- status: Literal['scored', 'missing']#
cellmap_segmentation_challenge.utils.eval_utils.zip_utils module#
Zip file handling utilities for submissions.
- cellmap_segmentation_challenge.utils.eval_utils.zip_utils.unzip_file(zip_path, max_uncompressed_size: int = 53687091200)[source]#
Unzip a zip file to a specified directory.
Validates against path traversal (zip slip), symlink attacks, and decompression bombs before extracting.
- Parameters:
zip_path (str) – The path to the zip file.
max_uncompressed_size (int) – Maximum total uncompressed size in bytes.
- Raises:
ValidationError – If any member fails security checks or total size exceeds limit.
- Example usage:
unzip_file('submission.zip')
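The three security checks named above can be sketched as follows. This is a simplified illustration, not the package's implementation: the function name `safe_unzip_sketch` and the explicit destination argument are assumptions, and the symlink-attack check is omitted here for brevity:

```python
import tempfile
import zipfile
from pathlib import Path

def safe_unzip_sketch(zip_path, dest, max_uncompressed_size=50 * 2**30):
    dest = Path(dest).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        # Decompression-bomb check: bound the total uncompressed size.
        total = sum(info.file_size for info in zf.infolist())
        if total > max_uncompressed_size:
            raise ValueError("uncompressed size exceeds limit")
        # Zip-slip check: every member must resolve inside dest.
        for info in zf.infolist():
            target = (dest / info.filename).resolve()
            if not target.is_relative_to(dest):
                raise ValueError(f"unsafe path in archive: {info.filename}")
        zf.extractall(dest)

# Demonstration with a throwaway archive.
tmp = Path(tempfile.mkdtemp())
with zipfile.ZipFile(tmp / "sub.zip", "w") as zf:
    zf.writestr("a.txt", "hello")
safe_unzip_sketch(tmp / "sub.zip", tmp / "out")
print((tmp / "out" / "a.txt").read_text())  # hello
```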
Module contents#
Evaluation utilities for cellmap segmentation challenge.
This module provides all evaluation-related functionality, including:
- Scoring for instance and semantic segmentation
- Instance matching using min-cost flow optimization
- Hausdorff distance and other metrics
- Submission processing and validation
- class cellmap_segmentation_challenge.utils.eval_utils.InstanceScoreDict[source]#
Bases: TypedDict
Type definition for instance segmentation scores.
- mean_accuracy: float#
- binary_accuracy: float#
- hausdorff_distance: float#
- normalized_hausdorff_distance: float#
- combined_score: float#
- iou: float#
- dice_score: float#
- num_voxels: int#
- voxel_size: tuple[float, ...]#
- is_missing: bool#
- status: Literal['scored', 'skipped_too_many_instances', 'missing']#
- voi_split: float#
- voi_merge: float#
- class cellmap_segmentation_challenge.utils.eval_utils.SemanticScoreDict[source]#
Bases: TypedDict
Type definition for semantic segmentation scores.
- iou: float#
- dice_score: float#
- binary_accuracy: float#
- num_voxels: int#
- voxel_size: tuple[float, ...]#
- is_missing: bool#
- status: Literal['scored', 'missing']#
- exception cellmap_segmentation_challenge.utils.eval_utils.EvaluationError[source]#
Bases: Exception
Base exception for evaluation errors.
- exception cellmap_segmentation_challenge.utils.eval_utils.TooManyInstancesError(n_pred: int, n_gt: int, ratio: float, cutoff: float)[source]#
Bases: EvaluationError
Raised when submission has too many instances relative to ground truth.
This is a pathological case where the ratio of predicted to ground truth instances exceeds acceptable thresholds, likely indicating poor segmentation.
- Parameters:
n_pred (int)
n_gt (int)
ratio (float)
cutoff (float)
- exception cellmap_segmentation_challenge.utils.eval_utils.TooManyOverlapEdgesError(n_edges: int, max_edges: int)[source]#
Bases: EvaluationError
Raised when instance matching produces too many overlap edges.
This indicates computational infeasibility for the matching algorithm.
- Parameters:
n_edges (int)
max_edges (int)
- exception cellmap_segmentation_challenge.utils.eval_utils.MatchingFailedError(status: int)[source]#
Bases: EvaluationError
Raised when instance matching optimization fails.
- Parameters:
status (int)
- exception cellmap_segmentation_challenge.utils.eval_utils.ValidationError[source]#
Bases: EvaluationError
Raised when input validation fails.
- class cellmap_segmentation_challenge.utils.eval_utils.EvaluationConfig(max_instance_threads: int = 3, max_semantic_threads: int = 25, per_instance_threads: int = 25, max_distance_cap_eps: float = 0.0001, final_instance_ratio_cutoff: float = 10.0, initial_instance_ratio_cutoff: float = 50.0, instance_ratio_factor: float = 5.0, max_overlap_edges: int = 5000000, mcmf_cost_scale: int = 1000000, truth_path: UPath = <factory>, instance_classes: list[str] = <factory>, cast_to_none: list[Any] = <factory>)[source]#
Bases: object
Configuration for evaluation pipeline.
All parameters can be set via environment variables or passed directly. Environment variables take precedence over defaults but not over explicitly passed values.
- Parameters:
max_instance_threads (int)
max_semantic_threads (int)
per_instance_threads (int)
max_distance_cap_eps (float)
final_instance_ratio_cutoff (float)
initial_instance_ratio_cutoff (float)
instance_ratio_factor (float)
max_overlap_edges (int)
mcmf_cost_scale (int)
truth_path (UPath)
instance_classes (list[str])
cast_to_none (list[Any])
- max_instance_threads: int = 3#
- max_semantic_threads: int = 25#
- per_instance_threads: int = 25#
- max_distance_cap_eps: float = 0.0001#
- final_instance_ratio_cutoff: float = 10.0#
- initial_instance_ratio_cutoff: float = 50.0#
- instance_ratio_factor: float = 5.0#
- max_overlap_edges: int = 5000000#
- mcmf_cost_scale: int = 1000000#
- truth_path: UPath#
- instance_classes: list[str]#
- cast_to_none: list[Any]#
- classmethod from_env() EvaluationConfig[source]#
Load configuration from environment variables with defaults.
- Returns:
EvaluationConfig with values from environment or defaults.
- Return type:
EvaluationConfig
- cellmap_segmentation_challenge.utils.eval_utils.ratio_cutoff(nG: int, R_base: float = 10.0, R_extra: float = 50.0, k: float = 5.0) float[source]#
Calculate the acceptable ratio cutoff for instance matching.
The ratio cutoff decreases exponentially as the number of ground truth instances increases, allowing for more tolerance with fewer instances.
- Parameters:
nG (int) – Number of ground truth instances
R_base (float) – Base ratio cutoff (minimum)
R_extra (float) – Extra ratio tolerance for small nG
k (float) – Exponential decay factor
- Returns:
Maximum acceptable ratio of predicted to ground truth instances
- Return type:
float
- class cellmap_segmentation_challenge.utils.eval_utils.InstanceOverlapData(nG: int, nP: int, rows: ndarray, cols: ndarray, iou_vals: ndarray)[source]#
Bases: object
Data structure for instance overlap computation.
- Parameters:
nG (int)
nP (int)
rows (ndarray)
cols (ndarray)
iou_vals (ndarray)
- nG: int#
- nP: int#
- rows: ndarray#
- cols: ndarray#
- iou_vals: ndarray#
- cellmap_segmentation_challenge.utils.eval_utils.match_instances(gt: ndarray, pred: ndarray, config: EvaluationConfig | None = None) dict[int, int][source]#
Match instances between GT and prediction based on IoU.
Uses min-cost flow optimization to find optimal 1:1 matching between predicted and ground truth instances based on IoU overlap.
- Parameters:
gt (ndarray) – Ground truth instance labels (0 = background)
pred (ndarray) – Predicted instance labels (0 = background)
config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)
- Returns:
Dictionary mapping predicted instance ID to ground truth instance ID. Returns {0: 0} if only background present. Returns {} if no matches found or one side has no instances.
- Raises:
ValidationError – If array shapes don’t match
TooManyInstancesError – If pred/GT ratio exceeds threshold
TooManyOverlapEdgesError – If overlap computation is too large
MatchingFailedError – If optimization fails
- Return type:
dict[int, int]
Example
>>> mapping = match_instances(gt, pred)
>>> # Remap predictions to match GT IDs
>>> pred_aligned = remap(pred, mapping, preserve_missing_labels=True)
- cellmap_segmentation_challenge.utils.eval_utils.compute_max_distance(voxel_size, shape) float[source]#
Compute the maximum distance used for distance-based metrics, based on the maximum distance to a volume boundary.
- Return type:
float
- cellmap_segmentation_challenge.utils.eval_utils.normalize_distance(distance: float, voxel_size) float[source]#
Normalize a distance value to [0, 1] using the maximum distance represented by a voxel.
- Parameters:
distance (float)
- Return type:
float
- cellmap_segmentation_challenge.utils.eval_utils.optimized_hausdorff_distances(truth_label, pred_label, voxel_size, hausdorff_distance_max, method='standard', percentile: float | None = None)[source]#
Compute per-truth-instance Hausdorff-like distances against the (already remapped) prediction using multithreading. Returns a 1D float32 numpy array whose i-th entry corresponds to truth_ids[i].
- Parameters:
truth_label (np.ndarray) – Ground-truth instance label volume (0 == background).
pred_label (np.ndarray) – Prediction instance label volume that has already been remapped to align with the GT ids (0 == background).
voxel_size (Sequence[float]) – Physical voxel sizes in Z, Y, X (or Y, X) order.
hausdorff_distance_max (float) – Cap for distances (use np.inf for uncapped).
method ({"standard", "modified", "percentile"}) –
"standard" -> classic Hausdorff (max of directed maxima)
"modified" -> mean of directed distances in each direction, then max of the two means
"percentile" -> the given percentile of directed distances (requires percentile to be provided)
percentile (float | None) – Percentile (0-100) used when method == "percentile".
- cellmap_segmentation_challenge.utils.eval_utils.bbox_for_label(stats: StatisticsDict | StatisticsSlicesDict, ndim: int, label_id: int)[source]#
Try to get bbox without allocating a full boolean mask using cc3d statistics. Falls back to mask-based bbox if cc3d doesn’t provide expected fields. Returns (mins, maxs) inclusive-exclusive in voxel indices, or None if missing.
- Parameters:
stats (StatisticsDict | StatisticsSlicesDict)
ndim (int)
label_id (int)
- cellmap_segmentation_challenge.utils.eval_utils.roi_slices_for_pair(truth_stats: StatisticsDict | StatisticsSlicesDict, pred_stats: StatisticsDict | StatisticsSlicesDict, tid: int, voxel_size, ndim: int, shape: tuple[int, ...], max_distance: float)[source]#
ROI = union(bbox(truth==tid), bbox(pred==tid)) padded by P derived from max_distance. Returns tuple of slices suitable for numpy indexing.
- Parameters:
truth_stats (StatisticsDict | StatisticsSlicesDict)
pred_stats (StatisticsDict | StatisticsSlicesDict)
tid (int)
ndim (int)
shape (tuple[int, ...])
max_distance (float)
- cellmap_segmentation_challenge.utils.eval_utils.compute_hausdorff_distance_roi(truth_label: ndarray, truth_stats: StatisticsDict | StatisticsSlicesDict, pred_label: ndarray, pred_stats: StatisticsDict | StatisticsSlicesDict, tid: int, voxel_size, max_distance: float, method: str = 'standard', percentile: float | None = None)[source]#
Same metric as compute_hausdorff_distance(), but operates on an ROI slice and builds masks only inside ROI.
- Parameters:
truth_label (ndarray)
truth_stats (StatisticsDict | StatisticsSlicesDict)
pred_label (ndarray)
pred_stats (StatisticsDict | StatisticsSlicesDict)
tid (int)
max_distance (float)
method (str)
percentile (float | None)
- cellmap_segmentation_challenge.utils.eval_utils.score_instance(pred_label, truth_label, voxel_size, hausdorff_distance_max=None, config: EvaluationConfig | None = None) InstanceScoreDict[source]#
Score instance segmentation against ground truth.
Computes pixel-wise accuracy, Hausdorff distance, and combined metrics after optimal instance matching.
- Parameters:
pred_label – Predicted instance labels (0 = background)
truth_label – Ground truth instance labels (0 = background)
voxel_size – Physical voxel size in (Z, Y, X) order
hausdorff_distance_max – Maximum Hausdorff distance cap (None = auto)
config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)
- Returns:
Dictionary containing all instance segmentation metrics
- Return type:
InstanceScoreDict
Example
>>> scores = score_instance(pred, truth, voxel_size=(4.0, 4.0, 4.0))
>>> print(f"Combined score: {scores['combined_score']:.3f}")
- cellmap_segmentation_challenge.utils.eval_utils.score_semantic(pred_label, truth_label) dict[str, float][source]#
Score a single semantic label volume against the ground truth semantic label volume.
- Parameters:
pred_label (np.ndarray) – The predicted semantic label volume.
truth_label (np.ndarray) – The ground truth semantic label volume.
- Returns:
A dictionary of scores for the semantic label volume.
- Return type:
dict
- Example usage:
scores = score_semantic(pred_label, truth_label)
- cellmap_segmentation_challenge.utils.eval_utils.score_label(pred_label_path, label_name, crop_name, truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'])[source]#
Score a single label volume against the ground truth label volume.
- Parameters:
pred_label_path (str) – The path to the predicted label volume.
truth_path (str) – The path to the ground truth label volume.
instance_classes (list) – A list of instance classes.
- Returns:
A dictionary of scores for the label volume.
- Return type:
dict
- Example usage:
scores = score_label('pred.zarr/test_volume/label1')
- cellmap_segmentation_challenge.utils.eval_utils.empty_label_score(label, crop_name, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr')[source]#
- cellmap_segmentation_challenge.utils.eval_utils.match_crop_space(path, class_label, voxel_size, shape, translation) ndarray[source]#
- Return type:
ndarray
- cellmap_segmentation_challenge.utils.eval_utils.combine_scores(scores, include_missing=True, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], cast_to_none=[nan, inf, -inf, inf, -inf])[source]#
Combine scores across volumes, normalizing by the number of voxels.
- Parameters:
scores (dict) – A dictionary of scores for each volume, as returned by score_volume.
include_missing (bool) – Whether to include missing volumes in the combined scores.
instance_classes (list) – A list of instance classes.
cast_to_none (list) – A list of values to cast to None in the combined scores.
- Returns:
A dictionary of combined scores across all volumes.
- Return type:
dict
- Example usage:
combined_scores = combine_scores(scores)
- cellmap_segmentation_challenge.utils.eval_utils.sanitize_scores(scores)[source]#
Sanitize scores by converting NaN values to None.
- Parameters:
scores (dict) – A dictionary of scores.
- Returns:
A sanitized dictionary of scores.
- Return type:
dict
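A minimal sketch of the NaN-to-None conversion described above, applied recursively so nested score dictionaries serialize cleanly to JSON. Treating infinities the same way is an assumption here (the docstring mentions only NaN):

```python
import math

def sanitize_sketch(obj):
    if isinstance(obj, dict):
        return {k: sanitize_sketch(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [sanitize_sketch(v) for v in obj]
    if isinstance(obj, float) and (math.isnan(obj) or math.isinf(obj)):
        return None  # JSON has no representation for NaN/inf
    return obj

print(sanitize_sketch({"iou": float("nan"), "dice_score": 0.9}))
# {'iou': None, 'dice_score': 0.9}
```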
- cellmap_segmentation_challenge.utils.eval_utils.update_scores(scores, results, result_file, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'])[source]#
- cellmap_segmentation_challenge.utils.eval_utils.ensure_zgroup(path: UPath) Group[source]#
Ensure that the given path can be opened as a zarr Group. If a .zgroup is not present, add it.
- Parameters:
path (UPath)
- Return type:
Group
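The `.zgroup` marker for a Zarr-2 group is a small JSON file containing `{"zarr_format": 2}`. A minimal sketch of the "add it if missing" step using only the standard library (the real helper additionally opens and returns a `zarr.Group`):

```python
import json
from pathlib import Path

def ensure_zgroup_sketch(path):
    """Write a Zarr-2 ``.zgroup`` marker into ``path`` if one is missing (sketch)."""
    path = Path(path)
    path.mkdir(parents=True, exist_ok=True)
    marker = path / ".zgroup"
    if not marker.exists():
        marker.write_text(json.dumps({"zarr_format": 2}))
    return marker
```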
- cellmap_segmentation_challenge.utils.eval_utils.ensure_valid_submission(submission_path: UPath)[source]#
Ensure that the unzipped submission path is a valid Zarr-2 file.
- Parameters:
submission_path (str) – The path to the unzipped submission Zarr-2 file.
- Raises:
ValueError – If the submission is not a valid unzipped Zarr-2 file.
- cellmap_segmentation_challenge.utils.eval_utils.get_evaluation_args(volumes, submission_path, truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance']) list[tuple][source]#
Get the arguments for scoring each label in the submission.
- Parameters:
volumes (list) – A list of volumes to score.
submission_path (str) – The path to the submission volume.
truth_path (str) – The path to the ground truth volume.
instance_classes (list) – A list of instance classes.
- Returns:
A list of tuples containing the arguments for each label to be scored.
- Return type:
list[tuple]
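The tuple-building step can be sketched as a loop over each volume's labels. This is an illustration only: `labels_per_volume` is a hypothetical parameter introduced here for self-containment, whereas the real function discovers the labels from the ground-truth store:

```python
def get_evaluation_args_sketch(volumes, labels_per_volume,
                               submission_path, truth_path, instance_classes):
    """Build one argument tuple per (volume, label) pair (sketch)."""
    args = []
    for volume in volumes:
        for label in labels_per_volume[volume]:
            args.append((
                f"{submission_path}/{volume}/{label}",  # predicted label path
                f"{truth_path}/{volume}/{label}",       # ground-truth label path
                instance_classes,
            ))
    return args
```

Each tuple can then be dispatched to a worker that scores one label independently.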
- cellmap_segmentation_challenge.utils.eval_utils.missing_volume_score(truth_path, volume, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance']) list[tuple][source]#
Score a missing volume as zeros, congruent with the score_volume function.
- Parameters:
truth_path (str) – The path to the ground truth volume.
volume (str) – The name of the volume.
instance_classes (list) – A list of instance classes.
- Returns:
A dictionary of scores for the volume.
- Return type:
dict
- Example usage:
scores = missing_volume_score('truth.zarr/test_volume')
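Zero-filling can be sketched as below. The metric names follow the results structure documented for `score_submission`; which labels a volume contains is taken here as an explicit argument for self-containment, whereas the real function reads them from the ground truth:

```python
def missing_volume_score_sketch(labels, instance_classes):
    """Zero-filled scores for a volume absent from a submission (sketch)."""
    scores = {}
    for label in labels:
        if label in instance_classes:
            scores[label] = {
                "mean_accuracy": 0.0,
                "hausdorff_distance": 0.0,
                "combined_score": 0.0,
            }
        else:
            scores[label] = {"iou": 0.0, "dice_score": 0.0}
    return scores
```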
- cellmap_segmentation_challenge.utils.eval_utils.score_submission(submission_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/submission.zip', result_file=None, truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], config: EvaluationConfig | None = None)[source]#
Score a submission against the ground truth data.
This is the main entry point for evaluating a submission. It unzips, validates, scores, and aggregates results for all volumes.
- Parameters:
submission_path – Path to the zipped submission Zarr-2 file
result_file – Path to save the scores (None to skip saving)
truth_path – Path to the ground truth Zarr-2 file
instance_classes – List of instance segmentation classes
config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)
- Returns:
Dictionary of aggregated scores across all volumes
- Raises:
ValueError – If submission format is invalid
RuntimeError – If volume renaming fails
Example
>>> scores = score_submission('submission.zip', 'results.json')
>>> print(f"Overall score: {scores['overall_score']:.4f}")
- Results structure:
{
    "cropN": {  # Per-volume scores
        "label_name": {
            # Instance segmentation
            "mean_accuracy": float,
            "hausdorff_distance": float,
            "combined_score": float,
            # OR semantic segmentation
            "iou": float,
            "dice_score": float,
        }
    },
    "label_scores": {  # Aggregated per-label
        "label_name": {...}
    },
    "overall_instance_score": float,
    "overall_semantic_score": float,
    "overall_score": float,
}
- cellmap_segmentation_challenge.utils.eval_utils.resize_array(arr, target_shape, pad_value=0)[source]#
Resize an array to a target shape by padding or cropping as needed.
- Parameters:
arr (np.ndarray) – Input array to resize.
target_shape (tuple) – Desired shape for the output array.
pad_value (int, float, etc.) – Value to use for padding if the array is smaller than the target shape.
- Returns:
Resized array with the specified target shape.
- Return type:
np.ndarray
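The documented pad-or-crop behavior can be sketched with NumPy as follows. Where the padding is placed (here at the end of each axis) is an assumption; the packaged version may distribute it differently:

```python
import numpy as np

def resize_array_sketch(arr, target_shape, pad_value=0):
    """Pad or crop ``arr`` to ``target_shape`` (illustrative sketch)."""
    # Crop any axis that is larger than the target.
    arr = arr[tuple(slice(0, t) for t in target_shape)]
    # Pad (at the end of each axis) any axis that is smaller than the target.
    pad = [(0, max(0, t - s)) for s, t in zip(arr.shape, target_shape)]
    return np.pad(arr, pad, mode="constant", constant_values=pad_value)
```

A (2, 3) array resized to (3, 2) is cropped along the second axis and padded with `pad_value` along the first.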
- cellmap_segmentation_challenge.utils.eval_utils.unzip_file(zip_path, max_uncompressed_size: int = 53687091200)[source]#
Unzip a zip file to a specified directory.
Validates against path traversal (zip slip), symlink attacks, and decompression bombs before extracting.
- Parameters:
zip_path (str) – The path to the zip file.
max_uncompressed_size (int) – Maximum total uncompressed size in bytes.
- Raises:
ValidationError – If any member fails security checks or total size exceeds limit.
- Example usage:
unzip_file('submission.zip')
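The pre-extraction security checks can be sketched with the standard-library `zipfile` module. This is a sketch of the documented checks, not the packaged code; in particular, detecting symlinks via the Unix mode bits in `external_attr` is a common approach and an assumption about this implementation:

```python
import zipfile
from pathlib import Path

def check_zip_members_sketch(zip_path, max_uncompressed_size=50 * 2**30):
    """Reject zip-slip paths, symlinks, and decompression bombs (sketch)."""
    total = 0
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            name = Path(info.filename)
            # Zip slip: absolute paths or ".." components escape the target dir.
            if name.is_absolute() or ".." in name.parts:
                raise ValueError(f"unsafe path: {info.filename}")
            # Symlink check: high 16 bits of external_attr hold the Unix mode.
            if (info.external_attr >> 16) & 0o170000 == 0o120000:
                raise ValueError(f"symlink not allowed: {info.filename}")
            # Decompression bomb: cap the declared total uncompressed size.
            total += info.file_size
            if total > max_uncompressed_size:
                raise ValueError("uncompressed size exceeds limit")
    return total
```

Only after every member passes these checks would extraction proceed.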