cellmap_segmentation_challenge.utils.eval_utils package#

Submodules#

cellmap_segmentation_challenge.utils.eval_utils.aggregation module#

Score aggregation and result processing utilities.

cellmap_segmentation_challenge.utils.eval_utils.aggregation.combine_scores(scores, include_missing=True, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], cast_to_none=[nan, inf, -inf, inf, -inf])[source]#

Combine scores across volumes, normalizing by the number of voxels.

Parameters:
  • scores (dict) – A dictionary of scores for each volume, as returned by score_volume.

  • include_missing (bool) – Whether to include missing volumes in the combined scores.

  • instance_classes (list) – A list of instance classes.

  • cast_to_none (list) – A list of values to cast to None in the combined scores.

Returns:

A dictionary of combined scores across all volumes.

Return type:

dict

Example usage:

combined_scores = combine_scores(scores)

cellmap_segmentation_challenge.utils.eval_utils.aggregation.num_evals_done(all_scores)[source]#
cellmap_segmentation_challenge.utils.eval_utils.aggregation.sanitize_scores(scores)[source]#

Sanitize scores by converting NaN values to None.

Parameters:

scores (dict) – A dictionary of scores.

Returns:

A sanitized dictionary of scores.

Return type:

dict
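As a minimal sketch of this behavior (an illustrative helper, not the package's implementation), assuming scores are arbitrarily nested dicts of floats:

```python
import math

def sanitize(obj):
    # Recursively replace NaN with None so results serialize cleanly to JSON.
    if isinstance(obj, dict):
        return {k: sanitize(v) for k, v in obj.items()}
    if isinstance(obj, float) and math.isnan(obj):
        return None
    return obj

clean = sanitize({"crop1": {"iou": float("nan"), "dice_score": 0.9}})
```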

cellmap_segmentation_challenge.utils.eval_utils.aggregation.update_scores(scores, results, result_file, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'])[source]#

cellmap_segmentation_challenge.utils.eval_utils.array_utils module#

Array manipulation utilities for evaluation.

cellmap_segmentation_challenge.utils.eval_utils.array_utils.resize_array(arr, target_shape, pad_value=0)[source]#

Resize an array to a target shape by padding or cropping as needed.

Parameters:
  • arr (np.ndarray) – Input array to resize.

  • target_shape (tuple) – Desired shape for the output array.

  • pad_value (int, float, etc.) – Value to use for padding if the array is smaller than the target shape.

Returns:

Resized array with the specified target shape.

Return type:

np.ndarray
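For intuition, the pad-or-crop behavior can be sketched with NumPy alone; this version crops and pads at the trailing edge, whereas the actual function may align the array differently:

```python
import numpy as np

def resize_to(arr, target_shape, pad_value=0):
    # Crop axes that are too large, then pad axes that are too small.
    cropped = arr[tuple(slice(0, t) for t in target_shape)]
    pad = [(0, t - s) for s, t in zip(cropped.shape, target_shape)]
    return np.pad(cropped, pad, constant_values=pad_value)

out = resize_to(np.ones((3, 5)), (4, 4), pad_value=0)
```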

cellmap_segmentation_challenge.utils.eval_utils.config module#

Configuration for evaluation pipeline.

class cellmap_segmentation_challenge.utils.eval_utils.config.EvaluationConfig(max_instance_threads: int = 3, max_semantic_threads: int = 25, per_instance_threads: int = 25, max_distance_cap_eps: float = 0.0001, final_instance_ratio_cutoff: float = 10.0, initial_instance_ratio_cutoff: float = 50.0, instance_ratio_factor: float = 5.0, max_overlap_edges: int = 5000000, mcmf_cost_scale: int = 1000000, truth_path: UPath = <factory>, instance_classes: list[str] = <factory>, cast_to_none: list[Any] = <factory>)[source]#

Bases: object

Configuration for evaluation pipeline.

All parameters can be set via environment variables or passed directly. Environment variables take precedence over defaults but not over explicitly passed values.

Parameters:
  • max_instance_threads (int)

  • max_semantic_threads (int)

  • per_instance_threads (int)

  • max_distance_cap_eps (float)

  • final_instance_ratio_cutoff (float)

  • initial_instance_ratio_cutoff (float)

  • instance_ratio_factor (float)

  • max_overlap_edges (int)

  • mcmf_cost_scale (int)

  • truth_path (UPath)

  • instance_classes (list[str])

  • cast_to_none (list[Any])

max_instance_threads: int = 3#
max_semantic_threads: int = 25#
per_instance_threads: int = 25#
max_distance_cap_eps: float = 0.0001#
final_instance_ratio_cutoff: float = 10.0#
initial_instance_ratio_cutoff: float = 50.0#
instance_ratio_factor: float = 5.0#
max_overlap_edges: int = 5000000#
mcmf_cost_scale: int = 1000000#
truth_path: UPath#
instance_classes: list[str]#
cast_to_none: list[Any]#
classmethod from_env() EvaluationConfig[source]#

Load configuration from environment variables with defaults.

Returns:

EvaluationConfig with values from environment or defaults.

Return type:

EvaluationConfig

validate() None[source]#

Validate configuration values.

Raises:

ValueError – If any configuration value is invalid.

Return type:

None
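The environment-loading pattern can be illustrated with a cut-down dataclass. This is a sketch only: the real `EvaluationConfig` lives in `eval_utils.config`, and the assumption that environment variables are named after the upper-cased field names is mine, not documented above.

```python
import os
from dataclasses import dataclass, fields

@dataclass
class MiniConfig:
    max_instance_threads: int = 3
    instance_ratio_factor: float = 5.0

    @classmethod
    def from_env(cls):
        # Environment variables override defaults; explicit constructor
        # arguments would override both, matching the precedence described above.
        kwargs = {}
        for f in fields(cls):
            raw = os.environ.get(f.name.upper())
            if raw is not None:
                kwargs[f.name] = f.type(raw)
        return cls(**kwargs)
```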

cellmap_segmentation_challenge.utils.eval_utils.config.ratio_cutoff(nG: int, R_base: float = 10.0, R_extra: float = 50.0, k: float = 5.0) float[source]#

Calculate the acceptable ratio cutoff for instance matching.

The ratio cutoff decreases exponentially as the number of ground truth instances increases, allowing for more tolerance with fewer instances.

Parameters:
  • nG (int) – Number of ground truth instances

  • R_base (float) – Base ratio cutoff (minimum)

  • R_extra (float) – Extra ratio tolerance for small nG

  • k (float) – Exponential decay factor

Returns:

Maximum acceptable ratio of predicted to ground truth instances

Return type:

float
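The parameter descriptions suggest a decay of the form below. This is a reconstruction from the documented behavior (near `R_base + R_extra` for small `nG`, decaying toward `R_base`), not necessarily the exact implementation:

```python
import math

def ratio_cutoff_sketch(nG, R_base=10.0, R_extra=50.0, k=5.0):
    # Maximum tolerance R_base + R_extra at nG == 0, decaying toward R_base.
    return R_base + R_extra * math.exp(-nG / k)
```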

cellmap_segmentation_challenge.utils.eval_utils.distance module#

Distance metrics including Hausdorff distance computation.

cellmap_segmentation_challenge.utils.eval_utils.distance.compute_max_distance(voxel_size, shape) float[source]#

Compute the maximum distance used for distance-based metrics, based on the maximum distance to a volume boundary.

Return type:

float

cellmap_segmentation_challenge.utils.eval_utils.distance.normalize_distance(distance: float, voxel_size) float[source]#

Normalize a distance value to [0, 1] using the maximum distance represented by a voxel.

Parameters:

distance (float)

Return type:

float

cellmap_segmentation_challenge.utils.eval_utils.distance.optimized_hausdorff_distances(truth_label, pred_label, voxel_size, hausdorff_distance_max, method='standard', percentile: float | None = None)[source]#

Compute per-truth-instance Hausdorff-like distances against the (already remapped) prediction using multithreading. Returns a 1D float32 numpy array whose i-th entry corresponds to truth_ids[i].

Parameters:
  • truth_label (np.ndarray) – Ground-truth instance label volume (0 == background).

  • pred_label (np.ndarray) – Prediction instance label volume that has already been remapped to align with the GT ids (0 == background).

  • voxel_size (Sequence[float]) – Physical voxel sizes in Z, Y, X (or Y, X) order.

  • hausdorff_distance_max (float) – Cap for distances (use np.inf for uncapped).

  • method ({"standard", "modified", "percentile"}) –

    "standard" -> classic Hausdorff (max of directed maxima);
    "modified" -> mean of directed distances in each direction, then max of the two means;
    "percentile" -> the given percentile of directed distances (requires percentile to be provided).

  • percentile (float | None) – Percentile (0-100) used when method=="percentile".
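The three reduction modes can be written down directly for two point sets. This sketch shows the metric definitions only, not the library's ROI-based, multithreaded implementation:

```python
import numpy as np

def directed(A, B):
    # For each point in A, the distance to its nearest point in B.
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1).min(axis=1)

def hausdorff(A, B, method="standard", percentile=None, cap=np.inf):
    dab, dba = directed(A, B), directed(B, A)
    if method == "standard":
        value = max(dab.max(), dba.max())
    elif method == "modified":
        value = max(dab.mean(), dba.mean())
    else:  # "percentile"
        value = max(np.percentile(dab, percentile), np.percentile(dba, percentile))
    return float(min(value, cap))
```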

cellmap_segmentation_challenge.utils.eval_utils.distance.bbox_for_label(stats: StatisticsDict | StatisticsSlicesDict, ndim: int, label_id: int)[source]#

Try to get bbox without allocating a full boolean mask using cc3d statistics. Falls back to mask-based bbox if cc3d doesn’t provide expected fields. Returns (mins, maxs) inclusive-exclusive in voxel indices, or None if missing.

Parameters:
  • stats (StatisticsDict | StatisticsSlicesDict)

  • ndim (int)

  • label_id (int)

cellmap_segmentation_challenge.utils.eval_utils.distance.roi_slices_for_pair(truth_stats: StatisticsDict | StatisticsSlicesDict, pred_stats: StatisticsDict | StatisticsSlicesDict, tid: int, voxel_size, ndim: int, shape: tuple[int, ...], max_distance: float)[source]#

ROI = union(bbox(truth==tid), bbox(pred==tid)) padded by P derived from max_distance. Returns tuple of slices suitable for numpy indexing.

Parameters:
  • truth_stats (StatisticsDict | StatisticsSlicesDict)

  • pred_stats (StatisticsDict | StatisticsSlicesDict)

  • tid (int)

  • ndim (int)

  • shape (tuple[int, ...])

  • max_distance (float)
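The padding-and-clamping step can be sketched as follows, assuming P is already given in voxels per axis (deriving P from max_distance and voxel_size is left aside here):

```python
def padded_roi(mins, maxs, pad, shape):
    # The union bbox is inclusive-exclusive; expand it by pad voxels per axis
    # and clamp to the volume bounds so the slices stay valid for numpy indexing.
    lo = [max(0, m - p) for m, p in zip(mins, pad)]
    hi = [min(s, M + p) for M, p, s in zip(maxs, pad, shape)]
    return tuple(slice(a, b) for a, b in zip(lo, hi))

roi = padded_roi((2, 3), (5, 6), (2, 2), (8, 8))
```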

cellmap_segmentation_challenge.utils.eval_utils.distance.compute_hausdorff_distance_roi(truth_label: ndarray, truth_stats: StatisticsDict | StatisticsSlicesDict, pred_label: ndarray, pred_stats: StatisticsDict | StatisticsSlicesDict, tid: int, voxel_size, max_distance: float, method: str = 'standard', percentile: float | None = None)[source]#

Same metric as compute_hausdorff_distance(), but operates on an ROI slice and builds masks only inside ROI.

Parameters:
  • truth_label (ndarray)

  • truth_stats (StatisticsDict | StatisticsSlicesDict)

  • pred_label (ndarray)

  • pred_stats (StatisticsDict | StatisticsSlicesDict)

  • tid (int)

  • max_distance (float)

  • method (str)

  • percentile (float | None)

cellmap_segmentation_challenge.utils.eval_utils.exceptions module#

Custom exceptions for evaluation pipeline.

exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.EvaluationError[source]#

Bases: Exception

Base exception for evaluation errors.

exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.TooManyInstancesError(n_pred: int, n_gt: int, ratio: float, cutoff: float)[source]#

Bases: EvaluationError

Raised when submission has too many instances relative to ground truth.

This is a pathological case where the ratio of predicted to ground truth instances exceeds acceptable thresholds, likely indicating poor segmentation.

Parameters:
  • n_pred (int)

  • n_gt (int)

  • ratio (float)

  • cutoff (float)

exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.TooManyOverlapEdgesError(n_edges: int, max_edges: int)[source]#

Bases: EvaluationError

Raised when instance matching produces too many overlap edges.

This indicates computational infeasibility for the matching algorithm.

Parameters:
  • n_edges (int)

  • max_edges (int)

exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.MatchingFailedError(status: int)[source]#

Bases: EvaluationError

Raised when instance matching optimization fails.

Parameters:

status (int)

exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.ValidationError[source]#

Bases: EvaluationError

Raised when input validation fails.

cellmap_segmentation_challenge.utils.eval_utils.instance_matching module#

Instance matching using min-cost flow optimization.

class cellmap_segmentation_challenge.utils.eval_utils.instance_matching.InstanceOverlapData(nG: int, nP: int, rows: ndarray, cols: ndarray, iou_vals: ndarray)[source]#

Bases: object

Data structure for instance overlap computation.

Parameters:
  • nG (int)

  • nP (int)

  • rows (ndarray)

  • cols (ndarray)

  • iou_vals (ndarray)

nG: int#
nP: int#
rows: ndarray#
cols: ndarray#
iou_vals: ndarray#
cellmap_segmentation_challenge.utils.eval_utils.instance_matching.match_instances(gt: ndarray, pred: ndarray, config: EvaluationConfig | None = None) dict[int, int][source]#

Match instances between GT and prediction based on IoU.

Uses min-cost flow optimization to find optimal 1:1 matching between predicted and ground truth instances based on IoU overlap.

Parameters:
  • gt (ndarray) – Ground truth instance labels (0 = background)

  • pred (ndarray) – Predicted instance labels (0 = background)

  • config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)

Returns:

Dictionary mapping predicted instance ID to ground truth instance ID. Returns {0: 0} if only background present. Returns {} if no matches found or one side has no instances.

Raises:
  • TooManyInstancesError – If the ratio of predicted to ground truth instances exceeds the cutoff.

  • TooManyOverlapEdgesError – If the overlap graph exceeds max_overlap_edges.

  • MatchingFailedError – If the min-cost flow optimization fails.
Return type:

dict[int, int]

Example

>>> mapping = match_instances(gt, pred)
>>> # Remap predictions to match GT IDs
>>> pred_aligned = remap(pred, mapping, preserve_missing_labels=True)

cellmap_segmentation_challenge.utils.eval_utils.scoring module#

Core scoring functions for segmentation evaluation.

cellmap_segmentation_challenge.utils.eval_utils.scoring.score_instance(pred_label, truth_label, voxel_size, hausdorff_distance_max=None, config: EvaluationConfig | None = None) InstanceScoreDict[source]#

Score instance segmentation against ground truth.

Computes pixel-wise accuracy, Hausdorff distance, and combined metrics after optimal instance matching.

Parameters:
  • pred_label – Predicted instance labels (0 = background)

  • truth_label – Ground truth instance labels (0 = background)

  • voxel_size – Physical voxel size in (Z, Y, X) order

  • hausdorff_distance_max – Maximum Hausdorff distance cap (None = auto)

  • config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)

Returns:

Dictionary containing all instance segmentation metrics

Return type:

InstanceScoreDict

Example

>>> scores = score_instance(pred, truth, voxel_size=(4.0, 4.0, 4.0))
>>> print(f"Combined score: {scores['combined_score']:.3f}")
cellmap_segmentation_challenge.utils.eval_utils.scoring.score_semantic(pred_label, truth_label) dict[str, float][source]#

Score a single semantic label volume against the ground truth semantic label volume.

Parameters:
  • pred_label (np.ndarray) – The predicted semantic label volume.

  • truth_label (np.ndarray) – The ground truth semantic label volume.

Returns:

A dictionary of scores for the semantic label volume.

Return type:

dict

Example usage:

scores = score_semantic(pred_label, truth_label)
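For reference, the standard binary IoU and Dice formulations on label volumes look like this (a sketch; the package may compute additional or differently defined metrics):

```python
import numpy as np

def semantic_metrics(pred, truth):
    # Treat any nonzero label as foreground.
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    total = pred.sum() + truth.sum()
    return {
        "iou": float(inter / union) if union else 1.0,
        "dice_score": float(2 * inter / total) if total else 1.0,
    }
```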

cellmap_segmentation_challenge.utils.eval_utils.scoring.score_label(pred_label_path, label_name, crop_name, truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'])[source]#

Score a single label volume against the ground truth label volume.

Parameters:
  • pred_label_path (str) – The path to the predicted label volume.

  • truth_path (str) – The path to the ground truth label volume.

  • instance_classes (list) – A list of instance classes.

Returns:

A dictionary of scores for the label volume.

Return type:

dict

Example usage:

scores = score_label('pred.zarr/test_volume/label1')

cellmap_segmentation_challenge.utils.eval_utils.scoring.empty_label_score(label, crop_name, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr')[source]#
cellmap_segmentation_challenge.utils.eval_utils.scoring.match_crop_space(path, class_label, voxel_size, shape, translation) ndarray[source]#
Return type:

ndarray

cellmap_segmentation_challenge.utils.eval_utils.submission module#

Submission processing and main evaluation entry point.

cellmap_segmentation_challenge.utils.eval_utils.submission.ensure_zgroup(path: UPath) Group[source]#

Ensure that the given path can be opened as a zarr Group. If a .zgroup is not present, add it.

Parameters:

path (UPath)

Return type:

Group
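The repair step amounts to writing the Zarr-2 group marker. A standard-library sketch of that step (the real function operates on a UPath and returns an opened zarr Group):

```python
import json
import os

def ensure_zgroup_marker(path):
    # A Zarr-2 group is marked by a .zgroup file holding the format version.
    marker = os.path.join(path, ".zgroup")
    if not os.path.exists(marker):
        with open(marker, "w") as f:
            json.dump({"zarr_format": 2}, f)
    return marker
```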

cellmap_segmentation_challenge.utils.eval_utils.submission.get_evaluation_args(volumes, submission_path, truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance']) list[tuple][source]#

Get the arguments for scoring each label in the submission.

Parameters:
  • volumes (list) – A list of volumes to score.

  • submission_path (str) – The path to the submission volume.

  • truth_path (str) – The path to the ground truth volume.

  • instance_classes (list) – A list of instance classes.

Returns:

A list of tuples containing the arguments for each label to be scored.

Return type:

list[tuple]

cellmap_segmentation_challenge.utils.eval_utils.submission.missing_volume_score(truth_path, volume, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance']) list[tuple][source]#

Score a missing volume as 0’s, congruent with the score_volume function.

Parameters:
  • truth_path (str) – The path to the ground truth volume.

  • volume (str) – The name of the volume.

  • instance_classes (list) – A list of instance classes.

Returns:

A dictionary of scores for the volume.

Return type:

dict

Example usage:

scores = missing_volume_score('truth.zarr/test_volume')

cellmap_segmentation_challenge.utils.eval_utils.submission.ensure_valid_submission(submission_path: UPath)[source]#

Ensure that the unzipped submission path is a valid Zarr-2 file.

Parameters:

submission_path (str) – The path to the unzipped submission Zarr-2 file.

Raises:

ValueError – If the submission is not a valid unzipped Zarr-2 file.

cellmap_segmentation_challenge.utils.eval_utils.submission.score_submission(submission_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/submission.zip', result_file=None, truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], config: EvaluationConfig | None = None)[source]#

Score a submission against the ground truth data.

This is the main entry point for evaluating a submission. It unzips, validates, scores, and aggregates results for all volumes.

Parameters:
  • submission_path – Path to the zipped submission Zarr-2 file

  • result_file – Path to save the scores (None to skip saving)

  • truth_path – Path to the ground truth Zarr-2 file

  • instance_classes – List of instance segmentation classes

  • config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)

Returns:

Dictionary of aggregated scores across all volumes

Raises:
  • ValueError – If submission format is invalid

  • RuntimeError – If volume renaming fails

Example

>>> scores = score_submission('submission.zip', 'results.json')
>>> print(f"Overall score: {scores['overall_score']:.4f}")
Results structure:

{
    "cropN": {                          # Per-volume scores
        "label_name": {
            # Instance segmentation
            "mean_accuracy": float,
            "hausdorff_distance": float,
            "combined_score": float,
            # OR semantic segmentation
            "iou": float,
            "dice_score": float,
        },
    },
    "label_scores": {                   # Aggregated per-label
        "label_name": {...},
    },
    "overall_instance_score": float,
    "overall_semantic_score": float,
    "overall_score": float,
}
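Reading the aggregated results back out (illustrative values; keys follow the structure documented above):

```python
results = {
    "crop1": {
        "mito": {"mean_accuracy": 0.91, "hausdorff_distance": 12.0, "combined_score": 0.88},
    },
    "label_scores": {"mito": {"combined_score": 0.88}},
    "overall_instance_score": 0.88,
    "overall_semantic_score": 0.80,
    "overall_score": 0.84,
}

# Per-volume entries sit alongside the aggregate keys, so filter those out.
aggregates = {"label_scores", "overall_instance_score",
              "overall_semantic_score", "overall_score"}
per_volume = {k: v for k, v in results.items() if k not in aggregates}
```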

cellmap_segmentation_challenge.utils.eval_utils.types module#

Type definitions for evaluation metrics.

class cellmap_segmentation_challenge.utils.eval_utils.types.InstanceScoreDict[source]#

Bases: TypedDict

Type definition for instance segmentation scores.

mean_accuracy: float#
binary_accuracy: float#
hausdorff_distance: float#
normalized_hausdorff_distance: float#
combined_score: float#
iou: float#
dice_score: float#
num_voxels: int#
voxel_size: tuple[float, ...]#
is_missing: bool#
status: Literal['scored', 'skipped_too_many_instances', 'missing']#
voi_split: float#
voi_merge: float#
class cellmap_segmentation_challenge.utils.eval_utils.types.SemanticScoreDict[source]#

Bases: TypedDict

Type definition for semantic segmentation scores.

iou: float#
dice_score: float#
binary_accuracy: float#
num_voxels: int#
voxel_size: tuple[float, ...]#
is_missing: bool#
status: Literal['scored', 'missing']#
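Both classes annotate plain dicts. Redeclaring SemanticScoreDict here so the snippet is self-contained (import it from the package in real code; the values below are illustrative):

```python
from typing import Literal, TypedDict

class SemanticScoreDict(TypedDict):
    iou: float
    dice_score: float
    binary_accuracy: float
    num_voxels: int
    voxel_size: tuple[float, ...]
    is_missing: bool
    status: Literal["scored", "missing"]

# A TypedDict is an ordinary dict at runtime; the class only informs type checkers.
score: SemanticScoreDict = {
    "iou": 0.72,
    "dice_score": 0.84,
    "binary_accuracy": 0.95,
    "num_voxels": 64 * 512 * 512,
    "voxel_size": (8.0, 8.0, 8.0),
    "is_missing": False,
    "status": "scored",
}
```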

cellmap_segmentation_challenge.utils.eval_utils.zip_utils module#

Zip file handling utilities for submissions.

cellmap_segmentation_challenge.utils.eval_utils.zip_utils.unzip_file(zip_path, max_uncompressed_size: int = 53687091200)[source]#

Unzip a zip file to a specified directory.

Validates against path traversal (zip slip), symlink attacks, and decompression bombs before extracting.

Parameters:
  • zip_path (str) – The path to the zip file.

  • max_uncompressed_size (int) – Maximum total uncompressed size in bytes.

Raises:

ValidationError – If any member fails security checks or total size exceeds limit.

Example usage:

unzip_file('submission.zip')

Module contents#

Evaluation utilities for cellmap segmentation challenge.

This module provides all evaluation-related functionality including:

  • Scoring for instance and semantic segmentation

  • Instance matching using min-cost flow optimization

  • Hausdorff distance and other metrics

  • Submission processing and validation

class cellmap_segmentation_challenge.utils.eval_utils.InstanceScoreDict[source]#

Bases: TypedDict

Type definition for instance segmentation scores.

mean_accuracy: float#
binary_accuracy: float#
hausdorff_distance: float#
normalized_hausdorff_distance: float#
combined_score: float#
iou: float#
dice_score: float#
num_voxels: int#
voxel_size: tuple[float, ...]#
is_missing: bool#
status: Literal['scored', 'skipped_too_many_instances', 'missing']#
voi_split: float#
voi_merge: float#
class cellmap_segmentation_challenge.utils.eval_utils.SemanticScoreDict[source]#

Bases: TypedDict

Type definition for semantic segmentation scores.

iou: float#
dice_score: float#
binary_accuracy: float#
num_voxels: int#
voxel_size: tuple[float, ...]#
is_missing: bool#
status: Literal['scored', 'missing']#
exception cellmap_segmentation_challenge.utils.eval_utils.EvaluationError[source]#

Bases: Exception

Base exception for evaluation errors.

exception cellmap_segmentation_challenge.utils.eval_utils.TooManyInstancesError(n_pred: int, n_gt: int, ratio: float, cutoff: float)[source]#

Bases: EvaluationError

Raised when submission has too many instances relative to ground truth.

This is a pathological case where the ratio of predicted to ground truth instances exceeds acceptable thresholds, likely indicating poor segmentation.

Parameters:
  • n_pred (int)

  • n_gt (int)

  • ratio (float)

  • cutoff (float)

exception cellmap_segmentation_challenge.utils.eval_utils.TooManyOverlapEdgesError(n_edges: int, max_edges: int)[source]#

Bases: EvaluationError

Raised when instance matching produces too many overlap edges.

This indicates computational infeasibility for the matching algorithm.

Parameters:
  • n_edges (int)

  • max_edges (int)

exception cellmap_segmentation_challenge.utils.eval_utils.MatchingFailedError(status: int)[source]#

Bases: EvaluationError

Raised when instance matching optimization fails.

Parameters:

status (int)

exception cellmap_segmentation_challenge.utils.eval_utils.ValidationError[source]#

Bases: EvaluationError

Raised when input validation fails.

class cellmap_segmentation_challenge.utils.eval_utils.EvaluationConfig(max_instance_threads: int = 3, max_semantic_threads: int = 25, per_instance_threads: int = 25, max_distance_cap_eps: float = 0.0001, final_instance_ratio_cutoff: float = 10.0, initial_instance_ratio_cutoff: float = 50.0, instance_ratio_factor: float = 5.0, max_overlap_edges: int = 5000000, mcmf_cost_scale: int = 1000000, truth_path: UPath = <factory>, instance_classes: list[str] = <factory>, cast_to_none: list[Any] = <factory>)[source]#

Bases: object

Configuration for evaluation pipeline.

All parameters can be set via environment variables or passed directly. Environment variables take precedence over defaults but not over explicitly passed values.

Parameters:
  • max_instance_threads (int)

  • max_semantic_threads (int)

  • per_instance_threads (int)

  • max_distance_cap_eps (float)

  • final_instance_ratio_cutoff (float)

  • initial_instance_ratio_cutoff (float)

  • instance_ratio_factor (float)

  • max_overlap_edges (int)

  • mcmf_cost_scale (int)

  • truth_path (UPath)

  • instance_classes (list[str])

  • cast_to_none (list[Any])

max_instance_threads: int = 3#
max_semantic_threads: int = 25#
per_instance_threads: int = 25#
max_distance_cap_eps: float = 0.0001#
final_instance_ratio_cutoff: float = 10.0#
initial_instance_ratio_cutoff: float = 50.0#
instance_ratio_factor: float = 5.0#
max_overlap_edges: int = 5000000#
mcmf_cost_scale: int = 1000000#
truth_path: UPath#
instance_classes: list[str]#
cast_to_none: list[Any]#
classmethod from_env() EvaluationConfig[source]#

Load configuration from environment variables with defaults.

Returns:

EvaluationConfig with values from environment or defaults.

Return type:

EvaluationConfig

validate() None[source]#

Validate configuration values.

Raises:

ValueError – If any configuration value is invalid.

Return type:

None

cellmap_segmentation_challenge.utils.eval_utils.ratio_cutoff(nG: int, R_base: float = 10.0, R_extra: float = 50.0, k: float = 5.0) float[source]#

Calculate the acceptable ratio cutoff for instance matching.

The ratio cutoff decreases exponentially as the number of ground truth instances increases, allowing for more tolerance with fewer instances.

Parameters:
  • nG (int) – Number of ground truth instances

  • R_base (float) – Base ratio cutoff (minimum)

  • R_extra (float) – Extra ratio tolerance for small nG

  • k (float) – Exponential decay factor

Returns:

Maximum acceptable ratio of predicted to ground truth instances

Return type:

float

class cellmap_segmentation_challenge.utils.eval_utils.InstanceOverlapData(nG: int, nP: int, rows: ndarray, cols: ndarray, iou_vals: ndarray)[source]#

Bases: object

Data structure for instance overlap computation.

Parameters:
  • nG (int)

  • nP (int)

  • rows (ndarray)

  • cols (ndarray)

  • iou_vals (ndarray)

nG: int#
nP: int#
rows: ndarray#
cols: ndarray#
iou_vals: ndarray#
cellmap_segmentation_challenge.utils.eval_utils.match_instances(gt: ndarray, pred: ndarray, config: EvaluationConfig | None = None) dict[int, int][source]#

Match instances between GT and prediction based on IoU.

Uses min-cost flow optimization to find optimal 1:1 matching between predicted and ground truth instances based on IoU overlap.

Parameters:
  • gt (ndarray) – Ground truth instance labels (0 = background)

  • pred (ndarray) – Predicted instance labels (0 = background)

  • config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)

Returns:

Dictionary mapping predicted instance ID to ground truth instance ID. Returns {0: 0} if only background present. Returns {} if no matches found or one side has no instances.

Raises:
  • TooManyInstancesError – If the ratio of predicted to ground truth instances exceeds the cutoff.

  • TooManyOverlapEdgesError – If the overlap graph exceeds max_overlap_edges.

  • MatchingFailedError – If the min-cost flow optimization fails.
Return type:

dict[int, int]

Example

>>> mapping = match_instances(gt, pred)
>>> # Remap predictions to match GT IDs
>>> pred_aligned = remap(pred, mapping, preserve_missing_labels=True)
cellmap_segmentation_challenge.utils.eval_utils.compute_max_distance(voxel_size, shape) float[source]#

Compute the maximum distance used for distance-based metrics, based on the maximum distance to a volume boundary.

Return type:

float

cellmap_segmentation_challenge.utils.eval_utils.normalize_distance(distance: float, voxel_size) float[source]#

Normalize a distance value to [0, 1] using the maximum distance represented by a voxel.

Parameters:

distance (float)

Return type:

float

cellmap_segmentation_challenge.utils.eval_utils.optimized_hausdorff_distances(truth_label, pred_label, voxel_size, hausdorff_distance_max, method='standard', percentile: float | None = None)[source]#

Compute per-truth-instance Hausdorff-like distances against the (already remapped) prediction using multithreading. Returns a 1D float32 numpy array whose i-th entry corresponds to truth_ids[i].

Parameters:
  • truth_label (np.ndarray) – Ground-truth instance label volume (0 == background).

  • pred_label (np.ndarray) – Prediction instance label volume that has already been remapped to align with the GT ids (0 == background).

  • voxel_size (Sequence[float]) – Physical voxel sizes in Z, Y, X (or Y, X) order.

  • hausdorff_distance_max (float) – Cap for distances (use np.inf for uncapped).

  • method ({"standard", "modified", "percentile"}) –

    "standard" -> classic Hausdorff (max of directed maxima);
    "modified" -> mean of directed distances in each direction, then max of the two means;
    "percentile" -> the given percentile of directed distances (requires percentile to be provided).

  • percentile (float | None) – Percentile (0-100) used when method=="percentile".

cellmap_segmentation_challenge.utils.eval_utils.bbox_for_label(stats: StatisticsDict | StatisticsSlicesDict, ndim: int, label_id: int)[source]#

Try to get bbox without allocating a full boolean mask using cc3d statistics. Falls back to mask-based bbox if cc3d doesn’t provide expected fields. Returns (mins, maxs) inclusive-exclusive in voxel indices, or None if missing.

Parameters:
  • stats (StatisticsDict | StatisticsSlicesDict)

  • ndim (int)

  • label_id (int)

cellmap_segmentation_challenge.utils.eval_utils.roi_slices_for_pair(truth_stats: StatisticsDict | StatisticsSlicesDict, pred_stats: StatisticsDict | StatisticsSlicesDict, tid: int, voxel_size, ndim: int, shape: tuple[int, ...], max_distance: float)[source]#

ROI = union(bbox(truth==tid), bbox(pred==tid)) padded by P derived from max_distance. Returns tuple of slices suitable for numpy indexing.

Parameters:
  • truth_stats (StatisticsDict | StatisticsSlicesDict)

  • pred_stats (StatisticsDict | StatisticsSlicesDict)

  • tid (int)

  • ndim (int)

  • shape (tuple[int, ...])

  • max_distance (float)

cellmap_segmentation_challenge.utils.eval_utils.compute_hausdorff_distance_roi(truth_label: ndarray, truth_stats: StatisticsDict | StatisticsSlicesDict, pred_label: ndarray, pred_stats: StatisticsDict | StatisticsSlicesDict, tid: int, voxel_size, max_distance: float, method: str = 'standard', percentile: float | None = None)[source]#

Same metric as compute_hausdorff_distance(), but operates on an ROI slice and builds masks only inside ROI.

Parameters:
  • truth_label (ndarray)

  • truth_stats (StatisticsDict | StatisticsSlicesDict)

  • pred_label (ndarray)

  • pred_stats (StatisticsDict | StatisticsSlicesDict)

  • tid (int)

  • max_distance (float)

  • method (str)

  • percentile (float | None)

cellmap_segmentation_challenge.utils.eval_utils.score_instance(pred_label, truth_label, voxel_size, hausdorff_distance_max=None, config: EvaluationConfig | None = None) InstanceScoreDict[source]#

Score instance segmentation against ground truth.

Computes pixel-wise accuracy, Hausdorff distance, and combined metrics after optimal instance matching.

Parameters:
  • pred_label – Predicted instance labels (0 = background)

  • truth_label – Ground truth instance labels (0 = background)

  • voxel_size – Physical voxel size in (Z, Y, X) order

  • hausdorff_distance_max – Maximum Hausdorff distance cap (None = auto)

  • config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)

Returns:

Dictionary containing all instance segmentation metrics

Return type:

InstanceScoreDict

Example

>>> scores = score_instance(pred, truth, voxel_size=(4.0, 4.0, 4.0))
>>> print(f"Combined score: {scores['combined_score']:.3f}")
cellmap_segmentation_challenge.utils.eval_utils.score_semantic(pred_label, truth_label) dict[str, float][source]#

Score a single semantic label volume against the ground truth semantic label volume.

Parameters:
  • pred_label (np.ndarray) – The predicted semantic label volume.

  • truth_label (np.ndarray) – The ground truth semantic label volume.

Returns:

A dictionary of scores for the semantic label volume.

Return type:

dict

Example usage:

scores = score_semantic(pred_label, truth_label)

cellmap_segmentation_challenge.utils.eval_utils.score_label(pred_label_path, label_name, crop_name, truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'])[source]#

Score a single label volume against the ground truth label volume.

Parameters:
  • pred_label_path (str) – The path to the predicted label volume.

  • truth_path (str) – The path to the ground truth label volume.

  • instance_classes (list) – A list of instance classes.

Returns:

A dictionary of scores for the label volume.

Return type:

dict

Example usage:

scores = score_label('pred.zarr/test_volume/label1')

cellmap_segmentation_challenge.utils.eval_utils.empty_label_score(label, crop_name, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr')[source]#
cellmap_segmentation_challenge.utils.eval_utils.match_crop_space(path, class_label, voxel_size, shape, translation) ndarray[source]#
Return type:

ndarray

cellmap_segmentation_challenge.utils.eval_utils.combine_scores(scores, include_missing=True, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], cast_to_none=[nan, inf, -inf, inf, -inf])[source]#

Combine scores across volumes, normalizing by the number of voxels.

Parameters:
  • scores (dict) – A dictionary of scores for each volume, as returned by score_volume.

  • include_missing (bool) – Whether to include missing volumes in the combined scores.

  • instance_classes (list) – A list of instance classes.

  • cast_to_none (list) – A list of values to cast to None in the combined scores.

Returns:

A dictionary of combined scores across all volumes.

Return type:

dict

Example usage:

combined_scores = combine_scores(scores)
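NaN-to-None sanitization of a nested scores dict can be sketched recursively; the real `sanitize_scores` may also handle infinities or NumPy scalar types, so treat this as an illustration only:

```python
import math

def sanitize_sketch(obj):
    """Recursively replace NaN floats with None in nested dicts/lists."""
    if isinstance(obj, dict):
        return {k: sanitize_sketch(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [sanitize_sketch(v) for v in obj]
    if isinstance(obj, float) and math.isnan(obj):
        return None
    return obj

print(sanitize_sketch({"iou": float("nan"), "dice": 0.8}))
# {'iou': None, 'dice': 0.8}
```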

cellmap_segmentation_challenge.utils.eval_utils.sanitize_scores(scores)[source]#

Sanitize scores by converting NaN values to None.

Parameters:

scores (dict) – A dictionary of scores.

Returns:

A sanitized dictionary of scores.

Return type:

dict
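Writing the Zarr-2 group marker that `ensure_zgroup` guarantees can be sketched with the standard library. A `.zgroup` file containing `{"zarr_format": 2}` is the Zarr-2 group metadata; the real function additionally opens and returns the zarr `Group`, which this sketch omits:

```python
import json
import tempfile
from pathlib import Path

def ensure_zgroup_sketch(path):
    """If no .zgroup marker exists, write minimal Zarr-2 group metadata."""
    path = Path(path)
    path.mkdir(parents=True, exist_ok=True)
    marker = path / ".zgroup"
    if not marker.exists():
        marker.write_text(json.dumps({"zarr_format": 2}))
    return marker

root = Path(tempfile.mkdtemp()) / "submission.zarr"
marker = ensure_zgroup_sketch(root)
print(marker.read_text())  # {"zarr_format": 2}
```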

cellmap_segmentation_challenge.utils.eval_utils.update_scores(scores, results, result_file, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'])[source]#
cellmap_segmentation_challenge.utils.eval_utils.num_evals_done(all_scores)[source]#
cellmap_segmentation_challenge.utils.eval_utils.ensure_zgroup(path: UPath) Group[source]#

Ensure that the given path can be opened as a zarr Group. If a .zgroup is not present, add it.

Parameters:

path (UPath)

Return type:

Group
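A heuristic version of this validity check can be sketched as follows. The actual `ensure_valid_submission` may verify more than the presence of metadata files; the function name and the specific checks here are assumptions for illustration:

```python
import tempfile
from pathlib import Path

def looks_like_zarr2_sketch(submission_path):
    """Heuristic: an unzipped Zarr-2 store should contain .zgroup or
    .zarray metadata files somewhere in its directory tree."""
    root = Path(submission_path)
    if not root.is_dir():
        raise ValueError(f"{root} is not a directory")
    markers = list(root.rglob(".zgroup")) + list(root.rglob(".zarray"))
    if not markers:
        raise ValueError(f"{root} contains no Zarr-2 metadata files")
    return True

root = Path(tempfile.mkdtemp())
(root / ".zgroup").write_text('{"zarr_format": 2}')
print(looks_like_zarr2_sketch(root))  # True
```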

cellmap_segmentation_challenge.utils.eval_utils.ensure_valid_submission(submission_path: UPath)[source]#

Ensure that the unzipped submission path is a valid Zarr-2 file.

Parameters:

submission_path (str) – The path to the unzipped submission Zarr-2 file.

Raises:

ValueError – If the submission is not a valid unzipped Zarr-2 file.

cellmap_segmentation_challenge.utils.eval_utils.get_evaluation_args(volumes, submission_path, truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance']) list[tuple][source]#

Get the arguments for scoring each label in the submission.

Parameters:
  • volumes (list) – A list of volumes to score.

  • submission_path (str) – The path to the submission volume.

  • truth_path (str) – The path to the ground truth volume.

  • instance_classes (list) – A list of instance classes.

Returns:

A list of tuples containing the arguments for each label to be scored.

Return type:

list[tuple]

cellmap_segmentation_challenge.utils.eval_utils.missing_volume_score(truth_path, volume, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance']) list[tuple][source]#

Score a missing volume as zeros, consistent with the score_volume function.

Parameters:
  • truth_path (str) – The path to the ground truth volume.

  • volume (str) – The name of the volume.

  • instance_classes (list) – A list of instance classes.

Returns:

A dictionary of scores for the volume.

Return type:

dict

Example usage:

scores = missing_volume_score('truth.zarr/test_volume')

cellmap_segmentation_challenge.utils.eval_utils.score_submission(submission_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/submission.zip', result_file=None, truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], config: EvaluationConfig | None = None)[source]#

Score a submission against the ground truth data.

This is the main entry point for evaluating a submission. It unzips, validates, scores, and aggregates results for all volumes.

Parameters:
  • submission_path – Path to the zipped submission Zarr-2 file

  • result_file – Path to save the scores (None to skip saving)

  • truth_path – Path to the ground truth Zarr-2 file

  • instance_classes – List of instance segmentation classes

  • config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)

Returns:

Dictionary of aggregated scores across all volumes

Raises:
  • ValueError – If submission format is invalid

  • RuntimeError – If volume renaming fails

Example

>>> scores = score_submission('submission.zip', 'results.json')
>>> print(f"Overall score: {scores['overall_score']:.4f}")
Results structure:

{
    "cropN": {                      # Per-volume scores
        "label_name": {
            # Instance segmentation
            "mean_accuracy": float,
            "hausdorff_distance": float,
            "combined_score": float,
            # OR semantic segmentation
            "iou": float,
            "dice_score": float,
        }
    },
    "label_scores": {               # Aggregated per-label
        "label_name": {...}
    },
    "overall_instance_score": float,
    "overall_semantic_score": float,
    "overall_score": float,
}

cellmap_segmentation_challenge.utils.eval_utils.resize_array(arr, target_shape, pad_value=0)[source]#

Resize an array to a target shape by padding or cropping as needed.

Parameters:
  • arr (np.ndarray) – Input array to resize.

  • target_shape (tuple) – Desired shape for the output array.

  • pad_value (int, float, etc.) – Value to use for padding if the array is smaller than the target shape.

Returns:

Resized array with the specified target shape.

Return type:

np.ndarray
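Pad-or-crop resizing of this kind can be sketched with NumPy. This is an illustration of the general technique; the actual `resize_array` may differ in details such as where padding is placed:

```python
import numpy as np

def resize_array_sketch(arr, target_shape, pad_value=0):
    """Crop each axis down to the target, then pad any axis still short."""
    # Crop first: slice every axis to at most the target size.
    slices = tuple(slice(0, min(s, t)) for s, t in zip(arr.shape, target_shape))
    cropped = arr[slices]
    # Pad at the end of each axis that is still too small.
    pad = [(0, t - s) for s, t in zip(cropped.shape, target_shape)]
    return np.pad(cropped, pad, mode="constant", constant_values=pad_value)

a = np.ones((2, 5), dtype=int)
out = resize_array_sketch(a, (3, 4))
print(out.shape)  # (3, 4)
```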

cellmap_segmentation_challenge.utils.eval_utils.unzip_file(zip_path, max_uncompressed_size: int = 53687091200)[source]#

Unzip a zip file to a specified directory.

Validates against path traversal (zip slip), symlink attacks, and decompression bombs before extracting.

Parameters:
  • zip_path (str) – The path to the zip file.

  • max_uncompressed_size (int) – Maximum total uncompressed size in bytes.

Raises:

ValidationError – If any member fails security checks or total size exceeds limit.

Example usage:

unzip_file('submission.zip')
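The zip-slip and decompression-bomb checks described above can be sketched with the standard library `zipfile` module. This sketch omits the symlink check the docstring mentions, and the destination-directory argument is an assumption (the real `unzip_file` derives its output location internally):

```python
import zipfile
from pathlib import Path

def safe_unzip_sketch(zip_path, dest, max_uncompressed_size=50 * 2**30):
    """Extract a zip after basic zip-slip and total-size checks."""
    dest = Path(dest).resolve()
    with zipfile.ZipFile(zip_path) as zf:
        # Decompression-bomb guard: sum declared uncompressed sizes.
        total = sum(info.file_size for info in zf.infolist())
        if total > max_uncompressed_size:
            raise ValueError(f"uncompressed size {total} exceeds limit")
        # Zip-slip guard: every member must resolve inside dest.
        for info in zf.infolist():
            target = (dest / info.filename).resolve()
            if not target.is_relative_to(dest):
                raise ValueError(f"unsafe path in archive: {info.filename}")
        zf.extractall(dest)
```

`Path.is_relative_to` (Python 3.9+) avoids the classic pitfall of a bare string-prefix check, which would wrongly accept a sibling directory like `dest-evil`.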