cellmap_segmentation_challenge.utils.eval_utils package#

Submodules#

cellmap_segmentation_challenge.utils.eval_utils.aggregation module#

Score aggregation and result processing utilities.

cellmap_segmentation_challenge.utils.eval_utils.aggregation.combine_scores(scores, include_missing=True, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], cast_to_none=[nan, inf, -inf, inf, -inf])[source]#

Combine scores across volumes, normalizing by the number of voxels.

Parameters:
  • scores (dict) – A dictionary of scores for each volume, as returned by score_volume.

  • include_missing (bool) – Whether to include missing volumes in the combined scores.

  • instance_classes (list) – A list of instance classes.

  • cast_to_none (list) – A list of values to cast to None in the combined scores.

Returns:

A dictionary of combined scores across all volumes.

Return type:

dict

Example usage:

combined_scores = combine_scores(scores)

cellmap_segmentation_challenge.utils.eval_utils.aggregation.num_evals_done(all_scores)[source]#
cellmap_segmentation_challenge.utils.eval_utils.aggregation.sanitize_scores(scores)[source]#

Sanitize scores by converting NaN values to None.

Parameters:

scores (dict) – A dictionary of scores.

Returns:

A sanitized dictionary of scores.

Return type:

dict
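As a minimal sketch of this behavior (an illustrative helper, not the package's implementation), assuming scores are arbitrarily nested dicts of floats:

```python
import math

def sanitize(obj):
    # Recursively replace NaN with None so results serialize cleanly to JSON.
    if isinstance(obj, dict):
        return {k: sanitize(v) for k, v in obj.items()}
    if isinstance(obj, float) and math.isnan(obj):
        return None
    return obj

clean = sanitize({"crop1": {"iou": float("nan"), "dice_score": 0.9}})
```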

cellmap_segmentation_challenge.utils.eval_utils.aggregation.update_scores(scores, results, result_file, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'])[source]#

cellmap_segmentation_challenge.utils.eval_utils.array_utils module#

Array manipulation utilities for evaluation.

cellmap_segmentation_challenge.utils.eval_utils.array_utils.resize_array(arr, target_shape, pad_value=0)[source]#

Resize an array to a target shape by padding or cropping as needed.

Parameters:
  • arr (np.ndarray) – Input array to resize.

  • target_shape (tuple) – Desired shape for the output array.

  • pad_value (int, float, etc.) – Value to use for padding if the array is smaller than the target shape.

Returns:

Resized array with the specified target shape.

Return type:

np.ndarray
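For intuition, the pad-or-crop behavior can be sketched with NumPy alone; this version crops and pads at the trailing edge, whereas the actual function may align the array differently:

```python
import numpy as np

def resize_to(arr, target_shape, pad_value=0):
    # Crop axes that are too large, then pad axes that are too small.
    cropped = arr[tuple(slice(0, t) for t in target_shape)]
    pad = [(0, t - s) for s, t in zip(cropped.shape, target_shape)]
    return np.pad(cropped, pad, constant_values=pad_value)

out = resize_to(np.ones((3, 5)), (4, 4), pad_value=0)
```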

cellmap_segmentation_challenge.utils.eval_utils.config module#

Configuration for evaluation pipeline.

class cellmap_segmentation_challenge.utils.eval_utils.config.EvaluationConfig(max_instance_threads: int = 3, max_semantic_threads: int = 25, per_instance_threads: int = 25, max_distance_cap_eps: float = 0.0001, final_instance_ratio_cutoff: float = 10.0, initial_instance_ratio_cutoff: float = 50.0, instance_ratio_factor: float = 5.0, max_overlap_edges: int = 5000000, mcmf_cost_scale: int = 1000000, truth_path: UPath = <factory>, instance_classes: list[str] = <factory>, cast_to_none: list[Any] = <factory>)[source]#

Bases: object

Configuration for evaluation pipeline.

All parameters can be set via environment variables or passed directly. Environment variables take precedence over defaults but not over explicitly passed values.

Parameters:
  • max_instance_threads (int)

  • max_semantic_threads (int)

  • per_instance_threads (int)

  • max_distance_cap_eps (float)

  • final_instance_ratio_cutoff (float)

  • initial_instance_ratio_cutoff (float)

  • instance_ratio_factor (float)

  • max_overlap_edges (int)

  • mcmf_cost_scale (int)

  • truth_path (UPath)

  • instance_classes (list[str])

  • cast_to_none (list[Any])

max_instance_threads: int = 3#
max_semantic_threads: int = 25#
per_instance_threads: int = 25#
max_distance_cap_eps: float = 0.0001#
final_instance_ratio_cutoff: float = 10.0#
initial_instance_ratio_cutoff: float = 50.0#
instance_ratio_factor: float = 5.0#
max_overlap_edges: int = 5000000#
mcmf_cost_scale: int = 1000000#
truth_path: UPath#
instance_classes: list[str]#
cast_to_none: list[Any]#
classmethod from_env() EvaluationConfig[source]#

Load configuration from environment variables with defaults.

Returns:

EvaluationConfig with values from environment or defaults.

Return type:

EvaluationConfig

validate() None[source]#

Validate configuration values.

Raises:

ValueError – If any configuration value is invalid.

Return type:

None
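The environment-loading pattern can be illustrated with a cut-down dataclass. This is a sketch only: the real `EvaluationConfig` lives in `eval_utils.config`, and the assumption that environment variables are named after the upper-cased field names is mine, not documented above.

```python
import os
from dataclasses import dataclass, fields

@dataclass
class MiniConfig:
    max_instance_threads: int = 3
    instance_ratio_factor: float = 5.0

    @classmethod
    def from_env(cls):
        # Environment variables override defaults; explicit constructor
        # arguments would override both, matching the precedence described above.
        kwargs = {}
        for f in fields(cls):
            raw = os.environ.get(f.name.upper())
            if raw is not None:
                kwargs[f.name] = f.type(raw)
        return cls(**kwargs)
```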

cellmap_segmentation_challenge.utils.eval_utils.config.ratio_cutoff(nG: int, R_base: float = 10.0, R_extra: float = 50.0, k: float = 5.0) float[source]#

Calculate the acceptable ratio cutoff for instance matching.

The ratio cutoff decreases exponentially as the number of ground truth instances increases, allowing for more tolerance with fewer instances.

Parameters:
  • nG (int) – Number of ground truth instances

  • R_base (float) – Base ratio cutoff (minimum)

  • R_extra (float) – Extra ratio tolerance for small nG

  • k (float) – Exponential decay factor

Returns:

Maximum acceptable ratio of predicted to ground truth instances

Return type:

float
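The parameter descriptions suggest a decay of the form below. This is a reconstruction from the documented behavior (near `R_base + R_extra` for small `nG`, decaying toward `R_base`), not necessarily the exact implementation:

```python
import math

def ratio_cutoff_sketch(nG, R_base=10.0, R_extra=50.0, k=5.0):
    # Maximum tolerance R_base + R_extra at nG == 0, decaying toward R_base.
    return R_base + R_extra * math.exp(-nG / k)
```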

cellmap_segmentation_challenge.utils.eval_utils.distance module#

Distance metrics including Hausdorff distance computation.

cellmap_segmentation_challenge.utils.eval_utils.distance.compute_max_distance(voxel_size, shape) float[source]#

Compute the maximum distance used for distance-based metrics, based on the maximum distance to a volume boundary.

Return type:

float

cellmap_segmentation_challenge.utils.eval_utils.distance.normalize_distance(distance: float, voxel_size) float[source]#

Normalize a distance value to [0, 1] using the maximum distance represented by a voxel.

Parameters:

distance (float)

Return type:

float

cellmap_segmentation_challenge.utils.eval_utils.distance.optimized_hausdorff_distances(truth_label, pred_label, voxel_size, hausdorff_distance_max, method='standard', percentile: float | None = None)[source]#

Compute per-truth-instance Hausdorff-like distances against the (already remapped) prediction using multithreading. Returns a 1D float32 numpy array whose i-th entry corresponds to truth_ids[i].

Parameters:
  • truth_label (np.ndarray) – Ground-truth instance label volume (0 == background).

  • pred_label (np.ndarray) – Prediction instance label volume that has already been remapped to align with the GT ids (0 == background).

  • voxel_size (Sequence[float]) – Physical voxel sizes in Z, Y, X (or Y, X) order.

  • hausdorff_distance_max (float) – Cap for distances (use np.inf for uncapped).

  • method ({"standard", "modified", "percentile"}) –

    "standard" -> classic Hausdorff (max of directed maxima);
    "modified" -> mean of directed distances in each direction, then max of the two means;
    "percentile" -> the given percentile of directed distances (requires percentile to be provided).

  • percentile (float | None) – Percentile (0-100) used when method=="percentile".
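The three reduction modes can be written down directly for two point sets. This sketch shows the metric definitions only, not the library's ROI-based, multithreaded implementation:

```python
import numpy as np

def directed(A, B):
    # For each point in A, the distance to its nearest point in B.
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1).min(axis=1)

def hausdorff(A, B, method="standard", percentile=None, cap=np.inf):
    dab, dba = directed(A, B), directed(B, A)
    if method == "standard":
        value = max(dab.max(), dba.max())
    elif method == "modified":
        value = max(dab.mean(), dba.mean())
    else:  # "percentile"
        value = max(np.percentile(dab, percentile), np.percentile(dba, percentile))
    return float(min(value, cap))
```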

cellmap_segmentation_challenge.utils.eval_utils.distance.bbox_for_label(stats: StatisticsDict | StatisticsSlicesDict, ndim: int, label_id: int)[source]#

Try to get bbox without allocating a full boolean mask using cc3d statistics. Falls back to mask-based bbox if cc3d doesn’t provide expected fields. Returns (mins, maxs) inclusive-exclusive in voxel indices, or None if missing.

Parameters:
  • stats (StatisticsDict | StatisticsSlicesDict)

  • ndim (int)

  • label_id (int)

cellmap_segmentation_challenge.utils.eval_utils.distance.roi_slices_for_pair(truth_stats: StatisticsDict | StatisticsSlicesDict, pred_stats: StatisticsDict | StatisticsSlicesDict, tid: int, voxel_size, ndim: int, shape: tuple[int, ...], max_distance: float)[source]#

ROI = union(bbox(truth==tid), bbox(pred==tid)) padded by P derived from max_distance. Returns tuple of slices suitable for numpy indexing.

Parameters:
  • truth_stats (StatisticsDict | StatisticsSlicesDict)

  • pred_stats (StatisticsDict | StatisticsSlicesDict)

  • tid (int)

  • ndim (int)

  • shape (tuple[int, ...])

  • max_distance (float)
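The padding-and-clamping step can be sketched as follows, assuming P is already given in voxels per axis (deriving P from max_distance and voxel_size is left aside here):

```python
def padded_roi(mins, maxs, pad, shape):
    # The union bbox is inclusive-exclusive; expand it by pad voxels per axis
    # and clamp to the volume bounds so the slices stay valid for numpy indexing.
    lo = [max(0, m - p) for m, p in zip(mins, pad)]
    hi = [min(s, M + p) for M, p, s in zip(maxs, pad, shape)]
    return tuple(slice(a, b) for a, b in zip(lo, hi))

roi = padded_roi((2, 3), (5, 6), (2, 2), (8, 8))
```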

cellmap_segmentation_challenge.utils.eval_utils.distance.compute_hausdorff_distance_roi(truth_label: ndarray, truth_stats: StatisticsDict | StatisticsSlicesDict, pred_label: ndarray, pred_stats: StatisticsDict | StatisticsSlicesDict, tid: int, voxel_size, max_distance: float, method: str = 'standard', percentile: float | None = None)[source]#

Same metric as compute_hausdorff_distance(), but operates on an ROI slice and builds masks only inside ROI.

Parameters:
  • truth_label (ndarray)

  • truth_stats (StatisticsDict | StatisticsSlicesDict)

  • pred_label (ndarray)

  • pred_stats (StatisticsDict | StatisticsSlicesDict)

  • tid (int)

  • max_distance (float)

  • method (str)

  • percentile (float | None)

cellmap_segmentation_challenge.utils.eval_utils.exceptions module#

Custom exceptions for evaluation pipeline.

exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.EvaluationError[source]#

Bases: Exception

Base exception for evaluation errors.

exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.TooManyInstancesError(n_pred: int, n_gt: int, ratio: float, cutoff: float)[source]#

Bases: EvaluationError

Raised when submission has too many instances relative to ground truth.

This is a pathological case where the ratio of predicted to ground truth instances exceeds acceptable thresholds, likely indicating poor segmentation.

Parameters:
  • n_pred (int)

  • n_gt (int)

  • ratio (float)

  • cutoff (float)

exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.TooManyOverlapEdgesError(n_edges: int, max_edges: int)[source]#

Bases: EvaluationError

Raised when instance matching produces too many overlap edges.

This indicates computational infeasibility for the matching algorithm.

Parameters:
  • n_edges (int)

  • max_edges (int)

exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.MatchingFailedError(status: int)[source]#

Bases: EvaluationError

Raised when instance matching optimization fails.

Parameters:

status (int)

exception cellmap_segmentation_challenge.utils.eval_utils.exceptions.ValidationError[source]#

Bases: EvaluationError

Raised when input validation fails.

cellmap_segmentation_challenge.utils.eval_utils.instance_matching module#

Instance matching using min-cost flow optimization.

class cellmap_segmentation_challenge.utils.eval_utils.instance_matching.InstanceOverlapData(nG: int, nP: int, rows: ndarray, cols: ndarray, iou_vals: ndarray)[source]#

Bases: object

Data structure for instance overlap computation.

Parameters:
  • nG (int)

  • nP (int)

  • rows (ndarray)

  • cols (ndarray)

  • iou_vals (ndarray)

nG: int#
nP: int#
rows: ndarray#
cols: ndarray#
iou_vals: ndarray#
cellmap_segmentation_challenge.utils.eval_utils.instance_matching.match_instances(gt: ndarray, pred: ndarray, config: EvaluationConfig | None = None) dict[int, int][source]#

Match instances between GT and prediction based on IoU.

Uses min-cost flow optimization to find optimal 1:1 matching between predicted and ground truth instances based on IoU overlap.

Parameters:
  • gt (ndarray) – Ground truth instance labels (0 = background)

  • pred (ndarray) – Predicted instance labels (0 = background)

  • config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)

Returns:

Dictionary mapping predicted instance ID to ground truth instance ID. Returns {0: 0} if only background present. Returns {} if no matches found or one side has no instances.

Raises:
  • TooManyInstancesError – If the ratio of predicted to ground truth instances exceeds the cutoff.

  • TooManyOverlapEdgesError – If the overlap graph exceeds max_overlap_edges.

  • MatchingFailedError – If the min-cost flow optimization fails.
Return type:

dict[int, int]

Example

>>> mapping = match_instances(gt, pred)
>>> # Remap predictions to match GT IDs
>>> pred_aligned = remap(pred, mapping, preserve_missing_labels=True)

cellmap_segmentation_challenge.utils.eval_utils.scoring module#

Core scoring functions for segmentation evaluation.

cellmap_segmentation_challenge.utils.eval_utils.scoring.score_instance(pred_label, truth_label, voxel_size, hausdorff_distance_max=None, config: EvaluationConfig | None = None) InstanceScoreDict[source]#

Score instance segmentation against ground truth.

Computes pixel-wise accuracy, Hausdorff distance, and combined metrics after optimal instance matching.

Parameters:
  • pred_label – Predicted instance labels (0 = background)

  • truth_label – Ground truth instance labels (0 = background)

  • voxel_size – Physical voxel size in (Z, Y, X) order

  • hausdorff_distance_max – Maximum Hausdorff distance cap (None = auto)

  • config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)

Returns:

Dictionary containing all instance segmentation metrics

Return type:

InstanceScoreDict

Example

>>> scores = score_instance(pred, truth, voxel_size=(4.0, 4.0, 4.0))
>>> print(f"Combined score: {scores['combined_score']:.3f}")
cellmap_segmentation_challenge.utils.eval_utils.scoring.score_semantic(pred_label, truth_label) dict[str, float][source]#

Score a single semantic label volume against the ground truth semantic label volume.

Parameters:
  • pred_label (np.ndarray) – The predicted semantic label volume.

  • truth_label (np.ndarray) – The ground truth semantic label volume.

Returns:

A dictionary of scores for the semantic label volume.

Return type:

dict

Example usage:

scores = score_semantic(pred_label, truth_label)
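For reference, the standard binary IoU and Dice formulations on label volumes look like this (a sketch; the package may compute additional or differently defined metrics):

```python
import numpy as np

def semantic_metrics(pred, truth):
    # Treat any nonzero label as foreground.
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    total = pred.sum() + truth.sum()
    return {
        "iou": float(inter / union) if union else 1.0,
        "dice_score": float(2 * inter / total) if total else 1.0,
    }
```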

cellmap_segmentation_challenge.utils.eval_utils.scoring.score_label(pred_label_path, label_name, crop_name, truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'])[source]#

Score a single label volume against the ground truth label volume.

Parameters:
  • pred_label_path (str) – The path to the predicted label volume.

  • truth_path (str) – The path to the ground truth label volume.

  • instance_classes (list) – A list of instance classes.

Returns:

A dictionary of scores for the label volume.

Return type:

dict

Example usage:

scores = score_label('pred.zarr/test_volume/label1')

cellmap_segmentation_challenge.utils.eval_utils.scoring.empty_label_score(label, crop_name, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr')[source]#
cellmap_segmentation_challenge.utils.eval_utils.scoring.match_crop_space(path, class_label, voxel_size, shape, translation) ndarray[source]#
Return type:

ndarray

cellmap_segmentation_challenge.utils.eval_utils.submission module#

Submission processing and main evaluation entry point.

cellmap_segmentation_challenge.utils.eval_utils.submission.ensure_zgroup(path: UPath) Group[source]#

Ensure that the given path can be opened as a zarr Group. If a .zgroup is not present, add it.

Parameters:

path (UPath)

Return type:

Group
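The repair step amounts to writing the Zarr-2 group marker. A standard-library sketch of that step (the real function operates on a UPath and returns an opened zarr Group):

```python
import json
import os

def ensure_zgroup_marker(path):
    # A Zarr-2 group is marked by a .zgroup file holding the format version.
    marker = os.path.join(path, ".zgroup")
    if not os.path.exists(marker):
        with open(marker, "w") as f:
            json.dump({"zarr_format": 2}, f)
    return marker
```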

cellmap_segmentation_challenge.utils.eval_utils.submission.get_evaluation_args(volumes, submission_path, truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance']) list[tuple][source]#

Get the arguments for scoring each label in the submission.

Parameters:
  • volumes (list) – A list of volumes to score.

  • submission_path (str) – The path to the submission volume.

  • truth_path (str) – The path to the ground truth volume.

  • instance_classes (list) – A list of instance classes.

Returns:

A list of tuples containing the arguments for each label to be scored.

Return type:

list[tuple]

cellmap_segmentation_challenge.utils.eval_utils.submission.missing_volume_score(truth_path, volume, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance']) list[tuple][source]#

Score a missing volume as 0’s, congruent with the score_volume function.

Parameters:
  • truth_path (str) – The path to the ground truth volume.

  • volume (str) – The name of the volume.

  • instance_classes (list) – A list of instance classes.

Returns:

A dictionary of scores for the volume.

Return type:

dict

Example usage:

scores = missing_volume_score('truth.zarr/test_volume')

cellmap_segmentation_challenge.utils.eval_utils.submission.ensure_valid_submission(submission_path: UPath)[source]#

Ensure that the unzipped submission path is a valid Zarr-2 file.

Parameters:

submission_path (str) – The path to the unzipped submission Zarr-2 file.

Raises:

ValueError – If the submission is not a valid unzipped Zarr-2 file.

cellmap_segmentation_challenge.utils.eval_utils.submission.score_submission(submission_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/submission.zip', result_file=None, truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], config: EvaluationConfig | None = None)[source]#

Score a submission against the ground truth data.

This is the main entry point for evaluating a submission. It unzips, validates, scores, and aggregates results for all volumes.

Parameters:
  • submission_path – Path to the zipped submission Zarr-2 file

  • result_file – Path to save the scores (None to skip saving)

  • truth_path – Path to the ground truth Zarr-2 file

  • instance_classes – List of instance segmentation classes

  • config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)

Returns:

Dictionary of aggregated scores across all volumes

Raises:
  • ValueError – If submission format is invalid

  • RuntimeError – If volume renaming fails

Example

>>> scores = score_submission('submission.zip', 'results.json')
>>> print(f"Overall score: {scores['overall_score']:.4f}")
Results structure:

{
    "cropN": {                          # Per-volume scores
        "label_name": {
            # Instance segmentation
            "mean_accuracy": float,
            "hausdorff_distance": float,
            "combined_score": float,
            # OR semantic segmentation
            "iou": float,
            "dice_score": float,
        },
    },
    "label_scores": {                   # Aggregated per-label
        "label_name": {...},
    },
    "overall_instance_score": float,
    "overall_semantic_score": float,
    "overall_score": float,
}
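Reading the aggregated results back out (illustrative values; keys follow the structure documented above):

```python
results = {
    "crop1": {
        "mito": {"mean_accuracy": 0.91, "hausdorff_distance": 12.0, "combined_score": 0.88},
    },
    "label_scores": {"mito": {"combined_score": 0.88}},
    "overall_instance_score": 0.88,
    "overall_semantic_score": 0.80,
    "overall_score": 0.84,
}

# Per-volume entries sit alongside the aggregate keys, so filter those out.
aggregates = {"label_scores", "overall_instance_score",
              "overall_semantic_score", "overall_score"}
per_volume = {k: v for k, v in results.items() if k not in aggregates}
```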

cellmap_segmentation_challenge.utils.eval_utils.types module#

Type definitions for evaluation metrics.

class cellmap_segmentation_challenge.utils.eval_utils.types.InstanceScoreDict[source]#

Bases: TypedDict

Type definition for instance segmentation scores.

mean_accuracy: float#
binary_accuracy: float#
hausdorff_distance: float#
normalized_hausdorff_distance: float#
combined_score: float#
iou: float#
dice_score: float#
num_voxels: int#
voxel_size: tuple[float, ...]#
is_missing: bool#
status: Literal['scored', 'skipped_too_many_instances', 'missing']#
voi_split: float#
voi_merge: float#
class cellmap_segmentation_challenge.utils.eval_utils.types.SemanticScoreDict[source]#

Bases: TypedDict

Type definition for semantic segmentation scores.

iou: float#
dice_score: float#
binary_accuracy: float#
num_voxels: int#
voxel_size: tuple[float, ...]#
is_missing: bool#
status: Literal['scored', 'missing']#
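Both classes annotate plain dicts. Redeclaring SemanticScoreDict here so the snippet is self-contained (import it from the package in real code; the values below are illustrative):

```python
from typing import Literal, TypedDict

class SemanticScoreDict(TypedDict):
    iou: float
    dice_score: float
    binary_accuracy: float
    num_voxels: int
    voxel_size: tuple[float, ...]
    is_missing: bool
    status: Literal["scored", "missing"]

# A TypedDict is an ordinary dict at runtime; the class only informs type checkers.
score: SemanticScoreDict = {
    "iou": 0.72,
    "dice_score": 0.84,
    "binary_accuracy": 0.95,
    "num_voxels": 64 * 512 * 512,
    "voxel_size": (8.0, 8.0, 8.0),
    "is_missing": False,
    "status": "scored",
}
```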

cellmap_segmentation_challenge.utils.eval_utils.zip_utils module#

Zip file handling utilities for submissions.

cellmap_segmentation_challenge.utils.eval_utils.zip_utils.unzip_file(zip_path, max_uncompressed_size: int = 53687091200)[source]#

Unzip a zip file to a specified directory.

Validates against path traversal (zip slip), symlink attacks, and decompression bombs before extracting.

Parameters:
  • zip_path (str) – The path to the zip file.

  • max_uncompressed_size (int) – Maximum total uncompressed size in bytes.

Raises:

ValidationError – If any member fails security checks or total size exceeds limit.

Example usage:

unzip_file('submission.zip')

Module contents#

Evaluation utilities for cellmap segmentation challenge.

This module provides all evaluation-related functionality including:

  • Scoring for instance and semantic segmentation

  • Instance matching using min-cost flow optimization

  • Hausdorff distance and other metrics

  • Submission processing and validation

class cellmap_segmentation_challenge.utils.eval_utils.InstanceScoreDict[source]#

Bases: TypedDict

Type definition for instance segmentation scores.

mean_accuracy: float#
binary_accuracy: float#
hausdorff_distance: float#
normalized_hausdorff_distance: float#
combined_score: float#
iou: float#
dice_score: float#
num_voxels: int#
voxel_size: tuple[float, ...]#
is_missing: bool#
status: Literal['scored', 'skipped_too_many_instances', 'missing']#
voi_split: float#
voi_merge: float#
class cellmap_segmentation_challenge.utils.eval_utils.SemanticScoreDict[source]#

Bases: TypedDict

Type definition for semantic segmentation scores.

iou: float#
dice_score: float#
binary_accuracy: float#
num_voxels: int#
voxel_size: tuple[float, ...]#
is_missing: bool#
status: Literal['scored', 'missing']#
exception cellmap_segmentation_challenge.utils.eval_utils.EvaluationError[source]#

Bases: Exception

Base exception for evaluation errors.

exception cellmap_segmentation_challenge.utils.eval_utils.TooManyInstancesError(n_pred: int, n_gt: int, ratio: float, cutoff: float)[source]#

Bases: EvaluationError

Raised when submission has too many instances relative to ground truth.

This is a pathological case where the ratio of predicted to ground truth instances exceeds acceptable thresholds, likely indicating poor segmentation.

Parameters:
  • n_pred (int)

  • n_gt (int)

  • ratio (float)

  • cutoff (float)

exception cellmap_segmentation_challenge.utils.eval_utils.TooManyOverlapEdgesError(n_edges: int, max_edges: int)[source]#

Bases: EvaluationError

Raised when instance matching produces too many overlap edges.

This indicates computational infeasibility for the matching algorithm.

Parameters:
  • n_edges (int)

  • max_edges (int)

exception cellmap_segmentation_challenge.utils.eval_utils.MatchingFailedError(status: int)[source]#

Bases: EvaluationError

Raised when instance matching optimization fails.

Parameters:

status (int)

exception cellmap_segmentation_challenge.utils.eval_utils.ValidationError[source]#

Bases: EvaluationError

Raised when input validation fails.

class cellmap_segmentation_challenge.utils.eval_utils.EvaluationConfig(max_instance_threads: int = 3, max_semantic_threads: int = 25, per_instance_threads: int = 25, max_distance_cap_eps: float = 0.0001, final_instance_ratio_cutoff: float = 10.0, initial_instance_ratio_cutoff: float = 50.0, instance_ratio_factor: float = 5.0, max_overlap_edges: int = 5000000, mcmf_cost_scale: int = 1000000, truth_path: UPath = <factory>, instance_classes: list[str] = <factory>, cast_to_none: list[Any] = <factory>)[source]#

Bases: object

Configuration for evaluation pipeline.

All parameters can be set via environment variables or passed directly. Environment variables take precedence over defaults but not over explicitly passed values.

Parameters:
  • max_instance_threads (int)

  • max_semantic_threads (int)

  • per_instance_threads (int)

  • max_distance_cap_eps (float)

  • final_instance_ratio_cutoff (float)

  • initial_instance_ratio_cutoff (float)

  • instance_ratio_factor (float)

  • max_overlap_edges (int)

  • mcmf_cost_scale (int)

  • truth_path (UPath)

  • instance_classes (list[str])

  • cast_to_none (list[Any])

max_instance_threads: int = 3#
max_semantic_threads: int = 25#
per_instance_threads: int = 25#
max_distance_cap_eps: float = 0.0001#
final_instance_ratio_cutoff: float = 10.0#
initial_instance_ratio_cutoff: float = 50.0#
instance_ratio_factor: float = 5.0#
max_overlap_edges: int = 5000000#
mcmf_cost_scale: int = 1000000#
truth_path: UPath#
instance_classes: list[str]#
cast_to_none: list[Any]#
classmethod from_env() EvaluationConfig[source]#

Load configuration from environment variables with defaults.

Returns:

EvaluationConfig with values from environment or defaults.

Return type:

EvaluationConfig

validate() None[source]#

Validate configuration values.

Raises:

ValueError – If any configuration value is invalid.

Return type:

None

cellmap_segmentation_challenge.utils.eval_utils.ratio_cutoff(nG: int, R_base: float = 10.0, R_extra: float = 50.0, k: float = 5.0) float[source]#

Calculate the acceptable ratio cutoff for instance matching.

The ratio cutoff decreases exponentially as the number of ground truth instances increases, allowing for more tolerance with fewer instances.

Parameters:
  • nG (int) – Number of ground truth instances

  • R_base (float) – Base ratio cutoff (minimum)

  • R_extra (float) – Extra ratio tolerance for small nG

  • k (float) – Exponential decay factor

Returns:

Maximum acceptable ratio of predicted to ground truth instances

Return type:

float

class cellmap_segmentation_challenge.utils.eval_utils.InstanceOverlapData(nG: int, nP: int, rows: ndarray, cols: ndarray, iou_vals: ndarray)[source]#

Bases: object

Data structure for instance overlap computation.

Parameters:
  • nG (int)

  • nP (int)

  • rows (ndarray)

  • cols (ndarray)

  • iou_vals (ndarray)

nG: int#
nP: int#
rows: ndarray#
cols: ndarray#
iou_vals: ndarray#
cellmap_segmentation_challenge.utils.eval_utils.match_instances(gt: ndarray, pred: ndarray, config: EvaluationConfig | None = None) dict[int, int][source]#

Match instances between GT and prediction based on IoU.

Uses min-cost flow optimization to find optimal 1:1 matching between predicted and ground truth instances based on IoU overlap.

Parameters:
  • gt (ndarray) – Ground truth instance labels (0 = background)

  • pred (ndarray) – Predicted instance labels (0 = background)

  • config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)

Returns:

Dictionary mapping predicted instance ID to ground truth instance ID. Returns {0: 0} if only background present. Returns {} if no matches found or one side has no instances.

Raises:
  • TooManyInstancesError – If the ratio of predicted to ground truth instances exceeds the cutoff.

  • TooManyOverlapEdgesError – If the overlap graph exceeds max_overlap_edges.

  • MatchingFailedError – If the min-cost flow optimization fails.
Return type:

dict[int, int]

Example

>>> mapping = match_instances(gt, pred)
>>> # Remap predictions to match GT IDs
>>> pred_aligned = remap(pred, mapping, preserve_missing_labels=True)
cellmap_segmentation_challenge.utils.eval_utils.compute_max_distance(voxel_size, shape) float[source]#

Compute the maximum distance used for distance-based metrics, based on the maximum distance to a volume boundary.

Return type:

float

cellmap_segmentation_challenge.utils.eval_utils.normalize_distance(distance: float, voxel_size) float[source]#

Normalize a distance value to [0, 1] using the maximum distance represented by a voxel.

Parameters:

distance (float)

Return type:

float

cellmap_segmentation_challenge.utils.eval_utils.optimized_hausdorff_distances(truth_label, pred_label, voxel_size, hausdorff_distance_max, method='standard', percentile: float | None = None)[source]#

Compute per-truth-instance Hausdorff-like distances against the (already remapped) prediction using multithreading. Returns a 1D float32 numpy array whose i-th entry corresponds to truth_ids[i].

Parameters:
  • truth_label (np.ndarray) – Ground-truth instance label volume (0 == background).

  • pred_label (np.ndarray) – Prediction instance label volume that has already been remapped to align with the GT ids (0 == background).

  • voxel_size (Sequence[float]) – Physical voxel sizes in Z, Y, X (or Y, X) order.

  • hausdorff_distance_max (float) – Cap for distances (use np.inf for uncapped).

  • method ({"standard", "modified", "percentile"}) –

    "standard" -> classic Hausdorff (max of directed maxima);
    "modified" -> mean of directed distances in each direction, then max of the two means;
    "percentile" -> the given percentile of directed distances (requires percentile to be provided).

  • percentile (float | None) – Percentile (0-100) used when method=="percentile".

cellmap_segmentation_challenge.utils.eval_utils.bbox_for_label(stats: StatisticsDict | StatisticsSlicesDict, ndim: int, label_id: int)[source]#

Try to get bbox without allocating a full boolean mask using cc3d statistics. Falls back to mask-based bbox if cc3d doesn’t provide expected fields. Returns (mins, maxs) inclusive-exclusive in voxel indices, or None if missing.

Parameters:
  • stats (StatisticsDict | StatisticsSlicesDict)

  • ndim (int)

  • label_id (int)

cellmap_segmentation_challenge.utils.eval_utils.roi_slices_for_pair(truth_stats: StatisticsDict | StatisticsSlicesDict, pred_stats: StatisticsDict | StatisticsSlicesDict, tid: int, voxel_size, ndim: int, shape: tuple[int, ...], max_distance: float)[source]#

ROI = union(bbox(truth==tid), bbox(pred==tid)) padded by P derived from max_distance. Returns tuple of slices suitable for numpy indexing.

Parameters:
  • truth_stats (StatisticsDict | StatisticsSlicesDict)

  • pred_stats (StatisticsDict | StatisticsSlicesDict)

  • tid (int)

  • ndim (int)

  • shape (tuple[int, ...])

  • max_distance (float)

cellmap_segmentation_challenge.utils.eval_utils.compute_hausdorff_distance_roi(truth_label: ndarray, truth_stats: StatisticsDict | StatisticsSlicesDict, pred_label: ndarray, pred_stats: StatisticsDict | StatisticsSlicesDict, tid: int, voxel_size, max_distance: float, method: str = 'standard', percentile: float | None = None)[source]#

Same metric as compute_hausdorff_distance(), but operates on an ROI slice and builds masks only inside ROI.

Parameters:
  • truth_label (ndarray)

  • truth_stats (StatisticsDict | StatisticsSlicesDict)

  • pred_label (ndarray)

  • pred_stats (StatisticsDict | StatisticsSlicesDict)

  • tid (int)

  • max_distance (float)

  • method (str)

  • percentile (float | None)

cellmap_segmentation_challenge.utils.eval_utils.score_instance(pred_label, truth_label, voxel_size, hausdorff_distance_max=None, config: EvaluationConfig | None = None) InstanceScoreDict[source]#

Score instance segmentation against ground truth.

Computes pixel-wise accuracy, Hausdorff distance, and combined metrics after optimal instance matching.

Parameters:
  • pred_label – Predicted instance labels (0 = background)

  • truth_label – Ground truth instance labels (0 = background)

  • voxel_size – Physical voxel size in (Z, Y, X) order

  • hausdorff_distance_max – Maximum Hausdorff distance cap (None = auto)

  • config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)

Returns:

Dictionary containing all instance segmentation metrics

Return type:

InstanceScoreDict

Example

>>> scores = score_instance(pred, truth, voxel_size=(4.0, 4.0, 4.0))
>>> print(f"Combined score: {scores['combined_score']:.3f}")
cellmap_segmentation_challenge.utils.eval_utils.score_semantic(pred_label, truth_label) dict[str, float][source]#

Score a single semantic label volume against the ground truth semantic label volume.

Parameters:
  • pred_label (np.ndarray) – The predicted semantic label volume.

  • truth_label (np.ndarray) – The ground truth semantic label volume.

Returns:

A dictionary of scores for the semantic label volume.

Return type:

dict

Example usage:

scores = score_semantic(pred_label, truth_label)

cellmap_segmentation_challenge.utils.eval_utils.score_label(pred_label_path, label_name, crop_name, truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'])[source]#

Score a single label volume against the ground truth label volume.

Parameters:
  • pred_label_path (str) – The path to the predicted label volume.

  • truth_path (str) – The path to the ground truth label volume.

  • instance_classes (list) – A list of instance classes.

Returns:

A dictionary of scores for the label volume.

Return type:

dict

Example usage:

scores = score_label('pred.zarr/test_volume/label1')

cellmap_segmentation_challenge.utils.eval_utils.empty_label_score(label, crop_name, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr')[source]#
cellmap_segmentation_challenge.utils.eval_utils.match_crop_space(path, class_label, voxel_size, shape, translation) ndarray[source]#
Return type:

ndarray

cellmap_segmentation_challenge.utils.eval_utils.combine_scores(scores, include_missing=True, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], cast_to_none=[nan, inf, -inf, inf, -inf])[source]#

Combine scores across volumes, normalizing by the number of voxels.

Parameters:
  • scores (dict) – A dictionary of scores for each volume, as returned by score_volume.

  • include_missing (bool) – Whether to include missing volumes in the combined scores.

  • instance_classes (list) – A list of instance classes.

  • cast_to_none (list) – A list of values to cast to None in the combined scores.

Returns:

A dictionary of combined scores across all volumes.

Return type:

dict

Example usage:

combined_scores = combine_scores(scores)
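NaN-to-None sanitization of a nested scores dict can be sketched recursively; the real `sanitize_scores` may also handle infinities or NumPy scalar types, so treat this as an illustration only:

```python
import math

def sanitize_sketch(obj):
    """Recursively replace NaN floats with None in nested dicts/lists."""
    if isinstance(obj, dict):
        return {k: sanitize_sketch(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [sanitize_sketch(v) for v in obj]
    if isinstance(obj, float) and math.isnan(obj):
        return None
    return obj

print(sanitize_sketch({"iou": float("nan"), "dice": 0.8}))
# {'iou': None, 'dice': 0.8}
```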

cellmap_segmentation_challenge.utils.eval_utils.sanitize_scores(scores)[source]#

Sanitize scores by converting NaN values to None.

Parameters:

scores (dict) – A dictionary of scores.

Returns:

A sanitized dictionary of scores.

Return type:

dict
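Writing the Zarr-2 group marker that `ensure_zgroup` guarantees can be sketched with the standard library. A `.zgroup` file containing `{"zarr_format": 2}` is the Zarr-2 group metadata; the real function additionally opens and returns the zarr `Group`, which this sketch omits:

```python
import json
import tempfile
from pathlib import Path

def ensure_zgroup_sketch(path):
    """If no .zgroup marker exists, write minimal Zarr-2 group metadata."""
    path = Path(path)
    path.mkdir(parents=True, exist_ok=True)
    marker = path / ".zgroup"
    if not marker.exists():
        marker.write_text(json.dumps({"zarr_format": 2}))
    return marker

root = Path(tempfile.mkdtemp()) / "submission.zarr"
marker = ensure_zgroup_sketch(root)
print(marker.read_text())  # {"zarr_format": 2}
```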

cellmap_segmentation_challenge.utils.eval_utils.update_scores(scores, results, result_file, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'])[source]#
cellmap_segmentation_challenge.utils.eval_utils.num_evals_done(all_scores)[source]#
cellmap_segmentation_challenge.utils.eval_utils.ensure_zgroup(path: UPath) Group[source]#

Ensure that the given path can be opened as a zarr Group. If a .zgroup is not present, add it.

Parameters:

path (UPath)

Return type:

Group
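A heuristic version of this validity check can be sketched as follows. The actual `ensure_valid_submission` may verify more than the presence of metadata files; the function name and the specific checks here are assumptions for illustration:

```python
import tempfile
from pathlib import Path

def looks_like_zarr2_sketch(submission_path):
    """Heuristic: an unzipped Zarr-2 store should contain .zgroup or
    .zarray metadata files somewhere in its directory tree."""
    root = Path(submission_path)
    if not root.is_dir():
        raise ValueError(f"{root} is not a directory")
    markers = list(root.rglob(".zgroup")) + list(root.rglob(".zarray"))
    if not markers:
        raise ValueError(f"{root} contains no Zarr-2 metadata files")
    return True

root = Path(tempfile.mkdtemp())
(root / ".zgroup").write_text('{"zarr_format": 2}')
print(looks_like_zarr2_sketch(root))  # True
```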

cellmap_segmentation_challenge.utils.eval_utils.ensure_valid_submission(submission_path: UPath)[source]#

Ensure that the unzipped submission path is a valid Zarr-2 file.

Parameters:

submission_path (str) – The path to the unzipped submission Zarr-2 file.

Raises:

ValueError – If the submission is not a valid unzipped Zarr-2 file.

cellmap_segmentation_challenge.utils.eval_utils.get_evaluation_args(volumes, submission_path, truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance']) list[tuple][source]#

Get the arguments for scoring each label in the submission.

Parameters:
  • volumes (list) – A list of volumes to score.

  • submission_path (str) – The path to the submission volume.

  • truth_path (str) – The path to the ground truth volume.

  • instance_classes (list) – A list of instance classes.

Returns:

A list of tuples containing the arguments for each label to be scored.

Return type:

list[tuple]

cellmap_segmentation_challenge.utils.eval_utils.missing_volume_score(truth_path, volume, instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance']) list[tuple][source]#

Score a missing volume as zeros, consistent with the score_volume function.

Parameters:
  • truth_path (str) – The path to the ground truth volume.

  • volume (str) – The name of the volume.

  • instance_classes (list) – A list of instance classes.

Returns:

A dictionary of scores for the volume.

Return type:

dict

Example usage:

scores = missing_volume_score('truth.zarr/test_volume')

cellmap_segmentation_challenge.utils.eval_utils.score_submission(submission_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/submission.zip', result_file=None, truth_path='/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/data/ground_truth.zarr', instance_classes=['nuc', 'vim', 'ves', 'endo', 'lyso', 'ld', 'perox', 'mito', 'np', 'mt', 'cell', 'instance'], config: EvaluationConfig | None = None)[source]#

Score a submission against the ground truth data.

This is the main entry point for evaluating a submission. It unzips, validates, scores, and aggregates results for all volumes.

Parameters:
  • submission_path – Path to the zipped submission Zarr-2 file

  • result_file – Path to save the scores (None to skip saving)

  • truth_path – Path to the ground truth Zarr-2 file

  • instance_classes – List of instance segmentation classes

  • config (EvaluationConfig | None) – Evaluation configuration (uses defaults if None)

Returns:

Dictionary of aggregated scores across all volumes

Raises:
  • ValueError – If submission format is invalid

  • RuntimeError – If volume renaming fails

Example

>>> scores = score_submission('submission.zip', 'results.json')
>>> print(f"Overall score: {scores['overall_score']:.4f}")
Results structure:

{
    "cropN": {                      # Per-volume scores
        "label_name": {
            # Instance segmentation
            "mean_accuracy": float,
            "hausdorff_distance": float,
            "combined_score": float,
            # OR semantic segmentation
            "iou": float,
            "dice_score": float,
        }
    },
    "label_scores": {               # Aggregated per-label
        "label_name": {...}
    },
    "overall_instance_score": float,
    "overall_semantic_score": float,
    "overall_score": float,
}

cellmap_segmentation_challenge.utils.eval_utils.resize_array(arr, target_shape, pad_value=0)[source]#

Resize an array to a target shape by padding or cropping as needed.

Parameters:
  • arr (np.ndarray) – Input array to resize.

  • target_shape (tuple) – Desired shape for the output array.

  • pad_value (int, float, etc.) – Value to use for padding if the array is smaller than the target shape.

Returns:

Resized array with the specified target shape.

Return type:

np.ndarray
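Pad-or-crop resizing of this kind can be sketched with NumPy. This is an illustration of the general technique; the actual `resize_array` may differ in details such as where padding is placed:

```python
import numpy as np

def resize_array_sketch(arr, target_shape, pad_value=0):
    """Crop each axis down to the target, then pad any axis still short."""
    # Crop first: slice every axis to at most the target size.
    slices = tuple(slice(0, min(s, t)) for s, t in zip(arr.shape, target_shape))
    cropped = arr[slices]
    # Pad at the end of each axis that is still too small.
    pad = [(0, t - s) for s, t in zip(cropped.shape, target_shape)]
    return np.pad(cropped, pad, mode="constant", constant_values=pad_value)

a = np.ones((2, 5), dtype=int)
out = resize_array_sketch(a, (3, 4))
print(out.shape)  # (3, 4)
```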

cellmap_segmentation_challenge.utils.eval_utils.unzip_file(zip_path, max_uncompressed_size: int = 53687091200)[source]#

Unzip a zip file to a specified directory.

Validates against path traversal (zip slip), symlink attacks, and decompression bombs before extracting.

Parameters:
  • zip_path (str) – The path to the zip file.

  • max_uncompressed_size (int) – Maximum total uncompressed size in bytes.

Raises:

ValidationError – If any member fails security checks or total size exceeds limit.

Example usage:

unzip_file('submission.zip')
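The zip-slip and decompression-bomb checks described above can be sketched with the standard library `zipfile` module. This sketch omits the symlink check the docstring mentions, and the destination-directory argument is an assumption (the real `unzip_file` derives its output location internally):

```python
import zipfile
from pathlib import Path

def safe_unzip_sketch(zip_path, dest, max_uncompressed_size=50 * 2**30):
    """Extract a zip after basic zip-slip and total-size checks."""
    dest = Path(dest).resolve()
    with zipfile.ZipFile(zip_path) as zf:
        # Decompression-bomb guard: sum declared uncompressed sizes.
        total = sum(info.file_size for info in zf.infolist())
        if total > max_uncompressed_size:
            raise ValueError(f"uncompressed size {total} exceeds limit")
        # Zip-slip guard: every member must resolve inside dest.
        for info in zf.infolist():
            target = (dest / info.filename).resolve()
            if not target.is_relative_to(dest):
                raise ValueError(f"unsafe path in archive: {info.filename}")
        zf.extractall(dest)
```

`Path.is_relative_to` (Python 3.9+) avoids the classic pitfall of a bare string-prefix check, which would wrongly accept a sibling directory like `dest-evil`.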