cellmap_segmentation_challenge.utils package#

Submodules#

cellmap_segmentation_challenge.utils.crops module#

cellmap_segmentation_challenge.utils.crops.fetch_manifest(url: str | URL, file_name: str, object: Self) tuple[str, ...][source]#
Parameters:
  • url (str | URL)

  • file_name (str)

  • object (Self)

Return type:

tuple[str, …]

class cellmap_segmentation_challenge.utils.crops.TestCropRow(id: int, dataset: str, class_label: str, voxel_size: tuple[float, ...], translation: tuple[float, ...], shape: tuple[int, ...])[source]#

Bases: object

A dataclass representing a row in the test crop manifest file.

Parameters:
  • id (int)

  • dataset (str)

  • class_label (str)

  • voxel_size (tuple[float, ...])

  • translation (tuple[float, ...])

  • shape (tuple[int, ...])

id: int#
dataset: str#
class_label: str#
voxel_size: tuple[float, ...]#
translation: tuple[float, ...]#
shape: tuple[int, ...]#
classmethod from_csv_row(row: str) Self[source]#

Create a TestCropRow object from a CSV row.

Parameters:

row (str)

Return type:

Self

cellmap_segmentation_challenge.utils.crops.fetch_test_crop_manifest(url: str | URL = 'https://raw.githubusercontent.com/janelia-cellmap/cellmap-segmentation-challenge/refs/heads/main/src/cellmap_segmentation_challenge/utils/test_crop_manifest.csv') tuple[TestCropRow, ...][source]#

Fetch a test manifest file from a URL and return a tuple of TestCropRow objects.

Parameters:

url (str or yarl.URL) – The URL to the manifest file.

Returns:

A tuple of TestCropRow objects.

Return type:

tuple[TestCropRow, …]
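
A minimal usage sketch: fetch the test crop manifest from the default URL (network access required) and inspect a few rows.

    from cellmap_segmentation_challenge.utils.crops import fetch_test_crop_manifest

    test_crops = fetch_test_crop_manifest()

    # Each row is a TestCropRow dataclass with the fields documented above.
    for crop in test_crops[:3]:
        print(crop.id, crop.dataset, crop.class_label, crop.voxel_size, crop.shape)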

class cellmap_segmentation_challenge.utils.crops.ZipDatasetRow(all_res: bool, padding: int, name: str, url: URL)[source]#

Bases: object

A dataclass representing a row in the zip dataset manifest file.

Parameters:
  • all_res (bool)

  • padding (int)

  • name (str)

  • url (URL)

all_res: bool#
padding: int#
name: str#
url: URL#
classmethod from_csv_row(row: str) Self[source]#

Create a ZipDatasetRow object from a CSV row.

Parameters:

row (str)

Return type:

Self

cellmap_segmentation_challenge.utils.crops.fetch_zip_manifest(url: str | URL = 'https://raw.githubusercontent.com/janelia-cellmap/cellmap-segmentation-challenge/refs/heads/main/src/cellmap_segmentation_challenge/utils/zip_manifest.csv') tuple[ZipDatasetRow, ...][source]#

Fetch a manifest file from a URL and return a tuple of ZipDatasetRow objects.

Parameters:

url (str or yarl.URL) – The URL to the manifest file.

Returns:

A tuple of ZipDatasetRow objects.

Return type:

tuple[ZipDatasetRow, …]

class cellmap_segmentation_challenge.utils.crops.CropRow(id: int, dataset: str, alignment: str, gt_source: URL | TestCropRow, em_url: URL)[source]#

Bases: object

A dataclass representing a row in the crop manifest file.

Parameters:
  • id (int)

  • dataset (str)

  • alignment (str)

  • gt_source (URL | TestCropRow)

  • em_url (URL)

id: int#
dataset: str#
alignment: str#
gt_source: URL | TestCropRow#
em_url: URL#
classmethod from_csv_row(row: str) Self[source]#

Create a CropRow object from a CSV row.

Parameters:

row (str)

Return type:

Self

cellmap_segmentation_challenge.utils.crops.fetch_crop_manifest(url: str | URL = 'https://raw.githubusercontent.com/janelia-cellmap/cellmap-segmentation-challenge/refs/heads/main/src/cellmap_segmentation_challenge/utils/manifest.csv') tuple[CropRow, ...][source]#

Fetch a manifest file from a URL and return a tuple of CropRow objects.

Parameters:

url (str or yarl.URL) – The URL to the manifest file.

Returns:

A tuple of CropRow objects.

Return type:

tuple[CropRow, …]

cellmap_segmentation_challenge.utils.crops.get_test_crops() tuple[CropRow, ...][source]#
Return type:

tuple[CropRow, …]
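
A sketch combining the crop manifest helpers. Per the CropRow definition above, gt_source is either a URL or a TestCropRow; treating URL-backed crops as training crops is an inference here, not something the docstrings state.

    from cellmap_segmentation_challenge.utils.crops import (
        TestCropRow,
        fetch_crop_manifest,
        get_test_crops,
    )

    crops = fetch_crop_manifest()  # network access required

    # Crops whose ground truth is published at a URL (assumed to be trainable).
    train_crops = [c for c in crops if not isinstance(c.gt_source, TestCropRow)]
    test_crops = get_test_crops()
    print(len(train_crops), "train crops,", len(test_crops), "test crops")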

cellmap_segmentation_challenge.utils.dataloader module#

cellmap_segmentation_challenge.utils.dataloader.get_dataloader(datasplit_path: str, classes: Sequence[str], batch_size: int, array_info: Mapping[str, Sequence[int | float]] | None = None, input_array_info: Mapping[str, Sequence[int | float]] | None = None, target_array_info: Mapping[str, Sequence[int | float]] | None = None, spatial_transforms: Mapping[str, Any] | None = None, iterations_per_epoch: int = 1000, random_validation: bool = False, device: str | device | None = None) tuple[CellMapDataLoader, CellMapDataLoader][source]#

Get the train and validation dataloaders.

This function builds the train and validation dataloaders for the given datasplit file, classes, batch size, array info, spatial transforms, iterations per epoch, and device.

Parameters:
  • datasplit_path (str) – Path to the datasplit file that defines the train/val split the dataloader should use.

  • classes (Sequence[str]) – List of classes to segment.

  • batch_size (int) – Batch size for the dataloader.

  • array_info (Optional[Mapping[str, Sequence[int | float]]]) – Dictionary containing the shape and scale of the data to load for the input and target. Either array_info or input_array_info & target_array_info must be provided.

  • input_array_info (Optional[Mapping[str, Sequence[int | float]]]) – Dictionary containing the shape and scale of the data to load for the input.

  • target_array_info (Optional[Mapping[str, Sequence[int | float]]]) – Dictionary containing the shape and scale of the data to load for the target.

  • spatial_transforms (Optional[Mapping[str, Any]]) – Dictionary containing the spatial transformations to apply to the data, such as mirror, transpose, and rotate. For example:

        spatial_transforms = {
            # 3D
            # Probability of mirroring along each axis,
            # from 0 (never mirror) to 1 (always mirror).
            "mirror": {"axes": {"x": 0.5, "y": 0.5, "z": 0.5}},
            # Axes that may be involved in a transposition.
            "transpose": {"axes": ["x", "y", "z"]},
            # Rotation range per axis; the angle is chosen uniformly
            # at random within the given range.
            "rotate": {"axes": {"x": [-180, 180], "y": [-180, 180], "z": [-180, 180]}},
            # 2D variants (used when there is no z axis):
            # "mirror": {"axes": {"x": 0.5, "y": 0.5}},
            # "transpose": {"axes": ["x", "y"]},
            # "rotate": {"axes": {"x": [-180, 180], "y": [-180, 180]}},
        }

  • iterations_per_epoch (int) – Number of iterations per epoch.

  • random_validation (bool) – Whether to randomize the validation data draws. Useful when not evaluating the entire validation set every time. Defaults to False.

  • device (str | torch.device | None) – Device to use for training. If None, defaults to "cuda" if available, then "mps" if available, then "cpu".

Returns:

Tuple containing the train and validation dataloaders.

Return type:

tuple[CellMapDataLoader, CellMapDataLoader]

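A usage sketch for get_dataloader. The datasplit path, class names, shapes, and scales below are illustrative assumptions, not prescribed values.

    from cellmap_segmentation_challenge.utils.dataloader import get_dataloader

    # Illustrative spec: 128^3-voxel arrays at 8 nm isotropic scale.
    array_info = {"shape": (128, 128, 128), "scale": (8, 8, 8)}

    spatial_transforms = {
        "mirror": {"axes": {"x": 0.5, "y": 0.5, "z": 0.5}},
        "transpose": {"axes": ["x", "y", "z"]},
        "rotate": {"axes": {"x": [-180, 180], "y": [-180, 180], "z": [-180, 180]}},
    }

    train_loader, val_loader = get_dataloader(
        datasplit_path="datasplit.csv",  # e.g. produced by make_datasplit_csv
        classes=["nuc", "mito"],
        batch_size=4,
        array_info=array_info,
        spatial_transforms=spatial_transforms,
        iterations_per_epoch=1000,
    )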

cellmap_segmentation_challenge.utils.datasplit module#

cellmap_segmentation_challenge.utils.datasplit.get_dataset_name(raw_path: str, search_path: str = '/opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/data/{dataset}/{dataset}.zarr/recon-1/{name}', raw_name: str = 'em/fibsem-uint8') str[source]#

Get the name of the dataset from the raw path.

Parameters:
  • raw_path (str)

  • search_path (str)

  • raw_name (str)

Return type:

str

cellmap_segmentation_challenge.utils.datasplit.get_raw_path(crop_path: str, raw_name: str = 'em/fibsem-uint8', label: str = '') str[source]#

Get the path to the raw data for a given crop path.

Parameters:
  • crop_path (str) – The path to the crop.

  • raw_name (str, optional) – The name of the raw data, by default RAW_NAME

  • label (str, optional) – The label class at the crop_path, by default “”

Returns:

The path to the raw data.

Return type:

str

cellmap_segmentation_challenge.utils.datasplit.get_formatted_fields(path: str, base_path: str, fields: list[str]) dict[str, str][source]#

Get the formatted fields from the path.

Parameters:
  • path (str) – The path to get the fields from.

  • base_path (str) – The unformatted path to find the fields in.

  • fields (list[str]) – The fields to get from the path.

Returns:

The formatted fields.

Return type:

dict[str, str]
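
A hedged sketch of get_formatted_fields. The brace-wrapped field tokens are an assumption inferred from the "{dataset}" placeholders used in the module's SEARCH_PATH, and the example path is illustrative.

    from cellmap_segmentation_challenge.utils.datasplit import get_formatted_fields

    fields = get_formatted_fields(
        path="/data/jrc_hela-2/jrc_hela-2.zarr/recon-1/em/fibsem-uint8",
        base_path="/data/{dataset}/{dataset}.zarr/recon-1/{name}",
        fields=["{dataset}", "{name}"],
    )
    # Expected result (assumed): {"dataset": "jrc_hela-2", "name": "em/fibsem-uint8"}
    print(fields)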

cellmap_segmentation_challenge.utils.datasplit.get_s3_csv_string(path: str, classes: list[str], usage: str)[source]#

Get the csv string for a given dataset path, to be written to the datasplit csv file.

Parameters:
  • path (str) – The path to the dataset.

  • classes (list[str]) – The classes present in the dataset.

  • usage (str) – The usage of the dataset (train or validate).

Returns:

The csv string for the dataset.

Return type:

str

cellmap_segmentation_challenge.utils.datasplit.get_csv_string(path: str, classes: list[str], usage: str, raw_name: str = 'em/fibsem-uint8', search_path: str = '/opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/data/{dataset}/{dataset}.zarr/recon-1/{name}')[source]#

Get the csv string for a given dataset path, to be written to the datasplit csv file.

Parameters:
  • path (str) – The path to the dataset.

  • classes (list[str]) – The classes present in the dataset.

  • usage (str) – The usage of the dataset (train or validate).

  • raw_name (str, optional) – The name of the raw data. Default is RAW_NAME.

  • search_path (str, optional) – The search path to use to find the datasets. Default is SEARCH_PATH.

Returns:

The csv string for the dataset.

Return type:

str

cellmap_segmentation_challenge.utils.datasplit.make_s3_datasplit_csv(classes: list[str] = ['nuc', 'mito'], force_all_classes: bool | str = False, validation_prob: float = 0.1, datasets: list[str] = ['*'], crops: list[str] = ['*'], csv_path: str = 'datasplit.csv', dry_run: bool = False, **kwargs)[source]#

Make a datasplit csv file for the given classes and datasets.

Parameters:
  • classes (list[str], optional) – The classes to include in the csv, by default [“nuc”, “mito”]

  • force_all_classes (bool | str, optional) – If True, force all classes to be present in the training/validation datasets. If False, as long as at least one requested class is present, a crop will be included. If “train” or “validate”, force all classes to be present in the training or validation datasets, respectively. By default False.

  • validation_prob (float, optional) – The probability of a dataset being in the validation set, by default 0.1

  • datasets (list[str], optional) – The datasets to include in the csv, by default [“*”], which includes all datasets

  • crops (list[str], optional) – The crops to include in the csv, by default all crops are included. Otherwise, only the crops in the list are included.

  • csv_path (str, optional) – The path to write the csv file to, by default “datasplit.csv”

  • dry_run (bool, optional) – If True, do not write the csv file - just return the found datapaths. By default False

  • **kwargs (dict) – Additional keyword arguments will be unused. Kept for compatibility with make_datasplit_csv.

cellmap_segmentation_challenge.utils.datasplit.make_datasplit_csv(classes: list[str] = ['nuc', 'mito'], force_all_classes: bool | str = False, validation_prob: float = 0.1, datasets: list[str] = ['*'], crops: list[str] = ['*'], search_path: str = '/opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/data/{dataset}/{dataset}.zarr/recon-1/{name}', raw_name: str = 'em/fibsem-uint8', crop_name: str = 'labels/groundtruth/{crop}/{label}', csv_path: str = 'datasplit.csv', dry_run: bool = False)[source]#

Make a datasplit csv file for the given classes and datasets.

Parameters:
  • classes (list[str], optional) – The classes to include in the csv, by default [“nuc”, “mito”]

  • force_all_classes (bool | str, optional) – If True, force all classes to be present in the training/validation datasets. If False, as long as at least one requested class is present, a crop will be included. If “train” or “validate”, force all classes to be present in the training or validation datasets, respectively. By default False.

  • validation_prob (float, optional) – The probability of a dataset being in the validation set, by default 0.1

  • datasets (list[str], optional) – The datasets to include in the csv, by default [“*”], which includes all datasets

  • crops (list[str], optional) – The crops to include in the csv, by default all crops are included. Otherwise, only the crops in the list are included.

  • search_path (str, optional) – The search path to use to find the datasets, by default SEARCH_PATH

  • raw_name (str, optional) – The name of the raw data, by default RAW_NAME

  • crop_name (str, optional) – The name of the crop, by default CROP_NAME

  • csv_path (str, optional) – The path to write the csv file to, by default “datasplit.csv”

  • dry_run (bool, optional) – If True, do not write the csv file - just return the found datapaths. By default False
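
A minimal usage sketch; the class names and output path are illustrative.

    from cellmap_segmentation_challenge.utils.datasplit import make_datasplit_csv

    # Write a train/validate split over all local datasets containing the
    # requested classes; each crop lands in the validation set with p = 0.1.
    make_datasplit_csv(
        classes=["nuc", "mito"],
        validation_prob=0.1,
        csv_path="datasplit.csv",
    )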

cellmap_segmentation_challenge.utils.datasplit.get_dataset_counts(classes: list[str] = ['nuc', 'mito'], search_path: str = '/opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/data/{dataset}/{dataset}.zarr/recon-1/{name}', raw_name: str = 'em/fibsem-uint8', crop_name: str = 'labels/groundtruth/{crop}/{label}')[source]#

Get the counts of each class in each dataset.

Parameters:
  • classes (list[str], optional) – The classes to include in the csv, by default [“nuc”, “mito”]

  • search_path (str, optional) – The search path to use to find the datasets, by default SEARCH_PATH

  • raw_name (str, optional) – The name of the raw data, by default RAW_NAME

  • crop_name (str, optional) – The name of the crop, by default CROP_NAME

Returns:

A dictionary of the counts of each class in each dataset.

Return type:

dict
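
A short sketch; the nested mapping layout assumed in the comment is not guaranteed by the docstring, which only promises a dict of per-dataset class counts.

    from cellmap_segmentation_challenge.utils.datasplit import get_dataset_counts

    counts = get_dataset_counts(classes=["nuc", "mito"])
    # Assumed layout: {dataset_name: {class_name: count, ...}, ...}
    for dataset, class_counts in counts.items():
        print(dataset, class_counts)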

cellmap_segmentation_challenge.utils.datasplit.get_tested_classes(csv_path: str = '/opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/site-packages/cellmap_segmentation_challenge/utils/tested_classes.csv')[source]#

Get the classes that will be tested for the challenge.

Parameters:

csv_path (str, optional) – The path to the csv file, by default “tested_classes.csv”

Returns:

A list of the classes that have been tested.

Return type:

list[str]
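
A one-line usage sketch, reading the packaged tested_classes.csv by default.

    from cellmap_segmentation_challenge.utils.datasplit import get_tested_classes

    tested_classes = get_tested_classes()
    print(tested_classes)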

cellmap_segmentation_challenge.utils.datasplit.get_class_incl_ids(incl_ids_string)[source]#
cellmap_segmentation_challenge.utils.datasplit.get_class_relations(csv_path: str = '/opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/site-packages/cellmap_segmentation_challenge/utils/classes.csv', named_classes: list[str] | None = None)[source]#
Parameters:
  • csv_path (str)

  • named_classes (list[str] | None)

cellmap_segmentation_challenge.utils.fetch_data module#

cellmap_segmentation_challenge.utils.fetch_data.copy_store(*, keys: Iterable[str], source_store: Store, dest_store: Store)[source]#

Iterate over the keys, copying them from the source store to the dest store

Parameters:
  • keys (Iterable[str])

  • source_store (Store)

  • dest_store (Store)
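
A sketch of serial and batched store copies, assuming zarr-python v2 store classes; the store paths and batch size are illustrative.

    from concurrent.futures import ThreadPoolExecutor

    import zarr
    from cellmap_segmentation_challenge.utils.fetch_data import (
        copy_store,
        partition_copy_store,
    )

    source = zarr.DirectoryStore("source.zarr")  # assumes zarr v2 stores
    dest = zarr.DirectoryStore("copy.zarr")
    keys = list(source.keys())

    # Copy every key serially ...
    copy_store(keys=keys, source_store=source, dest_store=dest)

    # ... or in parallel batches of 256 keys.
    with ThreadPoolExecutor(max_workers=8) as pool:
        partition_copy_store(
            keys=keys,
            source_store=source,
            dest_store=dest,
            batch_size=256,
            pool=pool,
        )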

cellmap_segmentation_challenge.utils.fetch_data.partition_copy_store(*, keys, source_store, dest_store, batch_size, pool: ThreadPoolExecutor)[source]#
Parameters:

pool (ThreadPoolExecutor)

cellmap_segmentation_challenge.utils.fetch_data.get_store_url(store: BaseStore, path: str)[source]#
Parameters:
  • store (BaseStore)

  • path (str)

cellmap_segmentation_challenge.utils.fetch_data.get_chunk_keys(array: Array, region: tuple[slice, ...] = ()) Generator[str, None, None][source]#

Get the keys for all the chunks in a Zarr array as a generator of strings. Returns keys relative to the path of the array.

Copied, with modifications, from janelia-cellmap/fibsem-tools.

Parameters:
  • array (zarr.core.Array) – The zarr array to get the chunk keys from

  • region (tuple[slice, ...]) – The region in the zarr array to get chunk keys from. Defaults to (), which returns all the chunk keys.

Return type:

Generator[str, None, None]
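
A sketch using an in-memory array. The key format in the comment is the standard zarr v2 layout, and restricting region to a slice along only the first axis is assumed to be allowed, mirroring the empty-tuple default.

    import zarr
    from cellmap_segmentation_challenge.utils.fetch_data import get_chunk_keys

    # A toy 3D array with 2 x 2 x 2 = 8 chunks.
    arr = zarr.zeros((100, 100, 100), chunks=(50, 50, 50))

    # All chunk keys relative to the array path, e.g. "0.0.0", "0.0.1", ...
    print(list(get_chunk_keys(arr)))

    # Keys restricted to a region of interest along the first axis.
    print(list(get_chunk_keys(arr, region=(slice(0, 50),))))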

cellmap_segmentation_challenge.utils.fetch_data.read_group(path: str, **kwargs) Group[source]#
Parameters:

path (str)

Return type:

Group

cellmap_segmentation_challenge.utils.fetch_data.subset_to_slice(outer_array, inner_array) tuple[slice, ...][source]#
Return type:

tuple[slice, …]

cellmap_segmentation_challenge.utils.fetch_data.resolve_em_url(em_source_root: URL, em_source_paths: list[str])[source]#
Parameters:
  • em_source_root (URL)

  • em_source_paths (list[str])

cellmap_segmentation_challenge.utils.fetch_data.parse_s3_url(s3_url: str) tuple[str, str][source]#
Parameters:

s3_url (str)

Return type:

tuple[str, str]
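
A sketch; based on the signature alone, parse_s3_url returns two strings, presumably the bucket and the key. The URL below is illustrative.

    from cellmap_segmentation_challenge.utils.fetch_data import parse_s3_url

    bucket, key = parse_s3_url("s3://my-bucket/path/to/data.zarr")
    print(bucket)  # presumed: "my-bucket"
    print(key)     # presumed: "path/to/data.zarr"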

cellmap_segmentation_challenge.utils.fetch_data.download_file_with_progress(s3_url, local_filename)[source]#
cellmap_segmentation_challenge.utils.fetch_data.get_zip_if_available(crops, raw_padding, fetch_all_em_resolutions, zips_from_manifest)[source]#

cellmap_segmentation_challenge.utils.loss module#

class cellmap_segmentation_challenge.utils.loss.CellMapLossWrapper(loss_fn: _Loss | _WeightedLoss, **kwargs)[source]#

Bases: _Loss

A wrapper around a PyTorch loss function (_Loss or _WeightedLoss) for use with CellMap outputs and targets.

Parameters:

loss_fn (_Loss | _WeightedLoss)

forward(outputs: Tensor, target: Tensor)[source]#

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Parameters:
  • outputs (Tensor)

  • target (Tensor)
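
A sketch of wrapping a standard PyTorch loss. The signature annotates loss_fn as a loss instance yet also accepts **kwargs, so whether to pass the loss class (with kwargs forwarded) or an instance is ambiguous; the sketch passes the class and notes the alternative. Shapes are illustrative.

    import torch
    from cellmap_segmentation_challenge.utils.loss import CellMapLossWrapper

    # Assumed usage: pass the loss class so **kwargs can be forwarded to it.
    # If your version expects an instance, pass torch.nn.BCEWithLogitsLoss().
    criterion = CellMapLossWrapper(torch.nn.BCEWithLogitsLoss)

    outputs = torch.randn(2, 2, 64, 64)                   # illustrative logits
    target = torch.randint(0, 2, (2, 2, 64, 64)).float()  # illustrative labels
    loss = criterion(outputs, target)
    print(loss)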

cellmap_segmentation_challenge.utils.security module#

cellmap_segmentation_challenge.utils.security.analyze_script(filepath)[source]#

Statically analyzes the script at filepath using Python's ast module, looking for potentially unsafe imports and function calls. Returns a boolean indicating whether the script is safe, and a list of detected issues.

cellmap_segmentation_challenge.utils.security.load_safe_config(config_path, force_safe=True)[source]#

Loads the configuration script at config_path after verifying its safety. If force_safe is True, raises an error if the script is deemed unsafe.
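
A usage sketch; "train_config.py" is an illustrative path, and the (is_safe, issues) unpacking follows the analyze_script docstring above.

    from cellmap_segmentation_challenge.utils.security import (
        analyze_script,
        load_safe_config,
    )

    is_safe, issues = analyze_script("train_config.py")
    if not is_safe:
        print("Potentially unsafe constructs:", issues)

    # Loads the config only if it passes the safety check (raises otherwise).
    config = load_safe_config("train_config.py", force_safe=True)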

class cellmap_segmentation_challenge.utils.security.Config(**kwargs)[source]#

Bases: object

Module contents#

class cellmap_segmentation_challenge.utils.CellMapLossWrapper(loss_fn: _Loss | _WeightedLoss, **kwargs)[source]#

Bases: _Loss

A wrapper around a PyTorch loss function (_Loss or _WeightedLoss) for use with CellMap outputs and targets.

Parameters:

loss_fn (_Loss | _WeightedLoss)

forward(outputs: Tensor, target: Tensor)[source]#

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Parameters:
  • outputs (Tensor)

  • target (Tensor)

class cellmap_segmentation_challenge.utils.TestCropRow(id: int, dataset: str, class_label: str, voxel_size: tuple[float, ...], translation: tuple[float, ...], shape: tuple[int, ...])[source]#

Bases: object

A dataclass representing a row in the test crop manifest file.

Parameters:
  • id (int)

  • dataset (str)

  • class_label (str)

  • voxel_size (tuple[float, ...])

  • translation (tuple[float, ...])

  • shape (tuple[int, ...])

id: int#
dataset: str#
class_label: str#
voxel_size: tuple[float, ...]#
translation: tuple[float, ...]#
shape: tuple[int, ...]#
classmethod from_csv_row(row: str) Self[source]#

Create a TestCropRow object from a CSV row.

Parameters:

row (str)

Return type:

Self

cellmap_segmentation_challenge.utils.analyze_script(filepath)[source]#

Statically analyzes the script at filepath using Python's ast module, looking for potentially unsafe imports and function calls. Returns a boolean indicating whether the script is safe, and a list of detected issues.

cellmap_segmentation_challenge.utils.fetch_crop_manifest(url: str | URL = 'https://raw.githubusercontent.com/janelia-cellmap/cellmap-segmentation-challenge/refs/heads/main/src/cellmap_segmentation_challenge/utils/manifest.csv') tuple[CropRow, ...][source]#

Fetch a manifest file from a URL and return a tuple of CropRow objects.

Parameters:

url (str or yarl.URL) – The URL to the manifest file.

Returns:

A tuple of CropRow objects.

Return type:

tuple[CropRow, …]

cellmap_segmentation_challenge.utils.fetch_test_crop_manifest(url: str | URL = 'https://raw.githubusercontent.com/janelia-cellmap/cellmap-segmentation-challenge/refs/heads/main/src/cellmap_segmentation_challenge/utils/test_crop_manifest.csv') tuple[TestCropRow, ...][source]#

Fetch a test manifest file from a URL and return a tuple of TestCropRow objects.

Parameters:

url (str or yarl.URL) – The URL to the manifest file.

Returns:

A tuple of TestCropRow objects.

Return type:

tuple[TestCropRow, …]

cellmap_segmentation_challenge.utils.get_class_relations(csv_path: str = '/opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/site-packages/cellmap_segmentation_challenge/utils/classes.csv', named_classes: list[str] | None = None)[source]#
Parameters:
  • csv_path (str)

  • named_classes (list[str] | None)

cellmap_segmentation_challenge.utils.get_dataloader(datasplit_path: str, classes: Sequence[str], batch_size: int, array_info: Mapping[str, Sequence[int | float]] | None = None, input_array_info: Mapping[str, Sequence[int | float]] | None = None, target_array_info: Mapping[str, Sequence[int | float]] | None = None, spatial_transforms: Mapping[str, Any] | None = None, iterations_per_epoch: int = 1000, random_validation: bool = False, device: str | device | None = None) tuple[CellMapDataLoader, CellMapDataLoader][source]#

Get the train and validation dataloaders.

This function builds the train and validation dataloaders for the given datasplit file, classes, batch size, array info, spatial transforms, iterations per epoch, and device.

Parameters:
  • datasplit_path (str) – Path to the datasplit file that defines the train/val split the dataloader should use.

  • classes (Sequence[str]) – List of classes to segment.

  • batch_size (int) – Batch size for the dataloader.

  • array_info (Optional[Mapping[str, Sequence[int | float]]]) – Dictionary containing the shape and scale of the data to load for the input and target. Either array_info or input_array_info & target_array_info must be provided.

  • input_array_info (Optional[Mapping[str, Sequence[int | float]]]) – Dictionary containing the shape and scale of the data to load for the input.

  • target_array_info (Optional[Mapping[str, Sequence[int | float]]]) – Dictionary containing the shape and scale of the data to load for the target.

  • spatial_transforms (Optional[Mapping[str, Any]]) – Dictionary containing the spatial transformations to apply to the data, such as mirror, transpose, and rotate. For example:

        spatial_transforms = {
            # 3D
            # Probability of mirroring along each axis,
            # from 0 (never mirror) to 1 (always mirror).
            "mirror": {"axes": {"x": 0.5, "y": 0.5, "z": 0.5}},
            # Axes that may be involved in a transposition.
            "transpose": {"axes": ["x", "y", "z"]},
            # Rotation range per axis; the angle is chosen uniformly
            # at random within the given range.
            "rotate": {"axes": {"x": [-180, 180], "y": [-180, 180], "z": [-180, 180]}},
            # 2D variants (used when there is no z axis):
            # "mirror": {"axes": {"x": 0.5, "y": 0.5}},
            # "transpose": {"axes": ["x", "y"]},
            # "rotate": {"axes": {"x": [-180, 180], "y": [-180, 180]}},
        }

  • iterations_per_epoch (int) – Number of iterations per epoch.

  • random_validation (bool) – Whether to randomize the validation data draws. Useful when not evaluating the entire validation set every time. Defaults to False.

  • device (str | torch.device | None) – Device to use for training. If None, defaults to "cuda" if available, then "mps" if available, then "cpu".

Returns:

Tuple containing the train and validation dataloaders.

Return type:

tuple[CellMapDataLoader, CellMapDataLoader]

cellmap_segmentation_challenge.utils.get_test_crops() tuple[CropRow, ...][source]#
Return type:

tuple[CropRow, …]

cellmap_segmentation_challenge.utils.get_tested_classes(csv_path: str = '/opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/site-packages/cellmap_segmentation_challenge/utils/tested_classes.csv')[source]#

Get the classes that will be tested for the challenge.

Parameters:

csv_path (str, optional) – The path to the csv file, by default “tested_classes.csv”

Returns:

A list of the classes that have been tested.

Return type:

list[str]

cellmap_segmentation_challenge.utils.load_safe_config(config_path, force_safe=True)[source]#

Loads the configuration script at config_path after verifying its safety. If force_safe is True, raises an error if the script is deemed unsafe.

cellmap_segmentation_challenge.utils.make_datasplit_csv(classes: list[str] = ['nuc', 'mito'], force_all_classes: bool | str = False, validation_prob: float = 0.1, datasets: list[str] = ['*'], crops: list[str] = ['*'], search_path: str = '/opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/data/{dataset}/{dataset}.zarr/recon-1/{name}', raw_name: str = 'em/fibsem-uint8', crop_name: str = 'labels/groundtruth/{crop}/{label}', csv_path: str = 'datasplit.csv', dry_run: bool = False)[source]#

Make a datasplit csv file for the given classes and datasets.

Parameters:
  • classes (list[str], optional) – The classes to include in the csv, by default [“nuc”, “mito”]

  • force_all_classes (bool | str, optional) – If True, force all classes to be present in the training/validation datasets. If False, as long as at least one requested class is present, a crop will be included. If “train” or “validate”, force all classes to be present in the training or validation datasets, respectively. By default False.

  • validation_prob (float, optional) – The probability of a dataset being in the validation set, by default 0.1

  • datasets (list[str], optional) – The datasets to include in the csv, by default [“*”], which includes all datasets

  • crops (list[str], optional) – The crops to include in the csv, by default all crops are included. Otherwise, only the crops in the list are included.

  • search_path (str, optional) – The search path to use to find the datasets, by default SEARCH_PATH

  • raw_name (str, optional) – The name of the raw data, by default RAW_NAME

  • crop_name (str, optional) – The name of the crop, by default CROP_NAME

  • csv_path (str, optional) – The path to write the csv file to, by default “datasplit.csv”

  • dry_run (bool, optional) – If True, do not write the csv file - just return the found datapaths. By default False

cellmap_segmentation_challenge.utils.make_s3_datasplit_csv(classes: list[str] = ['nuc', 'mito'], force_all_classes: bool | str = False, validation_prob: float = 0.1, datasets: list[str] = ['*'], crops: list[str] = ['*'], csv_path: str = 'datasplit.csv', dry_run: bool = False, **kwargs)[source]#

Make a datasplit csv file for the given classes and datasets.

Parameters:
  • classes (list[str], optional) – The classes to include in the csv, by default [“nuc”, “mito”]

  • force_all_classes (bool | str, optional) – If True, force all classes to be present in the training/validation datasets. If False, as long as at least one requested class is present, a crop will be included. If “train” or “validate”, force all classes to be present in the training or validation datasets, respectively. By default False.

  • validation_prob (float, optional) – The probability of a dataset being in the validation set, by default 0.1

  • datasets (list[str], optional) – The datasets to include in the csv, by default [“*”], which includes all datasets

  • crops (list[str], optional) – The crops to include in the csv, by default all crops are included. Otherwise, only the crops in the list are included.

  • csv_path (str, optional) – The path to write the csv file to, by default “datasplit.csv”

  • dry_run (bool, optional) – If True, do not write the csv file - just return the found datapaths. By default False

  • **kwargs (dict) – Additional keyword arguments will be unused. Kept for compatibility with make_datasplit_csv.