cellmap_segmentation_challenge.utils.datasplit

Functions

`get_csv_string`(path, classes, usage[, raw_name])	Get the csv string for a given dataset path, to be written to the datasplit csv file.
`get_dataset_counts`([classes, search_path, ...])	Get the counts of each class in each dataset.
`get_dataset_name`(raw_path[, search_path, ...])	Get the name of the dataset from the raw path.
`get_raw_path`(crop_path[, raw_name, label])	Get the path to the raw data for a given crop path.
`make_datasplit_csv`([classes, ...])	Make a datasplit csv file for the given classes and datasets.

cellmap_segmentation_challenge.utils.datasplit.get_dataset_name(raw_path: str, search_path: str = '/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/data/{dataset}/{dataset}.zarr/recon-1/{name}', raw_name: str = 'em/fibsem-uint8') → str

Get the name of the dataset from the raw path.

Parameters:

raw_path (str)
search_path (str)
raw_name (str)

Return type:

str
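
The default search_path shown in the signature is a template with {dataset} and {name} placeholders. As a rough illustration of how a dataset name could be recovered from a raw path under such a template, the sketch below turns the template into a regular expression. The helper name dataset_name_from_path and the /data root are hypothetical, and this is not necessarily how get_dataset_name is implemented.

```python
import re


def dataset_name_from_path(raw_path: str, search_path: str, raw_name: str) -> str:
    """Hypothetical helper: match raw_path against a search_path template."""
    # Fill in the {name} placeholder with the raw data name.
    template = search_path.replace("{name}", raw_name)
    # Escape the literal pieces around each {dataset} placeholder.
    parts = [re.escape(part) for part in template.split("{dataset}")]
    if len(parts) < 2:
        raise ValueError("search_path has no {dataset} placeholder")
    # The first placeholder captures the name; later ones must repeat it.
    pattern = parts[0] + r"(?P<dataset>[^/]+)" + parts[1]
    for part in parts[2:]:
        pattern += r"(?P=dataset)" + part
    match = re.fullmatch(pattern, raw_path)
    if match is None:
        raise ValueError(f"{raw_path!r} does not match {search_path!r}")
    return match.group("dataset")


# Example with an assumed data root of /data.
name = dataset_name_from_path(
    "/data/jrc_hela-2/jrc_hela-2.zarr/recon-1/em/fibsem-uint8",
    "/data/{dataset}/{dataset}.zarr/recon-1/{name}",
    "em/fibsem-uint8",
)
print(name)
```

Using a backreference for the second {dataset} occurrence makes the sketch reject paths where the directory name and the .zarr name disagree.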

cellmap_segmentation_challenge.utils.datasplit.get_raw_path(crop_path: str, raw_name: str = 'em/fibsem-uint8', label: str = '') → str

Get the path to the raw data for a given crop path.

Parameters:

crop_path (str) – The path to the crop.
raw_name (str, optional) – The name of the raw data, by default RAW_NAME
label (str, optional) – The label class at the crop_path, by default “”

Returns:

The path to the raw data.

Return type:

str
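
To make the relationship between a crop path and its raw data concrete, here is a minimal sketch that assumes the layout implied by the defaults on this page: crops live under labels/groundtruth/{crop}/{label} and raw data under em/fibsem-uint8, both relative to the same recon-1 group. The helper raw_path_from_crop and the example paths are hypothetical, and the real get_raw_path may resolve the path differently.

```python
from pathlib import PurePosixPath


def raw_path_from_crop(
    crop_path: str, raw_name: str = "em/fibsem-uint8", label: str = ""
) -> str:
    """Hypothetical helper: derive the raw data path from a crop path."""
    path = PurePosixPath(crop_path)
    # If the path still ends with the label class, drop it first.
    if label and path.name == label:
        path = path.parent
    # Assumed layout: <root>/labels/groundtruth/<crop>, so go up three levels.
    root = path.parents[2]
    return str(root / raw_name)


# Example with an assumed data root of /data.
crop = "/data/jrc_hela-2/jrc_hela-2.zarr/recon-1/labels/groundtruth/crop1/mito"
print(raw_path_from_crop(crop, label="mito"))
```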

cellmap_segmentation_challenge.utils.datasplit.get_csv_string(path: str, classes: list[str], usage: str, raw_name: str = 'em/fibsem-uint8')

Get the csv string for a given dataset path, to be written to the datasplit csv file.

Parameters:

path (str) – The path to the dataset.
classes (list[str]) – The classes present in the dataset.
usage (str) – The usage of the dataset (train or validate).
raw_name (str, optional) – The name of the raw data. Default is RAW_NAME.

Returns:

The csv string for the dataset.

Return type:

str

cellmap_segmentation_challenge.utils.datasplit.make_datasplit_csv(classes: list[str] = ['nuc', 'mito'], force_all_classes: bool | str = False, validation_prob: float = 0.1, datasets: list[str] = ['*'], crops: list[str] = ['*'], search_path: str = '/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/data/{dataset}/{dataset}.zarr/recon-1/{name}', raw_name: str = 'em/fibsem-uint8', crop_name: str = 'labels/groundtruth/{crop}/{label}', csv_path: str = 'datasplit.csv', dry_run: bool = False)

Make a datasplit csv file for the given classes and datasets.

Parameters:

classes (list[str], optional) – The classes to include in the csv, by default [“nuc”, “mito”]
force_all_classes (bool | str, optional) – If True, force all classes to be present in the training/validation datasets. If False, as long as at least one requested class is present, a crop will be included. If “train” or “validate”, force all classes to be present in the training or validation datasets, respectively. By default False.
validation_prob (float, optional) – The probability of a dataset being in the validation set, by default 0.1
datasets (list[str], optional) – The datasets to include in the csv, by default [“*”], which includes all datasets
crops (list[str], optional) – The crops to include in the csv, by default all crops are included. Otherwise, only the crops in the list are included.
search_path (str, optional) – The search path to use to find the datasets, by default SEARCH_PATH
raw_name (str, optional) – The name of the raw data, by default RAW_NAME
crop_name (str, optional) – The name of the crop, by default CROP_NAME
csv_path (str, optional) – The path to write the csv file to, by default “datasplit.csv”
dry_run (bool, optional) – If True, do not write the csv file; only return the found datapaths. By default False
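
The interaction between force_all_classes and validation_prob described above can be sketched in a few lines. The functions crop_is_included and assign_usage are hypothetical names used only for illustration; they follow the parameter descriptions on this page and are not taken from the library's source.

```python
import random


def crop_is_included(present, requested, force_all_classes, usage):
    """Hypothetical check mirroring the force_all_classes description."""
    present, requested = set(present), set(requested)
    # True applies to both splits; "train" or "validate" applies to one.
    must_have_all = force_all_classes is True or force_all_classes == usage
    if must_have_all:
        return requested <= present
    # Otherwise one requested class is enough.
    return bool(requested & present)


def assign_usage(validation_prob, rng):
    """Hypothetical split: 'validate' with probability validation_prob."""
    return "validate" if rng.random() < validation_prob else "train"


rng = random.Random(0)  # seeded so the split is reproducible
usage = assign_usage(0.1, rng)
print(usage, crop_is_included(["nuc"], ["nuc", "mito"], "validate", usage))
```

Because the split is random, passing a seeded random.Random (as assumed here) is one way to keep a generated split reproducible between runs.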

cellmap_segmentation_challenge.utils.datasplit.get_dataset_counts(classes: list[str] = ['nuc', 'mito'], search_path: str = '/opt/hostedtoolcache/Python/3.12.7/x64/lib/python3.12/data/{dataset}/{dataset}.zarr/recon-1/{name}', raw_name: str = 'em/fibsem-uint8', crop_name: str = 'labels/groundtruth/{crop}/{label}')

Get the counts of each class in each dataset.

Parameters:

classes (list[str], optional) – The classes to count, by default [“nuc”, “mito”]
search_path (str, optional) – The search path to use to find the datasets, by default SEARCH_PATH
raw_name (str, optional) – The name of the raw data, by default RAW_NAME
crop_name (str, optional) – The name of the crop, by default CROP_NAME

Returns:

A dictionary of the counts of each class in each dataset.

Return type:

dict
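
The exact shape of the returned dictionary is not spelled out here. As a sketch only, the example below assumes a nested mapping of dataset name to per-class counts, built from a list of (dataset, crop, label) tuples. The function count_classes and the sample data are hypothetical.

```python
from collections import defaultdict


def count_classes(found, classes):
    """Hypothetical tally: crops per requested class, grouped by dataset."""
    counts = defaultdict(lambda: dict.fromkeys(classes, 0))
    for dataset, _crop, label in found:
        # Ignore labels that were not requested.
        if label in classes:
            counts[dataset][label] += 1
    return dict(counts)


# Made-up (dataset, crop, label) entries for illustration.
found = [
    ("jrc_hela-2", "crop1", "nuc"),
    ("jrc_hela-2", "crop2", "nuc"),
    ("jrc_hela-2", "crop1", "mito"),
    ("jrc_cos7-1a", "crop3", "er"),
]
print(count_classes(found, ["nuc", "mito"]))
```

In this sketch a dataset with no requested classes is omitted from the result instead of being listed with zero counts.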
