annotation hierarchy

The hierarchy described in this module exists to facilitate training machine learning models with manually curated contiguous subsets of raw data, called "crops". Conventionally, crops are annotated densely, resulting in images where each sample of the image has been given a semantic label. Crops may contain many separate label values. This "dense" representation is convenient when generating annotations, but the process of training machine learning models sometimes benefits from a more sparse representation, e.g. one where the values for each semantic class are stored in separate arrays.

This module defines a convention for representing a dense crop as a collection of multiscale images. Each multiscale image should comply with version 0.4 of the OME-NGFF specification.
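The difference between the dense representation and the per-class representation described above can be sketched in plain Python. This is only an illustration: the helper `split_dense` and the class IDs 1 and 2 are hypothetical, and a real pipeline would likely operate on array-library data rather than nested lists.

```python
def split_dense(dense, class_ids):
    """Return one binary mask per class ID from a dense 2D label image."""
    return {
        cid: [[1 if value == cid else 0 for value in row] for row in dense]
        for cid in class_ids
    }


# A tiny dense crop: every sample carries exactly one semantic label.
# The label values 1 and 2 are arbitrary example class IDs.
dense_crop = [
    [1, 1, 2],
    [1, 2, 2],
]

masks = split_dense(dense_crop, class_ids=[1, 2])
print(masks[1])  # [[1, 1, 0], [1, 0, 0]]
print(masks[2])  # [[0, 0, 1], [0, 1, 1]]
```

Each mask could then be stored as its own array, which is the layout this module's convention is built around.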

CellmapWrapper

Bases: BaseModel, Generic[T]

A generic pydantic model that wraps the type T under the namespace "cellmap".

Attributes:

Name Type Description
cellmap T

A value of type T, accessed via the attribute cellmap.

Examples:

from pydantic import BaseModel

class Foo(BaseModel):
    bar: int

print(CellmapWrapper[Foo](cellmap={'bar': 10}).model_dump())
# {'cellmap': {'bar': 10}}

AnnotationWrapper

Bases: BaseModel, Generic[T]

A generic pydantic model that wraps the type T under the namespace "annotation".

Attributes:

Name Type Description
annotation T

A value of type T, accessed via the attribute annotation.

Examples:

from pydantic import BaseModel

class Foo(BaseModel):
    bar: int

print(AnnotationWrapper[Foo](annotation={'bar': 10}).model_dump())
# {'annotation': {'bar': 10}}

SemanticSegmentation

Bases: BaseModel

Metadata for a semantic segmentation, i.e. a segmentation where unique numerical values represent separate semantic classes.

Attributes:

Name Type Description
type Literal['semantic_segmentation']

Must be the string 'semantic_segmentation'.

encoding Dict[Union[Possibility, Literal['present']], int]

This dict represents the mapping from possibilities to numeric values. The keys must be strings in the set {'unknown', 'absent', 'present'}, and the values must be numeric values contained in the array described by this metadata.

For example, if an annotator produces an array where 0 represents unknown and 1 represents the presence of class X, then encoding would take the value {'unknown': 0, 'present': 1}.
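The two constraints on encoding (allowed keys, and values that actually occur in the array) can be checked with a few lines of plain Python. This is a sketch, not part of this module's API: `check_semantic_encoding` is a hypothetical helper, and the array is represented as a flat list for simplicity.

```python
ALLOWED_KEYS = {"unknown", "absent", "present"}


def check_semantic_encoding(encoding, array_values):
    """Return a list of problems found in a semantic-segmentation encoding."""
    problems = []
    for key, value in encoding.items():
        if key not in ALLOWED_KEYS:
            problems.append(f"unexpected key: {key!r}")
        if value not in array_values:
            problems.append(f"value {value} for {key!r} is not in the array")
    return problems


# Flattened example array: 0 means unknown, 1 means class X is present.
array_values = [0, 0, 1, 1, 0]
encoding = {"unknown": 0, "present": 1}

print(check_semantic_encoding(encoding, array_values))  # []
print(check_semantic_encoding({"present": 2}, array_values))
# ["value 2 for 'present' is not in the array"]
```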

InstanceSegmentation

Bases: BaseModel

Metadata for instance segmentation, i.e. a segmentation where unique numerical values represent distinct occurrences of the same semantic class.

Attributes:

Name Type Description
type Literal['instance_segmentation']

Must be the string 'instance_segmentation'.

encoding Dict[Possibility, int]

This dict represents the mapping from possibilities to numeric values. The keys must be strings from the set {'unknown', 'absent'}, and the values must be numeric values contained in the array described by this metadata.

For example, if an annotator produces an array where 0 represents 'unknown' and the values 1...N represent instances of some class, then encoding would take the value {'unknown': 0}. The meaning of the non-zero values (i.e., that they represent distinct instances of a class) can be inferred from the fact that this is instance segmentation, and thus these values do not appear as keys in encoding.
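The inference described above, where every array value not listed in encoding is treated as an instance label, might look like the following. The helper `instance_ids` is hypothetical and the array is a flat list of example values.

```python
def instance_ids(array_values, encoding):
    """Return the sorted array values that are not reserved by the encoding."""
    reserved = set(encoding.values())
    return sorted(set(array_values) - reserved)


# 0 is reserved for 'unknown'; 1, 2 and 3 label three separate instances.
array_values = [0, 0, 1, 1, 2, 3, 3, 0]
encoding = {"unknown": 0}

print(instance_ids(array_values, encoding))  # [1, 2, 3]
```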

AnnotationArrayAttrs

Bases: BaseModel, Generic[TName]

The metadata for an array of annotated values.

Attributes:

Name Type Description
class_name str

The name of the semantic class annotated in this array.

complement_counts Optional[Dict[Possibility, int]]

The frequency of 'absent' and / or 'unknown' values in the array data. The total number of elements in the array that represent "positive" examples can be calculated from these counts: take the number of elements in the array minus the sum of the values in this partial histogram.
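As a worked example of that subtraction, the sketch below builds the partial histogram from a flat list of values and then derives the number of positive elements. Both helpers are hypothetical, and the sketch assumes the histogram keys are the 'unknown' and 'absent' possibilities used as encoding keys elsewhere in this module.

```python
from collections import Counter

# Assumed complement keys: the possibilities that are not positive examples.
COMPLEMENT_KEYS = ("unknown", "absent")


def complement_counts(array_values, encoding):
    """Count how often the 'unknown' and 'absent' codes occur in the data."""
    counts = Counter(array_values)
    return {
        key: counts[encoding[key]]
        for key in COMPLEMENT_KEYS
        if key in encoding
    }


def count_positive(array_size, counts):
    """Number of elements minus the sum of the partial histogram."""
    return array_size - sum(counts.values())


array_values = [0, 0, 1, 1, 1, 2]
encoding = {"unknown": 0, "absent": 1, "present": 2}

counts = complement_counts(array_values, encoding)
print(counts)  # {'unknown': 2, 'absent': 3}
print(count_positive(len(array_values), counts))  # 1
```

Storing only the complement counts keeps the metadata small while still letting a consumer estimate how much usable signal an array contains without reading its data.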

annotation_type SemanticSegmentation | InstanceSegmentation

The type of the annotation. Must be either an instance of SemanticSegmentation or an instance of InstanceSegmentation.

AnnotationGroupAttrs

Bases: BaseModel, Generic[TName]

The metadata for an individual annotated semantic class. In a storage hierarchy like Zarr or HDF5, this metadata is attached to a group-like container that holds the arrays storing the annotation data as a multiscale representation.

Attributes:

Name Type Description
class_name str

The name of the semantic class annotated by the data in this group.

annotation_type AnnotationType

The type of annotation represented by the data in this group.

CropGroupAttrs

Bases: BaseModel, Generic[TName]

The metadata for all annotations in a Zarr group representing a single crop.

Attributes:

Name Type Description
version str

The version of this collection of metadata. Must be the string '0.1.0'.

name Optional[str]

The name of the crop. Optional.

description Optional[str]

A description of the crop. Optional.

created_by list[str]

The people or entities responsible for creating the annotations in the crop. If unknown, use an empty list.

created_with list[str]

The tool(s) used to create the annotations in the crop. If unknown, use an empty list.

start_date Optional[date]

The calendar date when the crop was started. Optional.

end_date Optional[date]

The calendar date when the crop was completed. Optional.

duration_days Optional[int]

The number of days spent annotating the crop. Optional.

protocol_uri Optional[str]

A URI pointing to a description of the annotation protocol used to produce the annotations. Optional.

class_names list[str]

The names of the classes that are annotated in this crop. Each element from class_names should also be the name of a Zarr group stored under the Zarr group that contains this metadata.

AnnotationArray

Bases: ArraySpec

The specification of a Zarr array that contains data from an annotation, e.g. a semantic segmentation or an instance segmentation.

Attributes:

Name Type Description
attributes CellmapWrapper[AnnotationWrapper[AnnotationArrayAttrs]]

Metadata describing the annotated class, which is nested under two outer dicts that define the namespace of this metadata, i.e. {'cellmap': {'annotation': {...}}}. See AnnotationArrayAttrs for details of the wrapped metadata.
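The nesting can be illustrated with plain dicts and the standard library. This sketch does not use the models defined in this module: the helper `wrap`, the class name "mito", and all counts are made-up example values, while the field names follow the AnnotationArrayAttrs and SemanticSegmentation attributes listed on this page.

```python
import json


def wrap(payload):
    """Nest a metadata dict under the 'cellmap' and 'annotation' namespaces."""
    return {"cellmap": {"annotation": payload}}


# Example values only; "mito" is a hypothetical class name.
array_attrs = {
    "class_name": "mito",
    "complement_counts": {"unknown": 10, "absent": 70},
    "annotation_type": {
        "type": "semantic_segmentation",
        "encoding": {"unknown": 0, "absent": 1, "present": 2},
    },
}

attrs = wrap(array_attrs)
print(json.dumps(attrs, indent=2))
print(attrs["cellmap"]["annotation"]["class_name"])  # mito
```

Namespacing the metadata this way keeps it from colliding with other keys stored in the same attributes document.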

AnnotationGroup

Bases: GroupSpec

The specification of a multiscale group that contains a segmentation of a single class.

Attributes:

Name Type Description
attributes CellmapWrapper[AnnotationWrapper[AnnotationGroupAttrs]]

A dict describing the annotation, which is nested under two outer dicts that define the namespace of this metadata, i.e. {'cellmap': {'annotation': {...}}}. See AnnotationGroupAttrs for details of the wrapped metadata.

CropGroup

Bases: GroupSpec

The specification of a crop group. Conventionally, a crop is a subset of a larger imaging volume that has been annotated by a human annotator. A crop may contain multiple semantic classes, which might be annotated via semantic segmentation or instance segmentation.

Attributes:

Name Type Description
attributes CellmapWrapper[AnnotationWrapper[CropGroupAttrs]]

A dict describing all annotations contained within the group, which is nested under two outer dicts that define the namespace of this metadata, i.e. {'cellmap': {'annotation': {...}}}. See CropGroupAttrs for details of the structure of the metadata.

members Mapping[str, AnnotationGroup]

A dict with keys that are strings and values that are instances of AnnotationGroup.
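Because each entry in class_names should also name a group stored under the crop group, a consumer may want to compare the two before reading data. The following is a minimal sketch using plain dicts rather than the CropGroup model: `missing_class_groups` is a hypothetical helper, and the crop name, class names, and member names are example values.

```python
def missing_class_groups(crop_attrs, member_names):
    """Return class names that have no matching member group."""
    class_names = crop_attrs["cellmap"]["annotation"]["class_names"]
    return [name for name in class_names if name not in member_names]


# Example crop metadata; optional fields are left as None.
crop_attrs = {
    "cellmap": {
        "annotation": {
            "version": "0.1.0",
            "name": "crop1",
            "description": None,
            "created_by": [],
            "created_with": [],
            "start_date": None,
            "end_date": None,
            "duration_days": None,
            "protocol_uri": None,
            "class_names": ["mito", "er", "nucleus"],
        }
    }
}

# Names of the groups actually present under the crop group.
member_names = {"mito", "er"}

print(missing_class_groups(crop_attrs, member_names))  # ['nucleus']
```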