YAML Configuration
cellmap_flow_yaml lets you define and run multiple models from a single YAML file.
It is the recommended way to launch inference jobs, and the same YAML format is used by the blockwise processor (cellmap_flow_blockwise).
Usage
# Run inference
cellmap_flow_yaml config.yaml
# Validate without running
cellmap_flow_yaml config.yaml --validate-only
# List available model types
cellmap_flow_yaml --list-types
# Set log level
cellmap_flow_yaml config.yaml --log-level DEBUG
YAML Structure
A configuration file has the following top-level fields:
| Field | Required | Description |
|---|---|---|
| data_path | Yes | Path to the input dataset (zarr/n5). |
| charge_group | Yes | Project billing group. |
| queue | No | Job queue (default: …). |
| models | Yes | Dict or list of model entries (see below). |
| json_data | No | Input normalizers and postprocessors. |
| | No | Wrap raw data in neuroglancer (default: …). |
| output_path | No | Output zarr path (used by blockwise processing). |
| task_name | No | Task name (used by blockwise processing). |
| workers | No | Number of GPU workers (blockwise). |
| cpu_workers | No | Number of CPU workers (blockwise). |
| tmp_dir | No | Temporary directory for intermediate files. |
| bounding_boxes | No | List of bounding boxes to process (blockwise). |
| separate_bounding_boxes_zarrs | No | Write each bounding box to a separate zarr (blockwise). |
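As a quick illustration of the required fields, the sketch below checks a parsed configuration for them. Judging by the examples in this document, the three required fields are data_path, charge_group, and models. The dict stands in for the result of loading the YAML file, and missing_required is a hypothetical helper, not part of cellmap_flow; cellmap_flow_yaml --validate-only remains the authoritative check.

```python
# Hypothetical helper for illustration only; it does not reproduce the
# validation performed by `cellmap_flow_yaml --validate-only`.
REQUIRED_FIELDS = ("data_path", "charge_group", "models")


def missing_required(config):
    """Return the required top-level fields that are absent from config."""
    return [field for field in REQUIRED_FIELDS if field not in config]


# Stand-in for a parsed YAML file (values taken from the minimal example).
config = {
    "data_path": "/nrs/cellmap/data/my_dataset/my_dataset.zarr/recon-1/em/fibsem-uint8",
    "charge_group": "cellmap",
    "queue": "gpu_h100",
    "models": {
        "my_model": {"type": "dacapo", "run_name": "my_run", "iteration": 50000}
    },
}

print(missing_required(config))              # []
print(missing_required({"queue": "gpu_h100"}))  # all three required fields
```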
Model Entries
Each model entry requires a type field and the parameters for that model type.
Use cellmap_flow_yaml --list-types to see all available types and their required parameters.
Models can be specified as a dict (keys become model names) or a list (each entry must include a name field).
Dict format (recommended):
models:
  my_mito_model:
    type: fly
    checkpoint: /path/to/checkpoint
    resolution: 16
    classes:
      - mito
  my_dacapo_model:
    type: dacapo
    run_name: my_run
    iteration: 100
List format:
models:
  - name: my_mito_model
    type: fly
    checkpoint: /path/to/checkpoint
    resolution: 16
    classes:
      - mito
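To make the equivalence of the two layouts concrete, here is a minimal Python sketch that converts either form into the same list of (name, parameters) pairs. The function normalize_models is a hypothetical name used for illustration and is not taken from cellmap_flow; the dicts and lists stand in for the parsed YAML.

```python
# Hypothetical helper, not part of cellmap_flow: converts the two documented
# layouts of the `models` section into one list of (name, params) pairs.
def normalize_models(models):
    if isinstance(models, dict):
        # Dict format: each key is the model name.
        entries = [(name, dict(params)) for name, params in models.items()]
    elif isinstance(models, list):
        # List format: each entry must carry its own `name` field.
        entries = []
        for entry in models:
            if "name" not in entry:
                raise ValueError("list-format model entries must include 'name'")
            params = {k: v for k, v in entry.items() if k != "name"}
            entries.append((entry["name"], params))
    else:
        raise TypeError("'models' must be a dict or a list")

    # Every model entry requires a `type` field.
    for name, params in entries:
        if "type" not in params:
            raise ValueError(f"model '{name}' is missing the 'type' field")
    return entries


fly_params = {
    "type": "fly",
    "checkpoint": "/path/to/checkpoint",
    "resolution": 16,
    "classes": ["mito"],
}
dict_style = {"my_mito_model": dict(fly_params)}
list_style = [{"name": "my_mito_model", **fly_params}]

# Both layouts describe the same model.
print(normalize_models(dict_style) == normalize_models(list_style))  # True
```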
Available Model Types
| Type | Class | Key Parameters |
|---|---|---|
| | ScriptModelConfig | |
| dacapo | DaCapoModelConfig | |
| fly | FlyModelConfig | |
| | BioModelConfig | |
| | CellMapModelConfig | |
Common optional parameters: name, scale.
Normalizers and Postprocessors
Define input normalization and output postprocessing under json_data:
json_data:
  input_norm:
    MinMaxNormalizer:
      min_value: 0
      max_value: 250
      invert: false
    LambdaNormalizer:
      expression: "x*2-1"
  postprocess:
    DefaultPostprocessor:
      clip_min: 0
      clip_max: 1.0
      bias: 0.0
      multiplier: 127.5
    ThresholdPostprocessor:
      threshold: 0.5
Normalizers are applied in order before inference. Postprocessors are applied in order after inference.
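The effect of ordered application can be sketched in plain Python. The two functions below are illustrative stand-ins, not the cellmap_flow implementations: the min-max step is assumed to clip to [min_value, max_value] and rescale to [0, 1], and the lambda step mirrors the configured expression "x*2-1". Under those assumptions, raw intensities in 0..250 end up in -1..1.

```python
# Illustrative stand-ins; the real normalizers in cellmap_flow may differ.
def min_max_normalize(x, min_value=0, max_value=250, invert=False):
    # Assumed behaviour: clip to [min_value, max_value], rescale to [0, 1].
    clipped = min(max(x, min_value), max_value)
    scaled = (clipped - min_value) / (max_value - min_value)
    return 1.0 - scaled if invert else scaled


def lambda_normalize(x):
    # Mirrors the configured expression "x*2-1".
    return x * 2 - 1


def apply_in_order(value, steps):
    # Each step receives the output of the previous one.
    for step in steps:
        value = step(value)
    return value


# Same order as the entries under input_norm.
pipeline = [min_max_normalize, lambda_normalize]

results = [apply_in_order(v, pipeline) for v in (0, 125, 250)]
print(results)  # [-1.0, 0.0, 1.0]
```

Reversing the list changes the result (the lambda step would run on raw intensities first), which is why the order of entries in the YAML matters.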
Bounding Boxes
For blockwise processing, you can specify regions of interest:
bounding_boxes:
  - offset: [59611, 52237, 5627]
    shape: [4674, 11566, 10067]
  - offset: [64285, 26408, 15695]
    shape: [11626, 12405, 26847]
Set separate_bounding_boxes_zarrs: true to write each bounding box to its own zarr subdirectory (box_1, box_2, and so on).
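As a rough sketch of how the offset and shape entries relate, the snippet below computes the far corner of each box as offset + shape and pairs it with the box_N subdirectory name mentioned above. The axis order and units of the coordinates are not stated here, so the code treats them as opaque integers; box_end is a hypothetical helper, not a cellmap_flow function.

```python
# Stand-in for the parsed `bounding_boxes` section of the YAML file.
bounding_boxes = [
    {"offset": [59611, 52237, 5627], "shape": [4674, 11566, 10067]},
    {"offset": [64285, 26408, 15695], "shape": [11626, 12405, 26847]},
]


def box_end(box):
    """Far corner of a box, assuming it spans offset .. offset + shape."""
    return [o + s for o, s in zip(box["offset"], box["shape"])]


# Subdirectory names follow the box_1, box_2, ... pattern (1-based).
named_boxes = {
    f"box_{index}": (box["offset"], box_end(box))
    for index, box in enumerate(bounding_boxes, start=1)
}

for name, (start, end) in named_boxes.items():
    print(name, start, end)
```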
Examples
Minimal configuration
data_path: /nrs/cellmap/data/my_dataset/my_dataset.zarr/recon-1/em/fibsem-uint8
charge_group: cellmap
queue: gpu_h100
models:
  my_model:
    type: dacapo
    run_name: my_run
    iteration: 50000
Full configuration with normalizers
data_path: /nrs/cellmap/data/jrc_mus-salivary-1/jrc_mus-salivary-1.zarr/recon-1/em/fibsem-uint8
queue: gpu_h100
charge_group: cellmap
json_data:
  input_norm:
    MinMaxNormalizer:
      min_value: 0
      max_value: 250
      invert: false
    LambdaNormalizer:
      expression: "x*2-1"
  postprocess:
    DefaultPostprocessor:
      clip_min: 0
      clip_max: 1.0
      bias: 0.0
      multiplier: 127.5
    ThresholdPostprocessor:
      threshold: 127.5
models:
  model_tmp1:
    type: fly
    checkpoint: /path/to/model_checkpoint_362000
    resolution: 16
    classes:
      - mito
Blockwise processing
data_path: /nrs/cellmap/data/jrc_mus-salivary-1/jrc_mus-salivary-1.zarr/recon-1/em/fibsem-uint8
output_path: /path/to/output.zarr
task_name: cellmap_flow_mito_task
charge_group: cellmap
queue: gpu_h100
workers: 14
cpu_workers: 12
tmp_dir: /path/to/tmp
models:
  - name: model_tmp1
    type: fly
    channels:
      - mito
    checkpoint_path: /path/to/model_checkpoint_362000
    input_size: [178, 178, 178]
    input_voxel_size: [16, 16, 16]
    output_size: [56, 56, 56]
    output_voxel_size: [16, 16, 16]
bounding_boxes:
  - offset: [59611, 52237, 5627]
    shape: [4674, 11566, 10067]
  - offset: [64285, 26408, 15695]
    shape: [11626, 12405, 26847]
json_data:
  input_norm:
    MinMaxNormalizer:
      invert: false
      max_value: 250
      min_value: 0
    LambdaNormalizer:
      expression: "x*2-1"
  postprocess:
    ThresholdPostprocessor:
      threshold: 0.5
Run blockwise processing with:
cellmap_flow_blockwise config.yaml
cellmap_flow_blockwise config.yaml --log-level DEBUG