OME-Zarr

OME-Zarr Specification

We currently target v0.5 of the official OME-Zarr specification as defined by:

https://ngff.openmicroscopy.org/0.5/index.html

As the specification evolves, we plan to add support for newer versions and additional features.

ome-writers supports writing to OME-Zarr: a newer file format for bioimaging data based on the Zarr specification, with metadata structures defined by the OME-Zarr (NGFF) specification.

Zarr is a format for storing chunked, multi-dimensional arrays, optimized for cloud storage and parallel access. Data is divided into chunks, which are stored as individual files within a hierarchical directory structure (or, optionally, grouped into super-chunks called "shards" in the Zarr v3 specification, with a single file per shard).
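To make the chunk layout concrete, here is a minimal sketch of how an array shape and chunk shape determine the chunk files on disk. It assumes the default Zarr v3 chunk key encoding (a "c" prefix followed by "/"-separated chunk indices); the shape and chunk values are hypothetical, and the helper names are illustrative rather than part of any library.

```python
import math
from itertools import product


def chunk_grid(shape, chunks):
    """Number of chunks along each axis (ceiling division)."""
    return tuple(math.ceil(s / c) for s, c in zip(shape, chunks))


def chunk_key(index, separator="/"):
    """Default Zarr v3 chunk key: 'c' followed by the chunk indices."""
    return separator.join(["c", *map(str, index)])


# Hypothetical 5D array (T, C, Z, Y, X) with one 256x256 tile per chunk.
shape = (10, 2, 5, 512, 512)
chunks = (1, 1, 1, 256, 256)

grid = chunk_grid(shape, chunks)  # (10, 2, 5, 2, 2)
keys = [chunk_key(idx) for idx in product(*(range(n) for n in grid))]

print(grid, len(keys), keys[0])
```

Each key corresponds to one file beneath the array directory, unless sharding is enabled, in which case several chunks share a single shard file.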

Expected Output Structure

When writing an acquisition to OME-Zarr, the expected output structure depends on your AcquisitionSettings. OME-Zarr currently supports no more than 5 dimensions (typically T, C, Z, Y, X) per array node. Acquisitions with more than 5 dimensions are therefore split into multiple arrays, each stored in a separate group within the root OME-Zarr group.
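The splitting rule can be illustrated with a short sketch. This is not the ome-writers implementation: it assumes, for illustration only, that the outermost dimensions beyond the last five are the ones that get split, and the function name `split_acquisition` is hypothetical.

```python
from itertools import product

MAX_DIMS = 5  # OME-Zarr limit per array node


def split_acquisition(dims: dict[str, int]):
    """Return (outer_index, inner_shape) pairs, one per array to be written.

    `dims` maps dimension names to sizes, ordered from outermost to innermost.
    Assumption: only the innermost MAX_DIMS dimensions stay inside each array.
    """
    names = list(dims)
    n_outer = max(0, len(names) - MAX_DIMS)
    outer, inner = names[:n_outer], names[n_outer:]
    inner_shape = {name: dims[name] for name in inner}
    outer_ranges = [range(dims[name]) for name in outer]
    # product() over no ranges yields a single empty index: one array in total.
    return [(idx, inner_shape) for idx in product(*outer_ranges)]


# Hypothetical 6D acquisition: 3 positions, each a 5D (T, C, Z, Y, X) stack.
arrays = split_acquisition({"p": 3, "t": 10, "c": 2, "z": 5, "y": 512, "x": 512})
for idx, shape in arrays:
    print(idx, tuple(shape.values()))
```

An acquisition with five or fewer dimensions yields a single entry with an empty outer index, which corresponds to the single-image layout.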

Single ≤5D Image | Multi-Position & Other Collections | Multi-well Plates (HCS)

Any acquisition with 5 or fewer dimensions will be stored as a single "multiscales" image at AcquisitionSettings.output_path, with the following layout:

output_path.ome.zarr/
├── zarr.json            # {"zarr_format": 3} group, with attributes.ome.multiscales
└── 0/                   # Full resolution array  
    ├── zarr.json        # Array metadata (standard zarr schema)
    └── c/0/1/2/3        # Chunk files

(We don't currently write downsampled arrays, but this will be added in the future.)
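For orientation, the group-level zarr.json in this layout might look roughly like the document built below. This is a sketch based on the OME-Zarr v0.5 multiscales schema, not the exact output of ome-writers; the axis units and scale values are hypothetical, and the helper name is illustrative.

```python
import json


def multiscales_group_metadata(axes, scale, dataset_path="0"):
    """Build a v0.5-style group document with a single resolution level."""
    return {
        "zarr_format": 3,
        "node_type": "group",
        "attributes": {
            "ome": {
                "version": "0.5",
                "multiscales": [
                    {
                        "axes": axes,
                        "datasets": [
                            {
                                "path": dataset_path,  # the "0/" array above
                                "coordinateTransformations": [
                                    {"type": "scale", "scale": scale}
                                ],
                            }
                        ],
                    }
                ],
            }
        },
    }


# Hypothetical axes and pixel sizes for a T, C, Z, Y, X image.
axes = [
    {"name": "t", "type": "time", "unit": "second"},
    {"name": "c", "type": "channel"},
    {"name": "z", "type": "space", "unit": "micrometer"},
    {"name": "y", "type": "space", "unit": "micrometer"},
    {"name": "x", "type": "space", "unit": "micrometer"},
]
doc = multiscales_group_metadata(axes, scale=[1.0, 1.0, 0.5, 0.1, 0.1])
print(json.dumps(doc, indent=2))
```

The scale transformation carries one entry per axis, so its length must match the number of axes.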

Example Code

Writing a single ≤5D image

Acquisitions that contain multiple positions (e.g., stage positions, tiled images, or angles on a light-sheet microscope) or that otherwise exceed 5 dimensions will be structured as a root Zarr group (currently following the transitional bioformats2raw convention), with a multiscales sub-group for each position or collection member.

output_path.ome.zarr/
├── zarr.json             # Contains "bioformats2raw.layout" metadata
├── OME                   # Special group for containing OME metadata
│   ├── zarr.json         # Contains "series" metadata, listing all positions
│   └── METADATA.ome.xml  # optional OME-XML file stored within the Zarr fileset
├── 0/                    # First image in the collection (same as 5D image above)
├── 1/                    # Second image in the collection
└── ...
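The two metadata documents specific to this layout can be sketched as follows. The structure follows the transitional bioformats2raw.layout section of the OME-Zarr v0.5 specification as I understand it; the exact documents written by ome-writers may contain additional fields, and the function name is hypothetical.

```python
def bioformats2raw_metadata(n_positions: int) -> dict[str, dict]:
    """Return the root and OME group documents, keyed by relative path."""

    def group(ome_payload: dict) -> dict:
        # Common wrapper for a Zarr v3 group carrying OME attributes.
        return {
            "zarr_format": 3,
            "node_type": "group",
            "attributes": {"ome": {"version": "0.5", **ome_payload}},
        }

    return {
        # Root group: marks the hierarchy as a bioformats2raw collection.
        "zarr.json": group({"bioformats2raw.layout": 3}),
        # OME group: lists the image groups in series order.
        "OME/zarr.json": group({"series": [str(i) for i in range(n_positions)]}),
    }


docs = bioformats2raw_metadata(n_positions=3)
print(docs["OME/zarr.json"]["attributes"]["ome"]["series"])
```

Each entry in "series" is the path of one image group (0/, 1/, and so on), and each of those groups has the same structure as the single-image layout.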

Example Code

Writing multiple positions

If you declare AcquisitionSettings.plate along with a position dimension (type='position') containing Positions that define plate_well/plate_column information, the output structure will follow the OME-Zarr HCS specification:

output_path.ome.zarr/
├── zarr.json              # Plate metadata
├── A/                     # Row A
│   ├── 1/                 # Well A1
│   │   ├── zarr.json      # Well metadata
│   │   ├── 0/             # Position 0 (Image with multiscales)
│   │   │   ├── zarr.json  # contains "ome.multiscales" metadata
│   │   │   ├── 0/         # Full resolution
│   │   │   └── n/         # (Downsampled levels, not currently written)
│   │   └── 1/             # Position 1  (Image with multiscales)
│   └── 2/                 # Well A2
└── B/                     # Row B
    ├── 1/                 # Well B1    
    └── ...
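The plate-level and well-level zarr.json documents in this tree can be sketched as below. The field names (rows, columns, wells, rowIndex, columnIndex, images) come from the OME-Zarr v0.5 HCS schema; optional fields such as acquisitions and field_count are omitted, and the helper names are hypothetical rather than part of ome-writers.

```python
def _ome_group(payload: dict) -> dict:
    """Wrap OME attributes in a Zarr v3 group document."""
    return {
        "zarr_format": 3,
        "node_type": "group",
        "attributes": {"ome": {"version": "0.5", **payload}},
    }


def plate_metadata(rows: list[str], columns: list[str], wells: list[tuple[str, str]]) -> dict:
    """Plate document: declares rows, columns, and the wells that exist."""
    plate = {
        "rows": [{"name": r} for r in rows],
        "columns": [{"name": c} for c in columns],
        "wells": [
            {
                "path": f"{r}/{c}",  # e.g. "A/1", matching the directory tree
                "rowIndex": rows.index(r),
                "columnIndex": columns.index(c),
            }
            for r, c in wells
        ],
    }
    return _ome_group({"plate": plate})


def well_metadata(n_positions: int) -> dict:
    """Well document: lists the image groups (positions) inside the well."""
    return _ome_group({"well": {"images": [{"path": str(i)} for i in range(n_positions)]}})


plate = plate_metadata(["A", "B"], ["1", "2"], [("A", "1"), ("A", "2"), ("B", "1")])
well = well_metadata(n_positions=2)
print(plate["attributes"]["ome"]["plate"]["wells"])
```

Only wells that actually contain data need to be listed, so a sparse plate produces a short "wells" list even when many rows and columns are declared.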

Example Code

Writing plates

Backends

Important

All zarr format backends currently use yaozarrs to establish the group hierarchy and group-level zarr.json documents. Only the array nodes are written using the specific backend libraries below.

tensorstore: Uses Google's tensorstore library.
acquire-zarr: Uses the acquire-zarr library.
zarr-python: Uses the reference zarr library (also known as zarr-python to disambiguate the Python library from the specification itself).
zarrs-python: Uses the Rust-backed zarrs-python library on top of zarr-python to speed up array writes.