Data Troubleshooting
Some common issues may arise when writing or converting data to use with Vitessce.
AnnData-Zarr paths
When an AnnData object is written to a Zarr store (e.g., via adata.write_zarr), the columns and keys in the original object (e.g., adata.obs["leiden"] or adata.obsm["X_umap"]) become relative POSIX-style paths (e.g., obs/leiden and obsm/X_umap) in the Zarr store.
AnnData-Zarr obsFeatureMatrix chunking strategy
A benefit of the Zarr format is that arrays can be chunked and stored in small pieces.
In Vitessce, we leverage the chunking features of Zarr to load only the subset of the obsFeatureMatrix which is required for each visualization.
For instance, if a gene is selected to color the points in the scatterplot or spatial views, we only load the chunks containing the gene of interest.
However, a poor chunking strategy (e.g., each chunk containing too many genes) can reduce the efficiency of this approach and result in too much data being requested when a gene is selected.
A chunks argument can be passed to the AnnData write_zarr method to resolve this:
# ...
VAR_CHUNK_SIZE = 10 # VAR_CHUNK_SIZE should be small
adata.write_zarr(out_path, chunks=(adata.shape[0], VAR_CHUNK_SIZE))
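To see why the gene-axis chunk size matters, here is a back-of-the-envelope estimate (plain Python; the matrix and chunk sizes are made up for illustration) of how much data must be fetched when a single gene is selected:

```python
import math

# Illustration (not part of the Vitessce API): estimate the data fetched to
# color cells by one gene, under two chunking strategies.
# Assume a cells-by-genes matrix of 50,000 x 20,000 float32 values (4 bytes).
n_cells, n_genes = 50_000, 20_000
bytes_per_value = 4

def bytes_fetched_for_one_gene(chunk_cells, chunk_genes):
    """Bytes downloaded when one gene is selected: every chunk that
    intersects that gene's column must be fetched in full."""
    n_row_chunks = math.ceil(n_cells / chunk_cells)
    return n_row_chunks * chunk_cells * chunk_genes * bytes_per_value

# Good: each chunk spans all cells but only 10 genes at a time.
good = bytes_fetched_for_one_gene(n_cells, 10)     # 2 MB
# Poor: each chunk spans all genes, so one selection pulls the whole matrix.
poor = bytes_fetched_for_one_gene(1_000, n_genes)  # 4 GB

print(good, poor)
```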
Zarr dtypes
Vitessce uses Zarrita.js to load Zarr data.
Zarrita.js currently supports a subset of NumPy data types, so ensure that the data types used in the arrays and data frames of your AnnData-Zarr store are supported (otherwise, cast them using numpy.ndarray.astype or pandas.Series.astype).
In addition to the Zarrita.js data types, Vitessce supports loading AnnData string columns with vlen-utf8 or |O types.
To automatically do this casting for AnnData objects, the vitessce Python package provides the optimize_adata function:
from vitessce.data_utils import optimize_adata
# ...
adata = optimize_adata(adata)
# ...
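If you prefer to cast manually, the following sketch shows the kind of narrowing that optimize_adata performs (the data here is hypothetical; 64-bit types are downcast to 32-bit types, which are broadly supported in the browser):

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for adata.X and adata.obs.
X = np.random.rand(100, 50)       # float64 by default
X = X.astype(np.float32)          # cast the matrix to a supported dtype

obs = pd.DataFrame({"total_counts": np.arange(100, dtype=np.int64)})
obs["total_counts"] = obs["total_counts"].astype(np.int32)  # cast a column

print(X.dtype, obs["total_counts"].dtype)
```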
Images and Segmentation Bitmasks (Label Images)
Multi-resolution (Pyramidal) Representation
In order for Vitessce to load large images (from any supported image format), the image must be stored in a multi-resolution (i.e., pyramidal) and tiled form. Pyramidal images enable Vitessce to load only the subset of data necessary (image tiles at a particular level of resolution), based on the user's current viewport (i.e., pan and zoom state). Otherwise, without a pyramidal and tiled image, Vitessce must load all image pixels to visualize the image at all, which can quickly result in errors or crashes due to surpassing the memory limits of the web browser. See the format-specific notes below for more information.
OME-TIFF
Multi-resolution OME-TIFF
A quick way to check if an OME-TIFF image is already pyramidal is to open it in FIJI.
If the Bio-Formats Series Options dialog appears (with checkboxes for selection of a subset of resolutions to open), then the image is already pyramidal.
Alternatively, use the tiffcomment command-line tool to check the OME-XML metadata for an indication that the image contains multiple resolutions, or use the tifffile Python package.
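As a sketch of the tifffile approach, the number of resolution levels in the first image series can be counted; a pyramidal image has more than one level. (The example below writes a small single-resolution TIFF just to have something to inspect.)

```python
import numpy as np
import tifffile

# Write a small, non-pyramidal TIFF to inspect.
tifffile.imwrite("example.tif", np.zeros((64, 64), dtype=np.uint8))

# Count the pyramid levels of the first series.
with tifffile.TiffFile("example.tif") as tif:
    n_levels = len(tif.series[0].levels)

is_pyramidal = n_levels > 1
print(n_levels, is_pyramidal)  # a pyramidal image would report > 1 levels
```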
To create a multi-resolution OME-TIFF image, we recommend using the tool bioformats2raw followed by raw2ometiff, developed by the Open Microscopy Environment.
Use the --resolutions parameter of bioformats2raw. For example, to create a pyramid with six levels, specify --resolutions 6.
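A sketch of the two-step conversion (file names are hypothetical; flags may vary across bioformats2raw/raw2ometiff versions):

```shell
# Convert the input image to an intermediate Zarr with 6 pyramid levels,
# then package the result as a pyramidal OME-TIFF.
bioformats2raw --resolutions 6 input_image.tif intermediate.zarr
raw2ometiff intermediate.zarr output.pyramid.ome.tif
```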
OME-TIFF offsets
When using OME-TIFF files with Vitessce, performance can be improved by creating an offsets.json file to accompany each OME-TIFF file.
This "offsets" file contains an index of byte offsets to different elements within the OME-TIFF file.
These byte offsets enable Vitessce to directly navigate to subsets of data within the OME-TIFF file, avoiding the need to seek through the entire file.
The generate-tiff-offsets Python package or web-based tool can be used to generate an offsets.json file for an OME-TIFF image.
Then, configure Vitessce using the offsetsUrl option of the image.ome-tiff or obsSegmentations.ome-tiff file types.
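For example, a file entry in the view config might look like the following (the URLs are hypothetical):

```json
{
  "fileType": "image.ome-tiff",
  "url": "https://example.com/my_image.ome.tif",
  "options": {
    "offsetsUrl": "https://example.com/my_image.offsets.json"
  }
}
```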
For more information, see the Viv paper at Manz et al. Nature Methods 2022 which introduces the concept of an Indexed OME-TIFF file and benchmarks the approach.
OME-TIFF compression
Vitessce can load OME-TIFFs which use the following compression methods:
- No compression
- Packbits
- LZW
- Deflate (with floating point or horizontal predictor support)
- JPEG
- LERC (with additional Deflate compression support)
This is because Vitessce uses Viv, which internally uses geotiff.js to load data from OME-TIFF files.
RGB vs. multiplex
To determine whether an OME-TIFF image should be interpreted as red-green-blue (RGB, as a standard camera image would be) versus multiplexed, Vitessce uses the PhotometricInterpretation TIFF tag.
A value of 1 means "black is zero" (i.e., multi-channel/grayscale, where zero values should be rendered using the color black), whereas 2 means RGB.
To override the metadata in the image, the photometricInterpretation coordination type can be used (with value 'RGB' or 'BlackIsZero').
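As a sketch, the override can appear in the coordination space of the view config like this (the scope name "A" is arbitrary):

```json
{
  "coordinationSpace": {
    "photometricInterpretation": { "A": "RGB" }
  }
}
```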
Alignment, coordinate transformations, and physical size
Physical size metadata
If the OME-XML metadata contains PhysicalSizeX, PhysicalSizeXUnit, PhysicalSizeY, and PhysicalSizeYUnit, then the physical size will be used for scaling.
These values define the physical size and unit of an individual pixel within the image (e.g., that one pixel has a physical size of 1x1 micron).
Coordinate transformations
Optionally, coordinate transformations can be defined using the coordinateTransformations option of the image.ome-tiff or obsSegmentations.ome-tiff file types, which will be interpreted according to the OME-NGFF v0.4 coordinateTransformations spec.
The order of the transformations parameters must correspond to the order of the dimensions in the image (i.e., must match the DimensionOrder within the OME-XML metadata).
For example, to scale by 2x in the X and Y dimensions for an image with a DimensionOrder of XYZCT, use "scale": [2.0, 2.0, 1.0, 1.0, 1.0].
For example, to translate by 3 and 4 units in the X and Y dimensions, respectively, use "translation": [3.0, 4.0, 0.0, 0.0, 0.0].
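Putting the two examples together, the file options might look like the following sketch (hypothetical URL; the arrays assume a DimensionOrder of XYZCT, per the OME-NGFF v0.4 coordinateTransformations spec):

```json
{
  "fileType": "image.ome-tiff",
  "url": "https://example.com/my_image.ome.tif",
  "options": {
    "coordinateTransformations": [
      { "type": "scale", "scale": [2.0, 2.0, 1.0, 1.0, 1.0] },
      { "type": "translation", "translation": [3.0, 4.0, 0.0, 0.0, 0.0] }
    ]
  }
}
```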
Channel names
Vitessce will use the channel names present within the OME-XML metadata and will display these within the user interface. To edit the channel names, tools such as tiffcomment can be used.
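A sketch of the tiffcomment workflow (tiffcomment ships with Bio-Formats; file names are hypothetical):

```shell
# Dump the OME-XML metadata to a file.
tiffcomment my_image.ome.tif > metadata.xml
# ... edit the Name="..." attributes of the <Channel> elements in metadata.xml ...
# Write the edited OME-XML back into the file.
tiffcomment -set "$(cat metadata.xml)" my_image.ome.tif
```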
OME-NGFF
Also known as OME-Zarr.
SpatialData Images and Labels
SpatialData uses OME-NGFF to store images and label images (i.e., segmentation bitmask images). Thus, the following points apply not only to standalone OME-NGFF images but also to the Images and Labels elements within SpatialData objects.
Multi-resolution OME-NGFF
As noted above, Vitessce requires large-scale images to use multi-resolution representations on-disk.
When using the spatialdata Python package, images may not be saved as multi-resolution/multi-scale by default.
Use the scale_factors parameter of the Image2DModel.parse and Labels2DModel.parse functions as needed to ensure that the OME-NGFF images are stored as multi-resolution.
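As a sketch (assuming the spatialdata Python package; the array, dims, and number of levels here are hypothetical), scale_factors can be passed like this:

```python
import numpy as np
from spatialdata.models import Image2DModel

# Hypothetical 3-channel image. Each entry of scale_factors downsamples the
# previous pyramid level by that factor; 2 is the factor Vitessce supports.
arr = np.random.randint(0, 255, size=(3, 4096, 4096), dtype=np.uint8)
image = Image2DModel.parse(arr, dims=("c", "y", "x"), scale_factors=[2, 2, 2])
```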
Non-power of 2 pyramid steps
Note that Vitessce does not yet support multi-resolution OME-NGFF images with a scaling factor other than 2.
As SpatialData Image and Labels elements are stored in OME-NGFF format, this point applies to both OME-NGFFs contained within SpatialData objects and standalone OME-NGFF Zarr stores.
Supported versions
Vitessce currently supports up to OME-NGFF spec v0.4.
Supported features
Vitessce supports OME-NGFF images saved as Zarr stores and a subset of OME-NGFF features via the image.ome-zarr file type.
The following table lists the support for different OME-NGFF features:
| Feature | Supported by Vitessce |
|---|---|
| Downsampling along Z axis | N |
| omero field | Y |
| multiscales with a scaling factor other than 2 | N |
| URL (not only S3) | Y |
| 3D view | Y |
| labels | N |
| HCS plate | N |
To compare Vitessce to other OME-NGFF clients, see the table listing the OME-NGFF features supported by other clients. We welcome feature requests or pull requests to add support for the remaining features to Vitessce.
Metadata requirements
The omero metadata field must be present. In particular, the omero.channels and omero.rdefs fields provide metadata that Vitessce uses for the initial rendering settings, so these must be present as well.
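A minimal sketch of the expected omero metadata (values are hypothetical; the fields shown are the ones Vitessce reads for initial rendering):

```json
{
  "omero": {
    "name": "My image",
    "rdefs": { "model": "color" },
    "channels": [
      {
        "label": "DAPI",
        "color": "0000FF",
        "window": { "start": 0, "end": 65535, "min": 0, "max": 65535 }
      }
    ]
  }
}
```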
RGB vs. multiplex
For OME-NGFF images, Vitessce uses the field omero.rdefs.model to determine whether to interpret the image as RGB vs. multiplexed.
When model is 'color', the image is interpreted as RGB; otherwise, it will be considered multiplexed.
To override the metadata in the image, the photometricInterpretation coordination type can be used (with value 'RGB' or 'BlackIsZero').
Coordinate transformations
Optionally, coordinate transformations can be defined using the coordinateTransformations option of the image.ome-zarr or obsSegmentations.ome-zarr file types, which will be interpreted according to the OME-NGFF v0.4 coordinateTransformations spec.
The order of the transformations parameters must correspond to the order of the dimensions in the image.
Z-axis chunking
Vitessce does not yet support chunking along the Z axis. When writing OME-Zarr stores, you may need to specify a chunks argument manually such that the Z axis only has 1 chunk.
An example writing to a Zarr store using ome-zarr-py (ome-zarr==0.2.1):
import zarr
import numpy as np
from tifffile import imread
from ome_zarr import writer

my_image = imread("my_image.tif")
my_image = np.transpose(my_image, axes=(1, 0, 3, 2))  # zcxy to czyx
z_root = zarr.open_group("my_image.zarr", mode="w")
default_window = {
    "start": 0,
    "min": 0,
    "max": 65_535,  # may need to change depending on the numpy dtype of the my_image array
    "end": 65_535,  # may need to change depending on the numpy dtype of the my_image array
}
writer.write_image(
    image=my_image,
    group=z_root,
    axes="czyx",
    omero={
        "name": "My image",
        "version": "0.3",
        "rdefs": {},
        "channels": [
            {
                "label": f"Channel {i}",
                "color": "FFFFFF",  # may want to use a different color for each channel
                "window": default_window,
            }
            for i in range(my_image.shape[0])
        ],
    },
    chunks=(1, 1, 256, 256),
)