@m-albert
Collaborator

@m-albert m-albert commented Sep 9, 2025

PR

This PR contains work I've been wanting to discuss and push here for a while.

Highlights:

  1. Improvements to ndmeasure.label:
    1. Implementation of the idea by @jakirkham and @jni to make labels unique by using the lower and upper bits of the label array (see here). This improves parallelism at the expense of returning non-contiguous labels.
    2. The current implementation of ndmeasure.label doesn't output contiguous (sequential) labels, and this was already the case before the change above. This PR adds a parameter produce_sequential_labels to dask_image.ndmeasure.label to output contiguous labels.
  2. Implementation of a chunked version of skimage.segmentation.relabel_sequential: ndmeasure.relabel_sequential (this provides the functionality e.g. of the point above, happy to discuss whether the function should be public)
  3. Implementation of ndmeasure.merge_labels_across_chunk_boundaries (details below)
    1. Use case 1: merge touching labels
    2. Use case 2: merge labels above a given IoU within the overlap.
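
To make the bit-encoding idea from point 1 concrete, here is a minimal NumPy sketch. It is not the code in this PR: the helper names `encode_block_labels` and `relabel_sequential_np` and the 32/32 bit split are assumptions made for illustration only. The block index is written into the upper bits and the chunk-local label into the lower bits, which yields unique but non-contiguous labels; a second pass then maps them to 1..N.

```python
import numpy as np

LOW_BITS = 32  # assumed split: chunk-local label lives in the low 32 bits


def encode_block_labels(block, block_index, low_bits=LOW_BITS):
    """Make chunk-local labels globally unique (hypothetical helper).

    The block index is stored in the upper bits, the local label in the
    lower ``low_bits`` bits. Background (0) stays 0.
    """
    block = np.asarray(block, dtype=np.uint64)
    if int(block.max()) >= (1 << low_bits):
        raise ValueError("local label does not fit into the lower bits")
    offset = np.uint64(block_index << low_bits)
    return np.where(block > 0, block | offset, np.uint64(0))


def relabel_sequential_np(labels):
    """Map non-zero labels to 1..N, keeping 0 as background (in-memory only)."""
    values, inverse = np.unique(labels, return_inverse=True)
    inverse = inverse.reshape(labels.shape)
    if values[0] != 0:
        inverse = inverse + 1  # no background present, so start at 1
    return inverse.astype(np.uint64)


# Two chunks labeled independently: both use the local labels 1 and 2.
block_a = np.array([[1, 1, 0], [0, 2, 2]])
block_b = np.array([[1, 0, 0], [1, 0, 2]])

enc_a = encode_block_labels(block_a, block_index=0)
enc_b = encode_block_labels(block_b, block_index=1)

labels = np.concatenate([enc_a, enc_b], axis=1)  # unique, but sparse values
sequential = relabel_sequential_np(labels)       # contiguous 1..N
```

No communication between chunks is needed for the encoding step, which is where the gain in parallelism comes from; only the optional sequential relabeling needs a global view of the label values.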

merge_labels_across_chunk_boundaries

What's the idea behind ndmeasure.merge_labels_across_chunk_boundaries?

Visual abstract

Cases of overlap = 0 (top) and overlap > 0 (bottom)

image

Description

From the docstring:

def merge_labels_across_chunk_boundaries(
        labels,
        overlap_depth=0,
        iou_threshold=0.8,
        structure=None,
        wrap_axes=None,
        trim_overlap=True,
        produce_sequential_labels=True,
        ):
    """
    Merge labels across chunk boundaries.

    Each chunk in ``labels`` has been labeled independently, and the labels
    in different chunks may overlap. This function tries to merge labels across
    chunk boundaries using a strategy dependent on ``overlap_depth``:
    - If ``overlap_depth > 0``, the overlap region between chunks is used to
      determine which labels between each pair of chunks should be merged.
    - If ``overlap_depth == 0``, labels that touch across chunk boundaries
      are merged.

    Parameters
    ----------
    labels : dask array of int
        The input labeled array, where each chunk is independently labeled.
    overlap_depth : int, optional
        The size of the overlap region between chunks, e.g. as produced by
        `dask.array.overlap` or `map_overlap`. Default is 0.
    iou_threshold : float, optional
        If ``overlap_depth > 0``, the intersection-over-union (IoU) between labels
        in the overlap region is used to determine which labels should be
        merged. If the IoU between two labels is greater than ``iou_threshold``,
        they are merged. Default is 0.8.
    structure : array of bool, optional
        Structuring element for determining connectivity. If None, a
        cross-shaped structuring element is used.
    wrap_axes : tuple of int, optional
        Should labels be wrapped across array boundaries, and if so which axes.
        - (0,) only wrap over the 0th axis.
        - (0, 1) wrap over the 0th and 1st axis.
        - (0, 1, 3)  wrap over 0th, 1st and 3rd axis.
    trim_overlap : bool, optional
        If True, the overlap regions are trimmed from the output labels.
        Default is True.
    """

Example

Consider the following problem.

The user wants to segment a large input image, i.e. a dask array.

import numpy as np
import dask.array as da
import skimage
import matplotlib.pyplot as plt

dim = da.from_array(
    skimage.data.cells3d()[:, 1, :, :].max(axis=0), chunks=(110, 110))
image

and has a segmentation method that cannot be applied to the entire image at once (e.g. because of memory or method constraints) but works well on regions of the input image. The user can use map_overlap to apply the segmentation to the chunks of the input image, including overlap so that the segmentation method also performs well at the chunk boundaries:

from cellpose import models

model = models.Cellpose(gpu=False, model_type='cyto')

dseg = dim.map_overlap(
        lambda x: model.eval(x, diameter=None, channels=[0, 0])[0],
        dtype=np.uint16,
        depth=10,
    ).persist()

plt.figure()
plt.imshow(dseg)
image

At this point, the user faces two challenges:

  1. the obtained segmentation labels show boundary artefacts and
  2. they are not unique.

This is where ndmeasure.merge_labels_across_chunk_boundaries comes in to merge labels across chunk boundaries.

First use case / usage pattern

Similarly to the implementation of ndmeasure.label and e.g. the approach provided in distributed cellpose (ping @GFleishman), one approach consists in merging labels that touch across chunk boundaries. ndmeasure.merge_labels_across_chunk_boundaries makes this possible in a way that's agnostic of the segmentation method:

import dask_image.ndmeasure

dseg_merged = dask_image.ndmeasure.merge_labels_across_chunk_boundaries(
    dseg
)['labels']

plt.figure()
plt.imshow(dseg_merged)
image
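
The touching-labels strategy can be sketched for a single seam between two chunks as follows. This is an illustration under simplifying assumptions, not the implementation in this PR: the function name `merge_touching` is hypothetical, the labels are assumed to be unique across both chunks already, only one vertical seam is handled, and connectivity is face-only (matching the default cross-shaped structuring element). Label pairs that face each other across the seam are joined with a small union-find.

```python
import numpy as np


def merge_touching(left, right):
    """Merge labels of two horizontally adjacent chunks that touch at the seam.

    Hypothetical helper: assumes labels are unique across both chunks and
    that 0 is background.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        while parent.setdefault(x, x) != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Compare the last column of `left` with the first column of `right`.
    for a, b in zip(left[:, -1].tolist(), right[:, 0].tolist()):
        if a > 0 and b > 0:
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[max(ra, rb)] = min(ra, rb)  # keep the smaller ID

    merged = np.concatenate([left, right], axis=1)
    out = merged.copy()
    for label in list(parent):
        out[merged == label] = find(label)
    return out


# Chunk A holds IDs 5, 7 and 8; chunk B holds IDs 12 and 22.
chunk_a = np.array([[0, 5, 5],
                    [0, 5, 5],
                    [7, 0, 0],
                    [0, 0, 8]])
chunk_b = np.array([[12, 12, 0],
                    [12, 0, 0],
                    [0, 0, 0],
                    [22, 22, 0]])

result = merge_touching(chunk_a, chunk_b)
```

In this toy input, ID 12 is merged into ID 5 and ID 22 into ID 8, while ID 7 does not reach the seam and keeps its label. Any two labels that share a single boundary pixel pair get merged, which is the source of the artefacts this criterion can produce.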

Second use case

The approach above joins all labels that form part of the same connected component in a boundary slice of thickness one. This can lead to artefacts. In fact, the example image above contains two merged cells which are clearly separated in the chunkwise segmentation. To separate these, dask_image.ndmeasure.merge_labels_across_chunk_boundaries has the option to merge labels based on the IoU within an overlap region.

dseg = da.overlap.overlap(
        dim, depth={0: 10, 1: 10},
        boundary='none'
    ).map_blocks(
        lambda x: model.eval(x, diameter=None, channels=[0, 0])[0],
        dtype=np.uint16
    ).persist()

dseg_merged = dask_image.ndmeasure.merge_labels_across_chunk_boundaries(
    dseg, overlap_depth=10, iou_threshold=0.8,
)['labels'].compute(scheduler='single-threaded')

plt.figure()
plt.imshow(dseg_merged)
image

The labels of the two cells below, which had previously been joined, now remain separate.
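
For reference, the IoU criterion can be sketched on a single overlap region. This is a hedged illustration, not the PR's code: `overlap_iou` is a hypothetical helper that takes the labels two neighbouring chunks assigned to the same pixels, computes the IoU within that region for every pair of co-occurring labels, and keeps the pairs whose IoU is greater than the threshold.

```python
import numpy as np


def overlap_iou(region_a, region_b):
    """IoU for every pair of labels that co-occur in an overlap region.

    Hypothetical helper: ``region_a`` and ``region_b`` hold the labels that
    two neighbouring chunks assigned to the *same* pixels. Areas are counted
    within the overlap region only. Returns {(label_a, label_b): iou}.
    """
    a = np.asarray(region_a).ravel()
    b = np.asarray(region_b).ravel()

    area_a = {int(k): int(v) for k, v in zip(*np.unique(a, return_counts=True))}
    area_b = {int(k): int(v) for k, v in zip(*np.unique(b, return_counts=True))}

    both = (a > 0) & (b > 0)  # pixels that are foreground in both chunks
    if not both.any():
        return {}

    pairs, counts = np.unique(
        np.stack([a[both], b[both]], axis=1), axis=0, return_counts=True
    )
    ious = {}
    for (la, lb), n in zip(pairs, counts):
        la, lb, n = int(la), int(lb), int(n)
        union = area_a[la] + area_b[lb] - n
        ious[(la, lb)] = n / union
    return ious


# The same 3x4 overlap region as seen by chunk A and by chunk B.
seen_by_a = [[5, 5, 5, 0],
             [5, 5, 5, 0],
             [0, 0, 8, 8]]
seen_by_b = [[12, 12, 12, 12],
             [12, 12, 12, 0],
             [0, 0, 0, 22]]

ious = overlap_iou(seen_by_a, seen_by_b)
iou_threshold = 0.8
to_merge = [pair for pair, iou in ious.items() if iou > iou_threshold]
```

Here labels 5 and 12 agree on most of the overlap (IoU 6/7) and are merged, whereas labels 8 and 22 share only one pixel (IoU 0.5) and stay separate.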

Finally

In my personal projects it's been useful to have a function that merges labels across chunks. I think dask-image could be a good place for providing such functionality to the community and I've had some offline feedback at meetings that this could be useful. Happy about any further feedback / discussion over here!

…w/high bit encoding rather than additive scheme
…includes new functionality) and relabel_sequential. merge_labels_across_chunk_boundaries. Refactor ndmeasure.label to use the new functions. Rewrite relabel_blocks to handle sparse and large values relabeling efficiently. Add tests for new functionality.
@jni
Contributor

jni commented Sep 19, 2025

Hey @m-albert I am currently very overwhelmed with stuff but I wanted to say this is awesome 🤩 and something I have wanted for about 15 years!

@m-albert
Collaborator Author

Thanks @jni 🤗 Glad you like it and that you had the same idea!

Regarding this PR I still need to get an (obscurely) failing test under control...

@m-albert
Collaborator Author

m-albert commented Oct 27, 2025

Just pasting these svgs here that ChatGPT created 🤯

image
SVG code
<svg xmlns="http://www.w3.org/2000/svg" width="660" height="260" font-family="sans-serif">

  <!-- Panel titles -->
  <text x="30" y="20" font-size="12" font-weight="bold">Before merge</text>
  <text x="30" y="35" font-size="11">each chunk labeled independently</text>

  <text x="390" y="20" font-size="12" font-weight="bold">After merge</text>
  <text x="390" y="35" font-size="11">label identities unified across chunks</text>

  <!-- LEFT PANEL (before merge) -->
  <!-- Taller frame: y=50 to y=240 (height 190) -->
  <rect x="20" y="50" width="260" height="190" fill="none" stroke="#000" stroke-width="1.5"/>

  <!-- chunk boundary (before) -->
  <line x1="150" y1="50" x2="150" y2="240"
        stroke="#777" stroke-width="2"
        stroke-dasharray="6,4"/>

  <text x="120" y="236" font-size="10" fill="#555" text-anchor="middle">chunk A</text>
  <text x="180" y="236" font-size="10" fill="#555" text-anchor="middle">chunk B</text>
  <text x="150" y="46" font-size="10" fill="#555" text-anchor="middle">chunk boundary</text>

  <!-- Define clipping regions for before-merge panel -->
  <clipPath id="leftClip_pre">
    <rect x="20" y="50" width="130" height="190"/>
  </clipPath>
  <clipPath id="rightClip_pre">
    <rect x="150" y="50" width="130" height="190"/>
  </clipPath>

  <!-- Object crossing boundary, before merge -->
  <!-- ID 5 (blue), chunk A side -->
  <ellipse cx="140" cy="120" rx="45" ry="35"
           fill="#6fa8dc" fill-opacity="0.35"
           stroke="#1c4587" stroke-width="1.5"
           clip-path="url(#leftClip_pre)"/>
  <!-- ID 12 (green), chunk B side -->
  <ellipse cx="160" cy="120" rx="45" ry="35"
           fill="#93c47d" fill-opacity="0.35"
           stroke="#274e13" stroke-width="1.5"
           clip-path="url(#rightClip_pre)"/>

  <!-- IDs before merge for big object -->
  <text x="110" y="125" font-size="12" font-weight="bold" fill="#1c4587" text-anchor="middle">ID 5</text>
  <text x="190" y="125" font-size="12" font-weight="bold" fill="#274e13" text-anchor="middle">ID 12</text>

  <!-- Another object fully inside chunk A -->
  <circle cx="80" cy="90" r="20"
          fill="#ffd966" fill-opacity="0.6"
          stroke="#7f6000" stroke-width="1.5"/>
  <text x="80" y="95" font-size="11" font-weight="bold" fill="#7f6000" text-anchor="middle">ID 7</text>

  <!-- Second pair of touching objects near the boundary (before merge) -->
  <!-- ID 8, chunk A (pink) -->
  <circle cx="140" cy="180" r="18"
          fill="#c27ba0" fill-opacity="0.35"
          stroke="#741b47" stroke-width="1.5"
          clip-path="url(#leftClip_pre)"/>
  <!-- ID 22, chunk B (orange) -->
  <circle cx="165" cy="185" r="18"
          fill="#f6b26b" fill-opacity="0.35"
          stroke="#783f04" stroke-width="1.5"
          clip-path="url(#rightClip_pre)"/>

  <!-- IDs before merge for small objects -->
  <text x="110" y="187" font-size="11" font-weight="bold" fill="#741b47" text-anchor="middle">ID 8</text>
  <text x="205" y="192" font-size="11" font-weight="bold" fill="#783f04" text-anchor="middle">ID 22</text>

  <!-- Arrow between panels with explanatory text above -->
  <line x1="320" y1="150" x2="340" y2="150"
        stroke="#000" stroke-width="2" marker-end="url(#arrow)"/>
  <text x="330" y="125" font-size="10" text-anchor="middle">Merge touching</text>
  <text x="330" y="137" font-size="10" text-anchor="middle">IDs</text>

  <defs>
    <marker id="arrow" markerWidth="10" markerHeight="10" refX="8" refY="3" orient="auto">
      <polygon points="0,0 8,3 0,6" fill="#000"/>
    </marker>
  </defs>

  <!-- RIGHT PANEL (after merge) -->
  <rect x="380" y="50" width="260" height="190" fill="none" stroke="#000" stroke-width="1.5"/>

  <!-- same chunk boundary still exists physically in memory layout -->
  <line x1="510" y1="50" x2="510" y2="240"
        stroke="#777" stroke-width="2"
        stroke-dasharray="6,4"/>

  <text x="480" y="236" font-size="10" fill="#555" text-anchor="middle">chunk A</text>
  <text x="540" y="236" font-size="10" fill="#555" text-anchor="middle">chunk B</text>

  <!-- Define clipping regions for after-merge panel -->
  <clipPath id="leftClip_post">
    <rect x="380" y="50" width="130" height="190"/>
  </clipPath>
  <clipPath id="rightClip_post">
    <rect x="510" y="50" width="130" height="190"/>
  </clipPath>

  <!-- Merged big object spanning both chunks -->
  <!-- After merge, IDs 5 and 12 are unified as ID 5.
       Both sides use ID 5's style (blue),
       still clipped per chunk, and we keep only original per-chunk outlines. -->
  <!-- Left side (chunk A) -->
  <ellipse cx="500" cy="120" rx="45" ry="35"
           fill="#6fa8dc" fill-opacity="0.35"
           stroke="#1c4587" stroke-width="1.5"
           clip-path="url(#leftClip_post)"/>
  <!-- Right side (chunk B, now same ID/color) -->
  <ellipse cx="520" cy="120" rx="45" ry="35"
           fill="#6fa8dc" fill-opacity="0.35"
           stroke="#1c4587" stroke-width="1.5"
           clip-path="url(#rightClip_post)"/>

  <!-- Unified ID after merge (big object) -->
  <text x="510" y="125" font-size="12" font-weight="bold" fill="#1c4587" text-anchor="middle">ID 5</text>
  <text x="510" y="140" font-size="9" fill="#1c4587" text-anchor="middle">(merged)</text>

  <!-- The other object that didn't cross still keeps its label -->
  <circle cx="440" cy="90" r="20"
          fill="#ffd966" fill-opacity="0.6"
          stroke="#7f6000" stroke-width="1.5"/>
  <text x="440" y="95" font-size="11" font-weight="bold" fill="#7f6000" text-anchor="middle">ID 7</text>

  <!-- Merged small objects: IDs 8 and 22 touched, so now it's one ID (ID 8).
       Both sides use ID 8's style (pink),
       still clipped per chunk, only original per-chunk outlines. -->
  <!-- Left side (chunk A) -->
  <circle cx="500" cy="185" r="18"
          fill="#c27ba0" fill-opacity="0.35"
          stroke="#741b47" stroke-width="1.5"
          clip-path="url(#leftClip_post)"/>
  <!-- Right side (chunk B), now same ID/color -->
  <circle cx="525" cy="190" r="18"
          fill="#c27ba0" fill-opacity="0.35"
          stroke="#741b47" stroke-width="1.5"
          clip-path="url(#rightClip_post)"/>

  <!-- Unified ID after merge (small objects) -->
  <text x="510" y="195" font-size="11" font-weight="bold" fill="#741b47" text-anchor="middle">ID 8</text>
  <text x="510" y="212" font-size="9" fill="#741b47" text-anchor="middle">(merged)</text>

</svg>

and

<svg xmlns="http://www.w3.org/2000/svg" width="700" height="260" font-family="sans-serif">

  <!-- Panel titles -->
  <text x="30" y="20" font-size="12" font-weight="bold">Before merge</text>
  <text x="30" y="35" font-size="11">each chunk labeled independently (with overlap)</text>

  <text x="380" y="20" font-size="12" font-weight="bold">After merge</text>
  <text x="380" y="35" font-size="11">high-IoU overlaps become a single ID</text>

  <!-- LEFT PANEL (before merge) -->
  <rect x="20" y="50" width="260" height="190" fill="none" stroke="#000" stroke-width="1.5"/>

  <!-- overlap band -->
  <rect x="130" y="50" width="40" height="190"
        fill="#ddd" fill-opacity="0.6" stroke="none"/>
  <text x="150" y="47" font-size="10" fill="#555" text-anchor="middle">overlap</text>

  <!-- chunk boundary -->
  <line x1="150" y1="50" x2="150" y2="240"
        stroke="#777" stroke-width="2"
        stroke-dasharray="6,4"/>
  <text x="120" y="236" font-size="10" fill="#555" text-anchor="middle">chunk A</text>
  <text x="180" y="236" font-size="10" fill="#555" text-anchor="middle">chunk B</text>

  <!-- BIG OBJECT BEFORE MERGE:
       Show both full outlines.
       ID 12 first (behind, green), then ID 5 on top (blue). -->
  <!-- ID 12 (behind) -->
  <ellipse cx="160" cy="120" rx="45" ry="35"
           fill="#93c47d" fill-opacity="0.35"
           stroke="#274e13" stroke-width="1.5"/>
  <!-- ID 5 (front) -->
  <ellipse cx="140" cy="120" rx="45" ry="35"
           fill="#6fa8dc" fill-opacity="0.35"
           stroke="#1c4587" stroke-width="1.5"/>

  <!-- Labels for big object (before merge) -->
  <text x="110" y="125" font-size="12" font-weight="bold" fill="#1c4587" text-anchor="middle">ID 5</text>
  <text x="190" y="125" font-size="12" font-weight="bold" fill="#274e13" text-anchor="middle">ID 12</text>

  <!-- Object fully inside chunk A -->
  <circle cx="80" cy="90" r="20"
          fill="#ffd966" stroke="#7f6000" stroke-width="1.5"/>
  <text x="80" y="95" font-size="11" font-weight="bold" fill="#7f6000" text-anchor="middle">ID 7</text>

  <!-- SMALL PAIR BEFORE MERGE:
       Draw both full outlines.
       ID 22 under (orange), ID 8 on top (pink). -->
  <circle cx="165" cy="180" r="18"
          fill="#f6b26b" fill-opacity="0.35"
          stroke="#783f04" stroke-width="1.5"/>
  <circle cx="140" cy="175" r="18"
          fill="#c27ba0" fill-opacity="0.35"
          stroke="#741b47" stroke-width="1.5"/>

  <text x="110" y="182" font-size="11" font-weight="bold" fill="#741b47" text-anchor="middle">ID 8</text>
  <text x="205" y="187" font-size="11" font-weight="bold" fill="#783f04" text-anchor="middle">ID 22</text>

  <!-- IoU caption -->
  <text x="150" y="70" font-size="9" fill="#000" text-anchor="middle">compare overlap → IoU</text>
  <text x="150" y="82" font-size="9" fill="#000" text-anchor="middle">merge if IoU ≥ τ</text>

  <!-- Arrow + caption between panels -->
  <line x1="320" y1="140" x2="340" y2="140"
        stroke="#000" stroke-width="2" marker-end="url(#arrow)"/>
  <text x="330" y="115" font-size="10" text-anchor="middle">merge high-IoU</text>
  <text x="330" y="127" font-size="10" text-anchor="middle">labels in overlap</text>

  <defs>
    <marker id="arrow" markerWidth="10" markerHeight="10" refX="8" refY="3" orient="auto">
      <polygon points="0,0 8,3 0,6" fill="#000"/>
    </marker>
  </defs>

  <!-- RIGHT PANEL (after merge) -->
  <rect x="380" y="50" width="260" height="190" fill="none" stroke="#000" stroke-width="1.5"/>

  <!-- overlap band -->
  <rect x="490" y="50" width="40" height="190"
        fill="#ddd" fill-opacity="0.6" stroke="none"/>
  <text x="510" y="47" font-size="10" fill="#555" text-anchor="middle">overlap</text>

  <!-- chunk boundary -->
  <line x1="510" y1="50" x2="510" y2="240"
        stroke="#777" stroke-width="2"
        stroke-dasharray="6,4"/>
  <text x="480" y="236" font-size="10" fill="#555" text-anchor="middle">chunk A</text>
  <text x="540" y="236" font-size="10" fill="#555" text-anchor="middle">chunk B</text>

  <!-- BIG OBJECT AFTER MERGE:
       IDs 5 and 12 have merged.
       Both lobes are now shown in the same blue style (ID 5),
       and the merged outline is also blue. No green anymore. -->
  <ellipse cx="520" cy="120" rx="45" ry="35"
           fill="#6fa8dc" fill-opacity="0.35"
           stroke="#1c4587" stroke-width="1.5"/>
  <ellipse cx="500" cy="120" rx="45" ry="35"
           fill="#6fa8dc" fill-opacity="0.35"
           stroke="#1c4587" stroke-width="1.5"/>

  <!-- Unified merged outline, now blue -->
  <path d="
    M455,120
    A45,35 0 0 1 500,85
    A45,35 0 0 1 520,85
    A45,35 0 0 1 565,120
    A45,35 0 0 1 520,155
    A45,35 0 0 1 500,155
    A45,35 0 0 1 455,120
    Z"
    fill="none"
    stroke="#1c4587"
    stroke-width="1.5"/>

  <!-- merged ID label -->
  <text x="510" y="125" font-size="12" font-weight="bold" fill="#1c4587" text-anchor="middle">ID 5</text>
  <text x="510" y="140" font-size="9" fill="#1c4587" text-anchor="middle">(merged via IoU ≥ τ)</text>

  <!-- unchanged object -->
  <circle cx="440" cy="90" r="20"
          fill="#ffd966" stroke="#7f6000" stroke-width="1.5"/>
  <text x="440" y="95" font-size="11" font-weight="bold" fill="#7f6000" text-anchor="middle">ID 7</text>

  <!-- SMALL PAIR AFTER MERGE:
       They did NOT merge (IoU < τ).
       Keep them separate and colored distinctly:
       ID 22 orange, ID 8 pink, with 8 on top. -->
  <circle cx="525" cy="180" r="18"
          fill="#f6b26b" fill-opacity="0.35"
          stroke="#783f04" stroke-width="1.5"/>
  <circle cx="500" cy="175" r="18"
          fill="#c27ba0" fill-opacity="0.35"
          stroke="#741b47" stroke-width="1.5"/>

  <text x="470" y="182" font-size="11" font-weight="bold" fill="#741b47" text-anchor="middle">ID 8</text>
  <text x="565" y="187" font-size="11" font-weight="bold" fill="#783f04" text-anchor="middle">ID 22</text>

  <!-- no-merge note -->
  <text x="510" y="215" font-size="9" fill="#000" text-anchor="middle">
    no merge: IoU &lt; τ
  </text>

</svg>
