Skip to content

Median filter: incorrect border handling and slow u16/int16 path #16

@Marek55S

Description

@Marek55S

Summary

The median filter implementation in api/src/algorithms.rs computes the median using a fixed window size (2*r+1)^2 for every pixel and clears a large histogram (65,536 bins) per pixel for u16/int16 data. This leads to incorrect medians at image borders and severe performance problems on u16/int16 images.

Symptoms / Reproduction

  1. Apply the median filter with radius r > 0 to an image and inspect pixels near the image edges. Medians are frequently wrong (often 0).
  2. Run the filter on a large u16 image (e.g. 1024×1024) and observe very slow runtime and high CPU usage due to per-pixel clearing of a 65,536-bin histogram.

Actual behavior

  • The algorithm always uses window_size = (2*r+1)^2 to compute the median index, even at borders where fewer pixels are present in the kernel.
  • For u16/int16 images, the code clears a 65,536-bin histogram for every pixel (e.g., hist.fill(0)), which is extremely expensive.

Expected behavior

  • Median computation must use the actual number of pixels present in the kernel at image edges.
  • The u16/int16 path must avoid clearing the full 65,536-bin histogram per pixel and be much faster on large images.

Root cause

  • Border bug: median helpers receive a constant window_size rather than the actual number of pixels added to the histogram (total_count) for kernels overlapping image boundaries.
  • Performance: clearing the full u16 histogram per pixel is O(65536) work per pixel and dominates runtime.

Required changes

  1. Fix border correctness

    • While iterating the kernel, track the actual number of pixels added to the histogram (total_count).
    • Pass total_count into median_from_histogram_* instead of the fixed window_size.
    • Update median helper(s) to compute mid = (total_count + 1) / 2 and use that to select the median.
  2. Optimize u16/int16 performance

    • Stop calling hist.fill(0) for all 65,536 bins per pixel.
    • Suggested approaches (choose based on complexity & expected kernel sizes):
      • (a) Sliding histogram: Maintain a rolling histogram when moving the kernel horizontally so you only add/remove column pixels instead of rebuilding the whole histogram.
      • (b) Small-kernel fast path: For small kernels (area <= threshold), collect neighborhood values into a small Vec/array and use select_nth_unstable to find the median — avoids the huge histogram entirely.
      • (c) Hybrid: Use (b) for small kernels and (a) for large kernels.
  3. Optional: Parallelize per-row processing for native builds (e.g., rayon) behind a feature flag to avoid impacting wasm builds.

Implementation notes & contract

  • Keep the public function signature and behavior stable (input image buffer and radius, output same-type buffer).
  • For i16, convert to indices via an offset (e.g., index = (value as i32 + 32768) as usize) only inside the histogram logic.
  • For even total_count decide and document whether you pick the lower or upper median; existing behavior used mid = (window_size + 1) / 2 (upper median for even counts).

Acceptance criteria

  • Correct medians at image edges (unit tests pass for border cases).
  • u16/int16 median filtering on 1024×1024 images shows substantial speedup vs the current per-pixel full-histogram clear

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions