A ROS 2 package for real-time stereo depth estimation using ONNX models (S2M2, Hitnet, CREStereo). This node processes a 4-camera concatenated stream into 3D point clouds using GPU acceleration (TensorRT/CUDA).
- Multi-Model Support: Switch between S2M2, Hitnet, and CREStereo architectures.
- Parallel Processing: Run inference on multiple stereo pairs either concurrently (NVIDIA multi-stream) or sequentially.
- Efficient Pipeline: Handles image splitting, rectification, inference, and point cloud generation.
- Headless: Publishes compressed debug imagery to ROS topics instead of GUI windows.
- Per-Drone Depth Corrections: Support for calibration-specific baseline and disparity corrections.
- ROS 2 Humble (or compatible)
- CUDA & TensorRT (Required for GPU acceleration)
- Dependencies:
  - `libopencv-dev`
  - `libpcl-dev`
  - `libyaml-cpp-dev`
  - `onnxruntime` (GPU version recommended)
- External Packages: the `oak_ffc_4p_driver_ros2` camera driver
```bash
cd ~/ros2_ws/src
git clone https://github.com/lis-epfl/depth_estimation_ros2 depth_estimation_ros2
python3 src/depth_estimation_ros2/scripts/download_models.py
```

Copy the generated calibration maps into `src/depth_estimation_ros2/config/final_maps_<RES>/`, e.g. `final_maps_224_224`.
```bash
cd ~/ros2_ws
colcon build --symlink-install --packages-select depth_estimation_ros2
source install/setup.bash
```

Make sure the camera driver publishes raw images (`compress_images: false`), since this node subscribes to the raw concatenated stream.

| Parameter | Description |
|---|---|
| `calibration_base_path` | Absolute path to the calibration folder |
| `calibration_resolution` | Resolution suffix (e.g., `224_224`) |
| `onnx_model_path` | Path to the ONNX model |
| `transform_config_path` | Path to `cam0_to_centroid.yaml` |
| Parameter | Default | Description |
|---|---|---|
| `run_parallel` | `true` | Parallel ONNX inference |
| `model_type` | `s2m2` | `hitnet` / `crestereo` / `s2m2` |
| `execution_provider` | `tensorrt` | `cuda` / `cpu` |
| `verbose` | `true` | Publish debug grid |
| `filter_disparity` | `true` | S2M2 filtering |
| `publish_combined_pointcloud` | `false` | Merge clouds |
Small errors in stereo calibration (baseline estimation, principal point alignment, rectification) can introduce systematic errors into the estimated depth. This package supports per-drone corrections via a `depth_corrections.yaml` file in each calibration folder.
The stereo depth equation is:

```
Z = (f × B) / d
```

where `Z` is depth, `f` is focal length, `B` is baseline, and `d` is disparity.
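For intuition, a quick numeric check (the focal length, baseline, and disparity here are made-up values, not this package's calibration):

```python
f = 224.0  # focal length in pixels (assumed)
B = 0.08   # baseline in meters (assumed)
d = 7.0    # disparity in pixels (assumed)

Z = (f * B) / d
print(f"Z = {Z:.2f} m")  # Z = 2.56 m
# A 1 px disparity error shifts this to 17.92 / 8 = 2.24 m,
# so a fixed disparity bias hurts more at long range.
```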
Two correction parameters are available:
| Parameter | Effect | When to Use |
|---|---|---|
| `baseline_scale` | Multiplies the baseline: `Z_corrected = Z_original × scale` | Depth is consistently over/underestimated by a fixed percentage |
| `disparity_offset` | Subtracted from the disparity: `d_corrected = d - offset` | Error increases with distance (far objects have larger errors) |
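Combining both corrections in the depth equation gives `Z_corrected = (f × B × baseline_scale) / (d - disparity_offset)`. A minimal sketch (the variable names are illustrative, not the node's internals):

```python
def corrected_depth(f: float, B: float, d: float,
                    baseline_scale: float = 1.0,
                    disparity_offset: float = 0.0) -> float:
    """Stereo depth with the two per-drone corrections applied."""
    return (f * B * baseline_scale) / (d - disparity_offset)

# Defaults reduce to the plain equation above (same made-up numbers):
print(corrected_depth(224.0, 0.08, 7.0))             # 2.56 m
print(corrected_depth(224.0, 0.08, 7.0, 1.07, 0.5))  # ≈ 2.95 m
```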
| Observation | Likely Cause | Fix |
|---|---|---|
| All depths too short (underestimate) | Baseline too small | baseline_scale > 1.0 |
| All depths too long (overestimate) | Baseline too large | baseline_scale < 1.0 |
| Near objects accurate, far objects wrong | Disparity offset | Adjust disparity_offset |
| Error ratio (real/est) increases with distance | Combined error | Need both corrections |
To calibrate the corrections you will need:
- Motion capture system (Vicon, OptiTrack, etc.) or an accurate tape measure
- Flat wall or target at known distances
- Access to Foxglove (recommended) or RViz2 for point cloud visualization
- **Set up the environment**

  ```bash
  # Terminal 1: start the camera driver
  ros2 launch oak_ffc_4p_driver_ros2 oak_ffc_4p_driver.launch.py

  # Terminal 2: start depth estimation (with verbose mode)
  ros2 launch depth_estimation_ros2 depth_estimation.launch.py

  # Terminal 3: open RViz2
  rviz2
  # Or, if using Foxglove, set the corresponding ROS_DOMAIN_ID, open the Foxglove app, and run:
  ROS_DOMAIN_ID=0 ros2 launch foxglove_bridge foxglove_bridge_launch.xml
  ```
- **Configure RViz2**
  - Add PointCloud2 displays for each pair: `/depth/pointcloud/pair_0_1`, etc.
  - Set the fixed frame to match your `pointcloud_frame_id`
  - Use different colors for each pair to distinguish them

  If using Foxglove:
  - Add a 3D panel.
  - In `Frame`, set the display frame to match your `pointcloud_frame_id`.
  - In `Topics`, choose the point clouds to visualize and set different colors.
- **Collect measurements at multiple distances**

  Position a flat surface (wall) at known distances from the camera. For each distance:

  | Distance | Why |
  |---|---|
  | ~1.2 m | Near-range reference |
  | ~2.5 m | Mid range |
  | ~3.5 m | Mid-far range |
  | ~4.5 m+ | Far range (most sensitive to disparity offset) |
- **Record ground truth and estimated depths**

  For each stereo pair and each distance:
  - Ground truth: use the mocap position or a tape measure
  - Estimated: read from RViz2 (hover over the point cloud or use the "Publish Point" tool); in Foxglove, click the ruler icon at the top right of the 3D panel to measure the exact distance

  Example data format:

  ```
  Distance | Real (m) | Pair 0_1 | Pair 1_2 | Pair 2_3 | Pair 3_0
  ---------|----------|----------|----------|----------|----------
  Near     | 1.20     | 1.04     | 0.90     | 1.18     | 1.05
  Mid      | 2.45     | 2.00     | 1.60     | 2.40     | 2.00
  Mid-far  | 3.61     | 2.76     | 2.06     | 3.61     | 2.79
  Far      | 4.60     | 3.27     | 2.33     | 4.60     | 3.54
  ```
- **Analyze the error pattern**

  Calculate the ratio `Real / Estimated` for each measurement (a short script for this is sketched below):

  ```
  Distance | Pair 0_1 Ratio | Pair 1_2 Ratio | ...
  ---------|----------------|----------------|----
  Near     | 1.15           | 1.33           |
  Mid      | 1.23           | 1.53           |
  Mid-far  | 1.31           | 1.75           |
  Far      | 1.41           | 1.97           |
  ```

  - Constant ratio → only `baseline_scale` is needed (= the average ratio)
  - Increasing ratio with distance → both corrections are needed
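  A minimal sketch of this ratio check, reusing the pair 0_1 numbers from the example table above:

  ```python
  import numpy as np

  real = np.array([1.20, 2.45, 3.61, 4.60])          # ground-truth distances (m)
  est_pair_0_1 = np.array([1.04, 2.00, 2.76, 3.27])  # estimated depths (m)

  ratios = real / est_pair_0_1
  print(np.round(ratios, 2))  # [1.15 1.23 1.31 1.41] -> grows with distance
  ```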
- **Calculate correction parameters**

  Simple case (constant ratio):

  ```
  baseline_scale = average(Real / Estimated)
  disparity_offset = 0.0
  ```

  Complex case (ratio increases with distance): the error model is linear in inverse-depth space:

  ```
  1/Z_est = baseline_scale × (1/Z_real) + offset_factor
  ```

  Fit this line to your data points `(1/Z_real, 1/Z_est)` to get:
  - `baseline_scale` from the slope
  - `disparity_offset` from the intercept, since `offset_factor = disparity_offset / (f × B)` (this requires knowing the focal length and baseline)

  Practical approximation (see the fit sketch below):
  - Start with `baseline_scale` = the ratio at the near distance
  - Adjust `disparity_offset` until the far distance is correct
  - Iterate if needed
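  A least-squares fit of this line, using the pair 0_1 example data; the focal length and baseline are placeholders (take them from your own calibration), so the printed offset is illustrative:

  ```python
  import numpy as np

  z_real = np.array([1.20, 2.45, 3.61, 4.60])  # ground truth (m)
  z_est  = np.array([1.04, 2.00, 2.76, 3.27])  # pair 0_1 estimates (m)

  # Fit 1/Z_est = slope * (1/Z_real) + intercept
  slope, intercept = np.polyfit(1.0 / z_real, 1.0 / z_est, 1)

  f = 224.0  # focal length in pixels (placeholder)
  B = 0.08   # baseline in meters (placeholder)

  baseline_scale   = slope              # ≈ 1.07 for this data
  disparity_offset = intercept * f * B  # offset_factor = offset / (f * B)

  print(f"baseline_scale   = {baseline_scale:.2f}")
  print(f"disparity_offset = {disparity_offset:.2f} px")
  ```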
- **Create the corrections file**

  Create `depth_corrections.yaml` in your calibration folder:

  ```yaml
  # config/final_maps_256_160/depth_corrections.yaml
  depth_correction:
    # Order matches stereo_pairs: ["0_1", "1_2", "2_3", "3_0"]
    baseline_scales: [1.07, 1.45, 0.92, 1.04]
    disparity_offsets: [1.0, 0.5, -0.9, 1.0]
  ```
- **Verify corrections**

  Restart the depth estimation node and repeat the measurements to verify accuracy.
| Scenario | `baseline_scale` | `disparity_offset` |
|---|---|---|
| Depths too short (underestimate) | > 1.0 | > 0 (if error grows with distance) |
| Depths too long (overestimate) | < 1.0 | < 0 (if error grows with distance) |
| No correction needed | 1.0 | 0.0 |
```
config/
└── final_maps_256_160/
    ├── final_map_config_0_1.yaml
    ├── final_map_config_1_2.yaml
    ├── final_map_config_2_3.yaml
    ├── final_map_config_3_0.yaml
    ├── final_rectified_to_fisheye_map_*.yml
    └── depth_corrections.yaml   ← Per-drone corrections
```
If depth_corrections.yaml is missing, the node uses default values (no correction) and logs a warning.
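A sketch of this load-or-default behavior (illustrative Python only; the node itself is not necessarily implemented this way, and the helper name is hypothetical):

```python
import os
import yaml  # PyYAML

def load_depth_corrections(calib_dir: str, n_pairs: int = 4):
    """Read depth_corrections.yaml, falling back to no-op corrections."""
    path = os.path.join(calib_dir, "depth_corrections.yaml")
    if not os.path.exists(path):
        print(f"WARN: {path} missing, using defaults (no correction)")
        return [1.0] * n_pairs, [0.0] * n_pairs
    with open(path) as fh:
        cfg = yaml.safe_load(fh)["depth_correction"]
    scales, offsets = cfg["baseline_scales"], cfg["disparity_offsets"]
    # Arrays must have exactly one entry per stereo pair
    assert len(scales) == len(offsets) == n_pairs
    return scales, offsets
```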
```bash
ros2 launch depth_estimation_ros2 depth_estimation.launch.py
```

| Type | Topic | Description |
|---|---|---|
| Sub | `/oak_ffc_4p_driver_node/image_raw` | Concatenated 4-camera image |
| Pub | `/depth/pointcloud/pair_X_Y` | Stereo pair clouds |
| Pub | `/depth/pointcloud/combined` | Full cloud |
| Pub | `/depth/debug_grid/compressed` | Debug disparity grid |
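For a quick programmatic check of the output, a minimal `rclpy` subscriber (the node and variable names here are illustrative, not part of this package):

```python
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import PointCloud2

class CloudListener(Node):
    """Print the size of each cloud arriving on one pair topic."""
    def __init__(self):
        super().__init__('cloud_listener')
        self.create_subscription(
            PointCloud2, '/depth/pointcloud/pair_0_1', self.on_cloud, 10)

    def on_cloud(self, msg: PointCloud2):
        # width * height = number of points (also valid for unorganized clouds)
        self.get_logger().info(f'received {msg.width * msg.height} points')

def main():
    rclpy.init()
    rclpy.spin(CloudListener())

if __name__ == '__main__':
    main()
```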
If the corrections do not seem to load:
- Check that `depth_corrections.yaml` is in the correct calibration folder
- Verify the YAML syntax (the arrays must have exactly 4 elements)
- Check the node logs for the "Loading depth corrections from:" message

If one pair is much worse than the others:
- That pair likely has calibration issues
- Consider re-running the stereo calibration for that pair
- Use larger correction values for that specific pair
