The goal of the project is to find the best depth map prediction model for inference on a mobile device. One of the main constraints was the model's inference time, so the aim was to use a convolutional neural network rather than a vision transformer. The RT Mono Depth model was chosen as the baseline and adapted for the project.
Compared to the baseline, an additional depth map is extracted from the encoder bottleneck and used as a depth distribution over the image.
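A minimal sketch of this idea, assuming a generic encoder-decoder; the class name, layer sizes, and the shape of the auxiliary output are all illustrative, not the actual RT Mono Depth architecture:

```python
import torch
import torch.nn as nn

class DepthNetWithAuxHead(nn.Module):
    """Toy encoder-decoder with an extra head on the bottleneck that
    predicts a coarse depth map (all layer sizes are illustrative)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Auxiliary head: a coarse depth map ("depth distribution")
        # taken directly from the encoder bottleneck.
        self.aux_head = nn.Conv2d(32, 1, 3, padding=1)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), self.aux_head(z)

model = DepthNetWithAuxHead()
# Full-resolution depth plus the coarse bottleneck depth.
depth, coarse = model(torch.randn(1, 3, 64, 64))
```

The auxiliary output is supervised separately (see the loss description below), which gives the encoder a more direct training signal.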
Two datasets were used for training:
- NYU Depth V2 (25% of the dataset) as the base dataset
- ARKit Scenes Dataset as an additional dataset
The model was trained in Google Colab (on an A100 GPU) using a standard PyTorch training loop.
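A sketch of such a training loop; the model, optimizer, learning rate, and criterion here are stand-ins (the real project uses the RT Mono Depth network and the combined losses described below):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in model and loss; hyperparameters are illustrative.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(8, 1, 3, padding=1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.L1Loss()

def train_epoch(loader):
    """One pass over the loader; returns the mean batch loss."""
    model.train()
    total = 0.0
    for images, targets in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()
        total += loss.item()
    return total / len(loader)

# Dummy batch standing in for NYU Depth V2 / ARKit Scenes samples.
loader = [(torch.randn(2, 3, 32, 32), torch.rand(2, 1, 32, 32))]
avg_loss = train_epoch(loader)
```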
The training setup and results are described below.
A combination of Structural Similarity Index (SSIM), L1, and smoothness losses was used for the depth map.
The loss is anchored by the L1 term, while the SSIM and smoothness terms help the model produce sharper details in the depth map.
For the depth distribution, RMSE loss was used.
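A sketch of how such a combined loss can be assembled; the SSIM here is a common simplified windowed variant, the smoothness term is the usual edge-aware formulation, and the loss weights are assumptions, not the project's actual values:

```python
import torch
import torch.nn.functional as F

def ssim_loss(x, y):
    """Simplified windowed SSIM (3x3 average-pooling windows)."""
    C1, C2 = 0.01 ** 2, 0.03 ** 2
    mu_x = F.avg_pool2d(x, 3, 1, 1)
    mu_y = F.avg_pool2d(y, 3, 1, 1)
    sigma_x = F.avg_pool2d(x * x, 3, 1, 1) - mu_x ** 2
    sigma_y = F.avg_pool2d(y * y, 3, 1, 1) - mu_y ** 2
    sigma_xy = F.avg_pool2d(x * y, 3, 1, 1) - mu_x * mu_y
    ssim = ((2 * mu_x * mu_y + C1) * (2 * sigma_xy + C2)) / \
           ((mu_x ** 2 + mu_y ** 2 + C1) * (sigma_x + sigma_y + C2))
    return torch.clamp((1 - ssim) / 2, 0, 1).mean()

def smoothness_loss(depth, image):
    """Edge-aware smoothness: penalize depth gradients away from image edges."""
    dd_x = torch.abs(depth[..., :, 1:] - depth[..., :, :-1])
    dd_y = torch.abs(depth[..., 1:, :] - depth[..., :-1, :])
    di_x = torch.mean(torch.abs(image[..., :, 1:] - image[..., :, :-1]), 1, keepdim=True)
    di_y = torch.mean(torch.abs(image[..., 1:, :] - image[..., :-1, :]), 1, keepdim=True)
    return (dd_x * torch.exp(-di_x)).mean() + (dd_y * torch.exp(-di_y)).mean()

def depth_loss(pred, target, image, w_ssim=0.85, w_l1=0.1, w_smooth=0.05):
    # Weights are illustrative assumptions, not the project's values.
    return (w_ssim * ssim_loss(pred, target)
            + w_l1 * F.l1_loss(pred, target)
            + w_smooth * smoothness_loss(pred, image))

def distribution_loss(pred, target):
    # RMSE for the auxiliary depth-distribution output.
    return torch.sqrt(F.mse_loss(pred, target))

# Sanity check: identical prediction and target give (near-)zero loss.
zero = depth_loss(torch.ones(1, 1, 8, 8), torch.ones(1, 1, 8, 8),
                  torch.ones(1, 3, 8, 8))
```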
The following metrics were used to assess model quality:
- AbsRel
- SqRel
- RMSE
- RMSE Logarithmic
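These are the standard monocular-depth error metrics; a minimal implementation (the `eps` clamp for valid pixels is an assumption):

```python
import torch

def depth_metrics(pred, gt, eps=1e-6):
    """AbsRel, SqRel, RMSE, and RMSE log between predicted and
    ground-truth depth maps (depths assumed positive)."""
    pred = pred.clamp(min=eps)
    gt = gt.clamp(min=eps)
    diff = pred - gt
    return {
        "AbsRel": (diff.abs() / gt).mean().item(),
        "SqRel": (diff ** 2 / gt).mean().item(),
        "RMSE": torch.sqrt((diff ** 2).mean()).item(),
        "RMSE_log": torch.sqrt(((pred.log() - gt.log()) ** 2).mean()).item(),
    }

# A perfect prediction yields zero for every metric.
m = depth_metrics(torch.full((2, 1, 4, 4), 2.0), torch.full((2, 1, 4, 4), 2.0))
```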
As a result, the trained model meets the required inference time on a mobile device and outperforms the baseline in quality. In addition, smoothing the predicted depth maps with a moving average, as shown in the notebook, produced higher-quality depth maps, which makes the model suitable for augmented reality applications.
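The moving-average smoothing can be sketched as follows; the window size and the class name are assumptions, since the notebook's exact parameters are not stated here:

```python
from collections import deque
import torch

class DepthSmoother:
    """Averages the last `window` predicted depth frames to suppress
    frame-to-frame flicker (window size is an illustrative choice)."""
    def __init__(self, window=5):
        self.frames = deque(maxlen=window)

    def __call__(self, depth):
        self.frames.append(depth)
        return torch.stack(list(self.frames)).mean(dim=0)

# Feeding identical frames leaves the depth map unchanged.
smoother = DepthSmoother(window=3)
out = smoother(torch.ones(1, 4, 4))
out = smoother(torch.ones(1, 4, 4))
```

For a live AR feed, an exponential moving average would avoid storing past frames; the windowed average above matches the "moving average" described in the text.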