an attempt at a faithful implementation of dinov2-style pretraining on 3d volumes.
- the dinov2_eva model is from dynamic-network-architectures, with some minimal changes
- the augmentation library is a loosely modified batchgeneratorsv2
- normalization is mostly borrowed from nnunetv2
- rope is from the dinov3 impl, extended to support 3d
this implementation is still incomplete: pretraining works, but finetuning is not yet written.
NOTE: a newer v2 backbone config exists and should generally be preferred for new runs, but the default remains the older config so that older checkpoints continue to load without config changes.
To select the newer defaults explicitly, set model.model_type to v2 in the config:
```json
{
  "model": {
    "model_type": "v2",
    "embedding_type": "default",
    "global_crops_size": [96, 96, 96],
    "local_crops_size": [48, 48, 48]
  }
}
```

`pretrain.py` can optionally run small downstream segmentation trainings during pretraining.
- set `task_eval_every` to a positive step cadence to enable it
- choose `eval_task` as `both`, `surfaces`, or `ink`
- set `eval_task_train_iters` to control the mini-training length, default `500`
- set `eval_task_decoder_type` to `simple` or `patch_encode_decode`
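assuming the eval options sit at the top level of the same JSON config shown above (the exact nesting may differ), enabling the downstream eval might look like this, with illustrative values:

```json
{
  "task_eval_every": 1000,
  "eval_task": "both",
  "eval_task_train_iters": 500,
  "eval_task_decoder_type": "simple"
}
```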
The task data is downloaded with `python -m dinovol_2.eval.download_data --task both`.
- `both` now means `surfaces` plus `ink`
- `surfaces` is resized 2x before crops are drawn
- `surfaces` and `ink` each use the first 10 sorted samples as the deterministic validation set
- `ink` is not resized before crops are drawn
- train and validation crops are taken from precomputed chunks that contain some foreground and at least 50% background in supervised voxels
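the chunk acceptance rule above can be sketched roughly as follows; the function name, the `label > 0` foreground convention, and the array layout are assumptions for illustration, not the repo's actual API:

```python
import numpy as np

def chunk_is_valid(label: np.ndarray, supervision_mask: np.ndarray) -> bool:
    """Hypothetical sketch of the chunk filter: keep a chunk only if it
    contains some foreground and at least 50% background among the
    supervised voxels. Assumes label > 0 marks foreground."""
    supervised = supervision_mask > 0
    n_supervised = supervised.sum()
    if n_supervised == 0:
        return False
    foreground = (label > 0) & supervised
    background = (label == 0) & supervised
    return bool(foreground.any()) and (background.sum() / n_supervised) >= 0.5
```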
- the saved validation image contains one row per validation sample, with image / label / prediction panels
- for `ink`, voxels with `supervision_mask == 0` are ignored and supervised unlabeled voxels are treated as background
- for `ink`, loss/metrics and saved previews use a max projection across Z to match the flat ink trainer
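the masked Z max-projection described above can be sketched like this; the function name and the (Z, Y, X) axis order are assumptions, not the repo's exact code:

```python
import numpy as np

def ink_preview_projection(pred: np.ndarray, supervision_mask: np.ndarray) -> np.ndarray:
    """Hedged sketch: drop unsupervised voxels, then max-project over Z
    so 2D loss/metrics and previews match the flat ink trainer.
    Assumes (Z, Y, X) layout; columns with no supervision become 0."""
    masked = np.where(supervision_mask > 0, pred, -np.inf)  # ignore unsupervised voxels
    proj = masked.max(axis=0)                               # collapse Z -> (Y, X)
    return np.where(np.isinf(proj), 0.0, proj)
```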
There is a small napari helper for checkpoint inspection at `dinovol_2/eval/napari_visualizer.py`.
Run it with `python -m dinovol_2.eval.napari_visualizer`.

Workflow:
- open an OME-Zarr from the widget, click `Load Scales`, choose the desired scale, and click `Open Zarr`
- draw a rectangle in the generated `*_bbox` shapes layer; this 2D YX bbox is applied across the full Z span of the selected scale
- add one or more points in a `Points` layer
- choose a `pretrain.py` checkpoint, image layer, and points layer in the dock widget
- click `Cache Embeddings`
- click `Show Feature PCA` to render a 3-channel PCA view of the cached patch embeddings
- optionally enable `Otsu Foreground Mask` and set `Mask Dilation` before creating the PCA layer
- click `Similarity For Selected Points` or `Similarity For All Points`
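a 3-channel PCA view of patch embeddings can be produced along these lines; this is an illustrative sketch with an assumed `(N, D)` embedding layout, not the widget's actual code:

```python
import numpy as np

def embeddings_to_pca_rgb(emb: np.ndarray) -> np.ndarray:
    """Hypothetical sketch: project an (N, D) grid of patch embeddings onto
    its top 3 principal components and min-max scale each channel to [0, 1]
    so the scores can be displayed as RGB."""
    centered = emb - emb.mean(axis=0, keepdims=True)
    # top-3 right singular vectors of the centered data = top-3 principal axes
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:3].T                      # (N, 3) PCA scores
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    return (scores - lo) / np.maximum(hi - lo, 1e-8)  # per-channel [0, 1]
```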
The widget rebuilds the teacher backbone from the saved checkpoint config, computes a patch embedding grid only inside the active bbox for the selected OME-Zarr scale, and limits the PCA and cosine-similarity outputs to that same crop. The dock widget opens on the bottom of the napari window.
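the cosine-similarity output can be sketched as below; the helper name and the `(N, D)` grid-of-embeddings layout are assumptions for illustration:

```python
import numpy as np

def cosine_similarity_map(grid: np.ndarray, query: np.ndarray) -> np.ndarray:
    """Hypothetical sketch: cosine similarity between one query patch
    embedding of shape (D,) and a cached embedding grid of shape (N, D)."""
    g = grid / np.linalg.norm(grid, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    return g @ q  # (N,) similarities in [-1, 1]
```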