-
Notifications
You must be signed in to change notification settings - Fork 36
Add multi-channel support #279
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds multi-channel input support to the lightly_train framework by introducing a num_input_channels
parameter throughout the system. This enables training on images with different channel counts beyond the standard 3-channel RGB format.
Key changes include:
- Added
num_input_channels
parameter to all method constructors and model creation functions - Updated transform configurations to support multi-channel normalization and handle incompatible transforms
- Modified DINOv2/DINOv3 models to accept different input channel counts with proper weight loading
- Enhanced datasets to determine image mode based on channel count
Reviewed Changes
Copilot reviewed 57 out of 57 changed files in this pull request and generated 6 comments.
Show a summary per file
File | Description |
---|---|
tests/helpers.py |
Updated test helper methods to include num_input_channels parameter |
tests/_models/test_package_helpers.py |
Added num_input_channels to all model creation tests |
tests/_methods/*/test_*.py |
Updated method instantiation tests with num_input_channels parameter |
tests/_data/test_*.py |
Modified dataset tests to include channel count parameter |
tests/_commands/test_*.py |
Added multi-channel training tests and updated existing tests |
src/lightly_train/_transforms/*.py |
Enhanced transform handling for multi-channel support and auto-resolution |
src/lightly_train/_models/*/ |
Updated model packages to support configurable input channels |
src/lightly_train/_methods/*/ |
Added num_input_channels to all method constructors |
src/lightly_train/_data/*.py |
Modified datasets to handle multi-channel images |
src/lightly_train/_commands/*.py |
Updated training commands to propagate channel count |
Comments suppressed due to low confidence (2)
src/lightly_train/_transforms/semantic_segmentation_transform.py:1
- The type annotation for
random_flip
has been changed fromRandomFlipArgs
toRandomFlipArgs | None
, but it still has a default factory. This creates inconsistency - if it can be None, the default should be None instead of creating an instance. The same issue exists withcolor_jitter
.
#
src/lightly_train/_data/mask_semantic_segmentation_dataset.py:1
- The import
import albumentations as A
has been removed but was likely used elsewhere in the file. Ensure that all references toA
have been properly updated to use the full module name or alternative imports.
#
src/lightly_train/_task_models/dinov2_linear_semantic_segmentation/task_model.py
Show resolved
Hide resolved
src/lightly_train/_task_models/dinov2_eomt_semantic_segmentation/task_model.py
Show resolved
Hide resolved
src/lightly_train/_task_models/dinov2_eomt_semantic_segmentation/transforms.py
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly good. I left some comments.
The biggest issue is that when num_channels
is auto
with a default normalization in transform_args
the num_channels
is set to 3 and used afterwards.
The suggestions for adding multi-channel to some of the tests are optional. I believe if the high-level tests for .train()
or train_sem_seg()
work then we should be fine.
Btw don't forget the changelog :)
src/lightly_train/_task_models/dinov2_linear_semantic_segmentation/transforms.py
Show resolved
Hide resolved
src/lightly_train/_task_models/dinov3_eomt_semantic_segmentation/transforms.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
What has changed and why?
transform_args={"num_channels": <num channels>}
LIGHTLY_TRAIN_IMAGE_MODE
when loading images with more than 3 channelsMulti-channel support is currently limited to 4 channel images because we use PIL for image loading. We'll add support for TIFF/DICOM with more than 4 channels in a follow-up PR.
The PR looks huge but is mostly passing
num_channels
variables around. I usednum_channels
when the code is related to the image data (e.g. dataset, transforms, etc.). Andnum_input_channels
when the code is in a model context. This is to disambiguate between input and output channels in a model.How has it been tested?
Did you update CHANGELOG.md?
Did you update the documentation?