Skip to content

DOC: fix the description of filter_files #823

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 13, 2025

Conversation

mslw
Copy link
Contributor

@mslw mslw commented May 8, 2025

I might be confused, but the docs say "Any files where this function returns True will be filtered out." and I am convinced that it is the opposite.

The filter_files is used in the reproin heuristic (where it always returns True), with this docstring: "Return True if a file should be kept, else False". This commit adjusts the docs to effectively state the same (it used to say the opposite).

The docstring in the heudiconv heuristic was right: filter_files is fed to dicoms.py::group_dicoms_into_seqinfos() as file_filter, where it is applied with built-in filter() to the to-be-processed file list; filter() yields items for which function(item) == True.

N.B. This change makes it clear that filter_files and filter_dicom have opposite logic. The latter is fed to
dicoms.py::group_dicoms_into_seqinfos() as dcmfilter; dcmfilter is passed into dicoms.py::validate_dicom(); validate_dicom returns None (meaning: not valid) if dcmfilter(mw.dcm_data) is True.

So for a file to be kept, filter_files must return True, and filter_dicom must return False. This commit changes the documentation rather than the logic, assuming that the logic has been present long enough to warrant preservation (it is also easier to only change the docs).

The filter_files is used in the reproin heuristic (where it always
returns True), with this docstring: "Return True if a file should be
kept, else False". This commit adjusts the docs to effectively state
the same (it used to say the opposite).

The docstring in the heudiconv heuristic was right: filter_files is
fed to `dicoms.py::group_dicoms_into_seqinfos()` as `file_filter`,
where it is applied with built-in `filter()` to the to-be-processed
file list; `filter()` yields items for which `function(item) == True`.

N.B. This change makes it clear that `filter_files` and `filter_dicom`
have opposite logic. The latter is fed to
`dicoms.py::group_dicoms_into_seqinfos()` as `dcmfilter`; `dcmfilter`
is passed into `dicoms.py::validate_dicom()`; validate_dicom returns
`None` (meaning: *not valid*) if `dcmfilter(mw.dcm_data)` is `True`.

So for a file to be kept, `filter_files` must return `True`, and
`filter_dicom` must return `False`. This commit changes the
documentation rather than the logic, assuming that the logic has been
present long enough to warrant preservation (it is also easier to only
change the docs).
Copy link

codecov bot commented May 8, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.55%. Comparing base (1d420e5) to head (7cb7e5e).
Report is 3 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #823   +/-   ##
=======================================
  Coverage   82.55%   82.55%           
=======================================
  Files          42       42           
  Lines        4340     4340           
=======================================
  Hits         3583     3583           
  Misses        757      757           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR corrects the documentation for the filter_files function to accurately reflect its filtering behavior.

  • Updated description in docs/heuristics.rst to state that files where filter_files returns False are filtered out.
  • Clarified the opposite behavior between filter_files and filter_dicom.
Comments suppressed due to low confidence (1)

docs/heuristics.rst:86

  • [nitpick] Consider rephrasing for clarity and consistency, e.g., 'Only files for which the function returns True are kept, and all others are filtered out.'
files where this function returns ``False`` will be filtered out (files where this function returns ``True`` will be kept).

Copy link
Member

@yarikoptic yarikoptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again! For this PR we are all set, just let's decide on extra note.

But going forward -

interested to take on it @mslw ?

@yarikoptic yarikoptic added the documentation Changes only affect the documentation label May 8, 2025
@mslw
Copy link
Contributor Author

mslw commented May 9, 2025

But going forward -

* [add handling of `filter_dicoms` to parallel `filter_files` and deprecated `filter_dicom` #824](https://github.com/nipy/heudiconv/issues/824)

interested to take on it @mslw ?

Thanks! I like the idea of making them both consistent, but for now I have to pass - too much happening at once. Maybe the next time I'm running heudiconv, unless somebody else tackles that before.

Co-authored-by: Michał Szczepanik <[email protected]>
@yarikoptic yarikoptic merged commit 2a15cdf into nipy:master May 13, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Changes only affect the documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants