Open
Description
Currently, mmengine.fileio.list_dir_or_file doesn't support glob pattern matching when listing files. Python's standard-library glob.glob exists, but it only works with the local filesystem and cannot be used with other storage backends.
Proposed Solution
Add two new API functions:
def glob(pattern, *, recursive=False, backend_args=None):
    """Return a list of paths matching a pathname pattern."""
    pass

def iglob(pattern, *, recursive=False, backend_args=None):
    """Return an iterator yielding paths matching a pathname pattern."""
    pass
Example usage:
from mmengine.fileio import glob
# List all jpg files in a directory
files = glob('s3://bucket/path/*.jpg', backend_args={'access_key': '...'})
# Recursively find all .png files
files = glob('local/path/**/*.png', recursive=True)
Current workaround requires manual filtering:
from mmengine.fileio import list_dir_or_file
import fnmatch
files = list_dir_or_file('s3://path/', list_dir=False)
jpg_files = [f for f in files if fnmatch.fnmatch(f, '*.jpg')]
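One pitfall of the manual workaround, shown on a made-up in-memory listing (the `listing` values are illustrative only): the `*` in fnmatch also matches `/`, so once a listing contains nested relative paths, `'*.jpg'` no longer means "top-level jpg files" the way it does in glob.

```python
import fnmatch

# Hypothetical relative paths, as a recursive listing might return them
listing = ['a.jpg', 'sub/b.jpg', 'c.png']

# fnmatch treats '/' like any other character, so nested files match too
matched = [f for f in listing if fnmatch.fnmatchcase(f, '*.jpg')]

# Extra filtering is needed to get glob-like "this directory only" behaviour
top_level_only = [f for f in matched if '/' not in f]
```

Here `matched` contains both `a.jpg` and `sub/b.jpg`, while `top_level_only` contains only `a.jpg`. Every caller of the workaround has to remember this kind of detail.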
Having a backend-agnostic glob implementation would:
- Provide consistent pattern matching across different storage backends
- Simplify file filtering without manual pattern matching
- Match the functionality users expect from standard file operations
- Improve code readability when working with specific file patterns
Would appreciate feedback on this proposal. Thank you!