Bring benchmark code back to the latest code #45

XinyuZeng · 2025-11-08T12:45:38Z

Also include a uv.lock for easier reproduction.

This commit adds VLA, LeRobot loaders and a comprehensive benchmarking script to evaluate loading performance across different robotics data formats (VLA, HDF5, RLDS, LeRobot/HuggingFace). The VLA loader includes both shuffled (with multiprocessing) and non-shuffled variants for flexible data loading workflows.

Key additions:

- VLALoader: Shuffled loader with multiprocessing and prefetch buffer
- NonShuffleVLALoader: Sequential loader for deterministic iteration
- LeRobotLoader: Support for HuggingFace-format datasets
- benchmarks/openx.py: Performance benchmarking across formats
- examples: Format conversion utilities (RLDS->VLA, VLA->HDF5)
- HDF5Loader: Added split parameter for train/val splits

The benchmark script measures loading times, average trajectory sizes, and per-batch performance metrics with configurable batch sizes and format selection.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
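For readers unfamiliar with per-batch timing, here is a minimal sketch of how such a measurement could look. It is not the code in benchmarks/openx.py; `time_batches` is a hypothetical helper, and it assumes only that the loader is an ordinary Python iterable.

```python
import time
from typing import Iterable, List


def time_batches(loader: Iterable, num_batches: int) -> List[float]:
    """Return the wall-clock seconds spent fetching each of the first batches.

    Hypothetical helper for illustration; stops early if the loader runs out.
    """
    durations: List[float] = []
    iterator = iter(loader)
    for _ in range(num_batches):
        start = time.perf_counter()
        try:
            next(iterator)  # fetch one batch; only this call is timed
        except StopIteration:
            break  # loader exhausted before num_batches was reached
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    timings = time_batches(range(3), num_batches=5)
    print(f"{len(timings)} batches, total {sum(timings):.6f}s")
```

Timing only the `next()` call keeps any per-batch post-processing out of the measurement, which may or may not match what the real script reports.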

XinyuZeng · 2025-11-08T12:46:50Z

benchmarks/openx.py

+        for batch_num, data in enumerate(loader):
+            if batch_num >= self.num_batches:
+                break
+            # self._recursively_load_data(data)


TBH, I do not fully understand the purpose of this function here. Is it just to ensure the data is actually loaded (for debugging)?
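One possible reading, stated as an assumption and not confirmed by the code: a helper like this walks the nested batch and touches every leaf so that lazily loaded arrays are really read inside the timed loop. A minimal sketch of that idea follows; the function below is hypothetical and is not the `_recursively_load_data` from the PR.

```python
from typing import Any


def recursively_load_data(data: Any) -> int:
    """Walk a nested batch, force each buffer-like leaf into memory,
    and return the total number of bytes touched.

    Hypothetical illustration; the PR's own helper may behave differently.
    """
    if isinstance(data, dict):
        return sum(recursively_load_data(v) for v in data.values())
    if isinstance(data, (list, tuple)):
        return sum(recursively_load_data(v) for v in data)
    # Assumption: lazy array-likes expose __array__, and calling it
    # materializes the data as an in-memory array.
    to_array = getattr(data, "__array__", None)
    if callable(to_array):
        data = to_array()
    try:
        # Works for any object that supports the buffer protocol.
        return memoryview(data).nbytes
    except TypeError:
        return 0  # strings, numbers, None, and other non-buffer leaves


if __name__ == "__main__":
    from array import array

    batch = {
        "observation": {"image": bytes(6)},
        "action": [array("f", [0.0, 1.0, 2.0, 3.0])],
        "language_instruction": "open the door",
    }
    print(recursively_load_data(batch))
```

If that reading is right, leaving the call commented out means the benchmark may time only the loader's iteration and not the actual reads.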

XinyuZeng · 2025-11-08T12:48:33Z

robodm/loader/lerobot.py

+logger = logging.getLogger(__name__)
+
+
+class LeRobotLoader(BaseLoader):


This code is from the mkv branch. Unlike the other loaders, which include random shuffling, I think LeRobotLoader does not shuffle. Maybe we should add it?
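A minimal sketch of what episode-level shuffling could look like, assuming the loader can address episodes by integer index. `shuffled_batches` and its parameters are hypothetical and not part of LeRobotLoader.

```python
import random
from typing import Iterator, List, Optional


def shuffled_batches(
    num_episodes: int, batch_size: int, seed: Optional[int] = None
) -> Iterator[List[int]]:
    """Yield batches of episode indices in a shuffled order.

    Hypothetical helper: a loader would map each index to an episode.
    Passing a seed makes the order reproducible across runs.
    """
    indices = list(range(num_episodes))
    random.Random(seed).shuffle(indices)  # private RNG, global state untouched
    for start in range(0, num_episodes, batch_size):
        yield indices[start:start + batch_size]  # last batch may be smaller


if __name__ == "__main__":
    for batch in shuffled_batches(num_episodes=10, batch_size=4, seed=0):
        print(batch)
```

Shuffling indices instead of loaded data keeps the cost independent of episode size, and a fixed seed keeps benchmark runs comparable.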

XinyuZeng · 2025-11-08T12:50:47Z

robodm/loader/hdf5.py

        super(HDF5Loader, self).__init__(path)
-        self.files = glob.glob(self.path, recursive=True)
+
+        # Handle split parameter similar to VLA loader


This differs from the code in the mkv branch. For HDF5 and VLA, I assume the dataset directory is partitioned into train and test, e.g., ls robodm/vla/nyu_door_opening_surprising_effectiveness/ lists two directories, train and test. The same applies to HDF5.
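A minimal sketch of resolving files under that assumed layout (a dataset root containing `train` and `test` subdirectories). `list_split_files` and the `*.h5` pattern are hypothetical and are not taken from HDF5Loader.

```python
import glob
import os
from typing import List


def list_split_files(root: str, split: str = "train", pattern: str = "*.h5") -> List[str]:
    """Return the sorted files for one split under root/<split>/.

    Hypothetical helper; the file pattern is an assumption about naming.
    """
    split_dir = os.path.join(root, split)
    if not os.path.isdir(split_dir):
        raise FileNotFoundError(f"split directory not found: {split_dir}")
    # "**" with recursive=True matches the split directory itself
    # as well as any nested subdirectories.
    matches = glob.glob(os.path.join(split_dir, "**", pattern), recursive=True)
    return sorted(matches)  # sorted for a deterministic file order


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "train"))
        open(os.path.join(root, "train", "episode_0.h5"), "w").close()
        print(list_split_files(root, "train"))
```

Failing loudly on a missing split directory avoids silently benchmarking an empty file list.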

XinyuZeng · 2025-11-08T12:52:00Z

uv.lock

We need to check that the versions are the ones we want, and probably update pyproject.toml as well.

XinyuZeng and others added 2 commits November 8, 2025 16:17

make vla batch behavior correct

bb11cd1
