-
Notifications
You must be signed in to change notification settings - Fork 21
rocAL Deserialize PR 5 : Add python support and examples #436
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
Remove the use of pipeline operator
…ocAL into fg/ser_argument
Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>
…into fg/deser_pr3
Enable user to pass kwargs during pipeline deserialize
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds comprehensive deserialization testing capabilities to rocAL, building on PR #435. It introduces Python bindings for pipeline deserialization, test files for both C++ and Python, and supporting infrastructure for dynamic node reconstruction.
Key Changes:
- Added Python
serialize()anddeserialize()methods to the Pipeline class with file I/O support - Implemented C++ and Python test suites that validate both serialization and deserialization by comparing outputs from original and deserialized pipelines
- Extended deserialization infrastructure with
deserialize_args_from_protobuf()for argument reconstruction, node factory pattern for dynamic node creation, andinitialize_args()methods across loader and augmentation nodes
Reviewed changes
Copilot reviewed 36 out of 36 changed files in this pull request and generated 15 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/python_api/pipeline_serialize_test.py | Python test comparing original and deserialized pipeline outputs |
| tests/cpp_api/serialization_test/serialization_test.cpp | C++ test validating serialization and deserialization with image output comparison |
| tests/cpp_api/serialization_test/README.md | Documentation for serialization test usage and expected behavior |
| tests/cpp_api/serialization_test/CMakeLists.txt | Build configuration for C++ serialization tests |
| tests/cpp_api/CMakeLists.txt | Added CPU and GPU serialization test targets |
| rocAL_pybind/rocal_pybind.cpp | Python bindings for rocalSerialize and rocalDeserialize APIs |
| rocAL_pybind/amd/rocal/pipeline.py | Added serialize/deserialize methods with file I/O and parameter override support |
| rocAL_pybind/amd/rocal/fn.py | Fixed argument ordering in brightness_fixed function |
| rocAL/source/pipeline/pipeline_serializer.cpp | Implements argument deserialization from protobuf with type conversion and validation |
| rocAL/source/pipeline/master_graph.cpp | Reconstructs pipeline graph from deserialized data with tensor and node creation |
| rocAL/source/loaders/image/node_image_loader_single_shard.cpp | Added initialize_args for deserialization and node registration |
| rocAL/source/loaders/image/node_image_loader.cpp | Added initialize_args for deserialization and node registration |
| rocAL/source/augmentations/node_copy.cpp | Added node factory registration |
| rocAL/source/augmentations/color_augmentations/node_brightness.cpp | Added initialize_args with multiple overload support and node registration |
| rocAL/source/api/rocal_api.cpp | Implements rocalDeserialize with protobuf parsing and context creation |
| rocAL/include/pipeline/pipeline_serializer.h | Added deserialize_args_from_protobuf declaration |
| rocAL/include/pipeline/node_factory.h | Factory pattern for dynamic node creation with registration macros |
| rocAL/include/pipeline/node.h | Added virtual initialize_args methods and get_loader_module interface |
| rocAL/include/pipeline/master_graph.h | Added deserialization support with create_operator_output and add_node overload |
| rocAL/include/pipeline/enum_registry.h | Added convertIntToEnum for deserialization with std::any support |
| rocAL/include/pipeline/argument.h | Template get() method for type-safe argument extraction and init_args helper |
| rocAL/include/loaders/* (multiple) | Added override specifiers for get_loader_module methods |
| rocAL/include/augmentations/color_augmentations/node_brightness.h | Added initialize_args declaration |
| rocAL/include/api/rocal_api_types.h | Added RocalPipelineParams struct with std::optional fields |
| rocAL/include/api/rocal_api.h | Added rocalDeserialize API declaration |
| CHANGELOG.md | Updated with serialization test and Python support entries |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def deserialize(cls, serialized_pipeline=None, filename=None, **kwargs): | ||
| """ | ||
| Deserialize the pipeline from the protobuffers and reconstruct the pipeline | ||
| The returned pipeline is deserialized and build |
Copilot
AI
Jan 5, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation says "The returned pipeline is deserialized and build" but it should be "built" (past participle of "build" in this context).
| The returned pipeline is deserialized and build | |
| The returned pipeline is deserialized and built |
| ## Expected Output | ||
|
|
||
| The test will: | ||
| 1. Create the rocAL pipeline | ||
| 2. Display pipeline information (dimensions, augmentation count) | ||
| 3. Serialize the pipeline and show the serialized string size | ||
| 4. Print the complete serialized pipeline string | ||
| 5. Run a few iterations to process images and display image names with labels | ||
| 6. Report successful completion |
Copilot
AI
Jan 5, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The README documentation is outdated and doesn't reflect the actual functionality of the test. The test now includes both serialization and deserialization validation (steps 4-6 in the "Expected Output" section should mention deserialization testing and comparison of outputs between original and deserialized pipelines).
| // Handle scalar types - return the single stored value | ||
| else if (!is_vector) { | ||
| if (values.empty()) { | ||
| THROW("Value not present for the given argument : " + arg_name + ".") |
Copilot
AI
Jan 5, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing semicolon at the end of the THROW statement. This will cause a compilation error.
| THROW("Value not present for the given argument : " + arg_name + ".") | |
| THROW("Value not present for the given argument : " + arg_name + "."); |
| serialized_string = test_serialization(data_path, rocal_cpu, batch_size) | ||
|
|
||
| # Run the deserialization test | ||
| success = test_deserialization(serialized_string, rocal_cpu, batch_size) | ||
|
|
||
| if success: | ||
| print("SERIALIZATION AND DESERIALIZATION TESTS PASSED!") | ||
| else: | ||
| print("SERIALIZATION OR DESERIALIZATION TESTS FAILED") |
Copilot
AI
Jan 5, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test will fail if serialization returns None, but the error handling doesn't exit or prevent the test from attempting deserialization with None. After line 260, if serialized_string is None, the test_deserialization function will receive None and fail, but the error reporting at line 265 will misleadingly report "SERIALIZATION OR DESERIALIZATION TESTS FAILED" when only serialization actually failed.
| if (proto_arg.string_vectors_size() > 1) { | ||
| THROW("Expected at most one string vector for argument " + arg.arg_name + ", but found " + std::to_string(proto_arg.string_vectors_size())); | ||
| } | ||
| const auto& string_vec = proto_arg.string_vectors(0); |
Copilot
AI
Jan 5, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential crash when accessing empty vector. The code checks if the vector size is greater than 1, but doesn't verify that there's at least one vector (size >= 1) before accessing index 0. If string_vectors_size() is 0, accessing string_vectors(0) will cause undefined behavior or a crash.
| void ImageLoaderSingleShardNode::initialize_args(std::vector<Argument> &arguments, std::shared_ptr<MetaDataReader> meta_data_reader) { | ||
| if (arguments.size() != INIT_ARGS_COUNT) | ||
| THROW("ImageLoaderSingleShardNode expected " + std::to_string(INIT_ARGS_COUNT) + " arguments, received " + std::to_string(arguments.size()) + | ||
| "Ensure all arguments present in init are accounted for"); |
Copilot
AI
Jan 5, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message is missing a period or newline separator. It should be "...received X. Ensure all arguments..." or include a newline for clarity.
| "Ensure all arguments present in init are accounted for"); | |
| ". Ensure all arguments present in init are accounted for"); |
| void ImageLoaderNode::initialize_args(std::vector<Argument> &arguments, std::shared_ptr<MetaDataReader> meta_data_reader) { | ||
| if (arguments.size() != INIT_ARGS_COUNT) | ||
| THROW("ImageLoaderNode expected " + std::to_string(INIT_ARGS_COUNT) + " arguments, received " + std::to_string(arguments.size()) + | ||
| "Ensure all arguments present in init are accounted for"); |
Copilot
AI
Jan 5, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message is missing a period or newline separator. It should be "...received X. Ensure all arguments..." or include a newline for clarity.
| "Ensure all arguments present in init are accounted for"); | |
| ". Ensure all arguments present in init are accounted for"); |
| if (proto_arg.int_vectors_size() > 1) { | ||
| THROW("Expected at most one int vector for argument " + arg.arg_name + ", but found " + std::to_string(proto_arg.int_vectors_size())); | ||
| } | ||
| const auto& int_vec = proto_arg.int_vectors(0); |
Copilot
AI
Jan 5, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential crash when accessing empty vector. The code checks if the vector size is greater than 1, but doesn't verify that there's at least one vector (size >= 1) before accessing index 0. If int_vectors_size() is 0, accessing int_vectors(0) will cause undefined behavior or a crash.
| if (proto_arg.float_vectors_size() > 1) { | ||
| THROW("Expected at most one float vector for argument " + arg.arg_name + ", but found " + std::to_string(proto_arg.float_vectors_size())); | ||
| } | ||
| const auto& float_vec = proto_arg.float_vectors(0); |
Copilot
AI
Jan 5, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential crash when accessing empty vector. The code checks if the vector size is greater than 1, but doesn't verify that there's at least one vector (size >= 1) before accessing index 0. If float_vectors_size() is 0, accessing float_vectors(0) will cause undefined behavior or a crash.
| print(preview_text) | ||
| if len(serialized_string) > 500: | ||
| print("... (truncated)") | ||
| except: |
Copilot
AI
Jan 5, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except block directly handles BaseException.
Motivation
Adds deserialization testing capabilities to the rocAL library. Adds python binding for deserialize and python examples for the same.
Technical Details
Test Plan
Test Result
The serialization pipeline must run successfully without any errors.
NOTE: To be merged after PR #435
Submission Checklist