Skip to content

Conversation

@fiona-gladwin
Copy link
Contributor

Motivation

Adds deserialization testing capabilities to the rocAL library. Adds python binding for deserialize and python examples for the same.

Technical Details

  • Extends serialization test to verify both original and deserialized pipeline execution with comparative outputs
  • Introduce python deserialize API, and extend python serialization example to execute deserialize and compare outputs

Test Plan

  • Run serialization tests as mentioned in the README.md
  • The tests are also added as part of the make test

Test Result

The serialization pipeline must run successfully without any errors.

NOTE: To be merged after PR #435

Submission Checklist

fiona-gladwin and others added 30 commits October 7, 2025 04:48
Remove the use of pipeline operator
Co-authored-by: spolifroni-amd <Sandra.Polifroni@amd.com>
spolifroni-amd
spolifroni-amd previously approved these changes Jan 5, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds comprehensive deserialization testing capabilities to rocAL, building on PR #435. It introduces Python bindings for pipeline deserialization, test files for both C++ and Python, and supporting infrastructure for dynamic node reconstruction.

Key Changes:

  • Added Python serialize() and deserialize() methods to the Pipeline class with file I/O support
  • Implemented C++ and Python test suites that validate both serialization and deserialization by comparing outputs from original and deserialized pipelines
  • Extended deserialization infrastructure with deserialize_args_from_protobuf() for argument reconstruction, node factory pattern for dynamic node creation, and initialize_args() methods across loader and augmentation nodes

Reviewed changes

Copilot reviewed 36 out of 36 changed files in this pull request and generated 15 comments.

Show a summary per file
File Description
tests/python_api/pipeline_serialize_test.py Python test comparing original and deserialized pipeline outputs
tests/cpp_api/serialization_test/serialization_test.cpp C++ test validating serialization and deserialization with image output comparison
tests/cpp_api/serialization_test/README.md Documentation for serialization test usage and expected behavior
tests/cpp_api/serialization_test/CMakeLists.txt Build configuration for C++ serialization tests
tests/cpp_api/CMakeLists.txt Added CPU and GPU serialization test targets
rocAL_pybind/rocal_pybind.cpp Python bindings for rocalSerialize and rocalDeserialize APIs
rocAL_pybind/amd/rocal/pipeline.py Added serialize/deserialize methods with file I/O and parameter override support
rocAL_pybind/amd/rocal/fn.py Fixed argument ordering in brightness_fixed function
rocAL/source/pipeline/pipeline_serializer.cpp Implements argument deserialization from protobuf with type conversion and validation
rocAL/source/pipeline/master_graph.cpp Reconstructs pipeline graph from deserialized data with tensor and node creation
rocAL/source/loaders/image/node_image_loader_single_shard.cpp Added initialize_args for deserialization and node registration
rocAL/source/loaders/image/node_image_loader.cpp Added initialize_args for deserialization and node registration
rocAL/source/augmentations/node_copy.cpp Added node factory registration
rocAL/source/augmentations/color_augmentations/node_brightness.cpp Added initialize_args with multiple overload support and node registration
rocAL/source/api/rocal_api.cpp Implements rocalDeserialize with protobuf parsing and context creation
rocAL/include/pipeline/pipeline_serializer.h Added deserialize_args_from_protobuf declaration
rocAL/include/pipeline/node_factory.h Factory pattern for dynamic node creation with registration macros
rocAL/include/pipeline/node.h Added virtual initialize_args methods and get_loader_module interface
rocAL/include/pipeline/master_graph.h Added deserialization support with create_operator_output and add_node overload
rocAL/include/pipeline/enum_registry.h Added convertIntToEnum for deserialization with std::any support
rocAL/include/pipeline/argument.h Template get() method for type-safe argument extraction and init_args helper
rocAL/include/loaders/* (multiple) Added override specifiers for get_loader_module methods
rocAL/include/augmentations/color_augmentations/node_brightness.h Added initialize_args declaration
rocAL/include/api/rocal_api_types.h Added RocalPipelineParams struct with std::optional fields
rocAL/include/api/rocal_api.h Added rocalDeserialize API declaration
CHANGELOG.md Updated with serialization test and Python support entries

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

def deserialize(cls, serialized_pipeline=None, filename=None, **kwargs):
"""
Deserialize the pipeline from the protobuffers and reconstruct the pipeline
The returned pipeline is deserialized and build
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation says "The returned pipeline is deserialized and build" but it should be "built" (past participle of "build" in this context).

Suggested change
The returned pipeline is deserialized and build
The returned pipeline is deserialized and built

Copilot uses AI. Check for mistakes.
Comment on lines 44 to 52
## Expected Output

The test will:
1. Create the rocAL pipeline
2. Display pipeline information (dimensions, augmentation count)
3. Serialize the pipeline and show the serialized string size
4. Print the complete serialized pipeline string
5. Run a few iterations to process images and display image names with labels
6. Report successful completion
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The README documentation is outdated and doesn't reflect the actual functionality of the test. The test now includes both serialization and deserialization validation (steps 4-6 in the "Expected Output" section should mention deserialization testing and comparison of outputs between original and deserialized pipelines).

Copilot uses AI. Check for mistakes.
// Handle scalar types - return the single stored value
else if (!is_vector) {
if (values.empty()) {
THROW("Value not present for the given argument : " + arg_name + ".")
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing semicolon at the end of the THROW statement. This will cause a compilation error.

Suggested change
THROW("Value not present for the given argument : " + arg_name + ".")
THROW("Value not present for the given argument : " + arg_name + ".");

Copilot uses AI. Check for mistakes.
Comment on lines 257 to 265
serialized_string = test_serialization(data_path, rocal_cpu, batch_size)

# Run the deserialization test
success = test_deserialization(serialized_string, rocal_cpu, batch_size)

if success:
print("SERIALIZATION AND DESERIALIZATION TESTS PASSED!")
else:
print("SERIALIZATION OR DESERIALIZATION TESTS FAILED")
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test will fail if serialization returns None, but the error handling doesn't exit or prevent the test from attempting deserialization with None. After line 260, if serialized_string is None, the test_deserialization function will receive None and fail, but the error reporting at line 265 will misleadingly report "SERIALIZATION OR DESERIALIZATION TESTS FAILED" when only serialization actually failed.

Copilot uses AI. Check for mistakes.
Comment on lines +373 to +376
if (proto_arg.string_vectors_size() > 1) {
THROW("Expected at most one string vector for argument " + arg.arg_name + ", but found " + std::to_string(proto_arg.string_vectors_size()));
}
const auto& string_vec = proto_arg.string_vectors(0);
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential crash when accessing empty vector. The code checks if the vector size is greater than 1, but doesn't verify that there's at least one vector (size >= 1) before accessing index 0. If string_vectors_size() is 0, accessing string_vectors(0) will cause undefined behavior or a crash.

Copilot uses AI. Check for mistakes.
void ImageLoaderSingleShardNode::initialize_args(std::vector<Argument> &arguments, std::shared_ptr<MetaDataReader> meta_data_reader) {
if (arguments.size() != INIT_ARGS_COUNT)
THROW("ImageLoaderSingleShardNode expected " + std::to_string(INIT_ARGS_COUNT) + " arguments, received " + std::to_string(arguments.size()) +
"Ensure all arguments present in init are accounted for");
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message is missing a period or newline separator. It should be "...received X. Ensure all arguments..." or include a newline for clarity.

Suggested change
"Ensure all arguments present in init are accounted for");
". Ensure all arguments present in init are accounted for");

Copilot uses AI. Check for mistakes.
void ImageLoaderNode::initialize_args(std::vector<Argument> &arguments, std::shared_ptr<MetaDataReader> meta_data_reader) {
if (arguments.size() != INIT_ARGS_COUNT)
THROW("ImageLoaderNode expected " + std::to_string(INIT_ARGS_COUNT) + " arguments, received " + std::to_string(arguments.size()) +
"Ensure all arguments present in init are accounted for");
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message is missing a period or newline separator. It should be "...received X. Ensure all arguments..." or include a newline for clarity.

Suggested change
"Ensure all arguments present in init are accounted for");
". Ensure all arguments present in init are accounted for");

Copilot uses AI. Check for mistakes.
Comment on lines +347 to +350
if (proto_arg.int_vectors_size() > 1) {
THROW("Expected at most one int vector for argument " + arg.arg_name + ", but found " + std::to_string(proto_arg.int_vectors_size()));
}
const auto& int_vec = proto_arg.int_vectors(0);
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential crash when accessing empty vector. The code checks if the vector size is greater than 1, but doesn't verify that there's at least one vector (size >= 1) before accessing index 0. If int_vectors_size() is 0, accessing int_vectors(0) will cause undefined behavior or a crash.

Copilot uses AI. Check for mistakes.
Comment on lines +364 to +367
if (proto_arg.float_vectors_size() > 1) {
THROW("Expected at most one float vector for argument " + arg.arg_name + ", but found " + std::to_string(proto_arg.float_vectors_size()));
}
const auto& float_vec = proto_arg.float_vectors(0);
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential crash when accessing empty vector. The code checks if the vector size is greater than 1, but doesn't verify that there's at least one vector (size >= 1) before accessing index 0. If float_vectors_size() is 0, accessing float_vectors(0) will cause undefined behavior or a crash.

Copilot uses AI. Check for mistakes.
print(preview_text)
if len(serialized_string) > 500:
print("... (truncated)")
except:
Copy link

Copilot AI Jan 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except block directly handles BaseException.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants