AGENTS.md

This file provides comprehensive guidance to AI coding agents when working with the Apache Fory codebase.

Core Principles

While working on Fory, please remember:

  • Do not preserve legacy code or docs unless explicitly requested.
  • Performance First: Performance is the top priority. Never introduce code that reduces performance without explicit justification.
  • English Only: Always use English in code, comments, and documentation.
  • Meaningful Comments: Only add comments when the code's behavior is difficult to understand or when documenting complex algorithms.
  • Focused Testing: Only add tests that verify internal behaviors or fix specific bugs; don't create unnecessary tests unless requested.
  • Git-Tracked Files: When reading code, skip all files not tracked by git by default unless generated by yourself.
  • Cross-Language Consistency: Maintain consistency across language implementations while respecting language-specific idioms.
  • GraalVM support using fory codegen: For GraalVM, use fory codegen to generate the serializer when building a native image. Do not use GraalVM reflect-related configuration unless for JDK proxy.
  • Xlang Type System: Java native mode (xlang=false) uses the same type IDs as xlang mode (xlang=true) for the types from Types.BOOL through Types.STRING; all other types have different type IDs in Java native mode.
  • Remote git repository: git@github.com:apache/fory.git is the official remote repository. Do not use any other remote when checking code on the main branch; apache/main is the only target main branch, not origin/main.
  • Refresh remote main before compare: before any diff/review/compare against apache/main, always run git fetch apache main first so comparisons use the latest remote main.
  • Contributor git repository: A contributor should fork the git@github.com:apache/fory.git repo, and git push the code changes into their forked repo, then create a pull request from the branch in their forked repo into git@github.com:apache/fory.git.
  • Debug Test Errors: always set environment variable ENABLE_FORY_DEBUG_OUTPUT to 1 to see debug output.
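
The remote rules above can be scripted. The following is a minimal sketch, assuming the official repository is registered under the remote name apache as those rules imply. The helper names ensure_apache_remote and diff_against_apache_main are illustrative and not part of the repository.

```bash
# Hypothetical helpers; the function names are not part of the Fory repo.

# Register the official repository as the `apache` remote if it is missing.
ensure_apache_remote() {
  if ! git remote get-url apache >/dev/null 2>&1; then
    git remote add apache git@github.com:apache/fory.git
  fi
}

# Refresh apache/main first, then diff the current branch against it.
# Extra arguments (for example --stat or a path) are passed to `git diff`.
diff_against_apache_main() {
  ensure_apache_remote &&
    git fetch apache main &&
    git diff apache/main...HEAD "$@"
}

# Usage:
#   diff_against_apache_main --stat
```

The three-dot form shows only the changes made on the current branch since it diverged from apache/main. Use the two-dot form if a direct comparison of both tips is wanted instead.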

Documentation Sources and Change Rules

  • Primary references: README.md, CONTRIBUTING.md, docs/guide/DEVELOPMENT.md, and language guides under docs/guide/.
  • Protocol changes: Read and update the relevant specs in docs/specification/** and align cross-language tests.
  • Docs publishing: Updates under docs/guide/ and docs/benchmarks/ are synced to https://github.com/apache/fory-site; other website content should be changed in that repo.
  • Benchmark docs refresh is mandatory: When any benchmark logic/script/config or compared serializer set changes, rerun the relevant benchmarks and refresh corresponding artifacts under docs/benchmarks/** (report + plots) before finalizing.
  • Debugging docs: C++ debugging guidance lives in docs/cpp_debug.md.
  • Conflicts: If instructions conflict, follow the most specific module docs and call out the conflict in your response.

Build and Development Commands

Java Development

  • All maven commands must be executed within the java directory.
  • All changes to java must pass the code style check and tests.
  • Fory java needs JDK 17+ installed.
  • Modules target different bytecode levels (fory-core Java 8, fory-format Java 11); avoid using newer APIs in those modules.
  • Wildcard imports (the '.*' form) are not allowed.
  • If you run temporary tests using java -cp, you must run mvn -T16 install -DskipTests to get latest jars for fory java library.
# Clean the build
mvn -T16 clean

# Build
mvn -T16 package

# Install
mvn -T16 install -DskipTests

# Code format check
mvn -T16 spotless:check

# Code format
mvn -T16 spotless:apply

# Code style check
mvn -T16 checkstyle:check

# Run tests
mvn -T16 test

# Run specific tests
mvn -T16 test -Dtest=org.apache.fory.TestClass#testMethod

C# Development

  • All dotnet commands must be executed within the csharp directory.
  • All changes to csharp must pass formatting and tests.
  • Fory C# requires .NET SDK 8.0+ and C# 12+.
  • Use dotnet format to keep C# code style consistent.
# Restore
dotnet restore Fory.sln

# Build
dotnet build Fory.sln -c Release --no-restore

# Run tests
dotnet test Fory.sln -c Release

# Run specific test
dotnet test tests/Fory.Tests/Fory.Tests.csproj -c Release --filter "FullyQualifiedName~ForyRuntimeTests.DynamicObjectReadDepthExceededThrows"

# Format code
dotnet format Fory.sln

# Format check
dotnet format Fory.sln --verify-no-changes

Run C# xlang tests:

cd java
mvn -T16 install -DskipTests
cd fory-core
FORY_CSHARP_JAVA_CI=1 ENABLE_FORY_DEBUG_OUTPUT=1 mvn -T16 test -Dtest=org.apache.fory.xlang.CSharpXlangTest

C++ Development

  • All commands must be executed within the cpp directory.
  • Fory C++ uses C++17; do not use features from later C++ standards.
  • Bazel uses bzlmod (MODULE.bazel); prefer Bazel 8+.
  • For Bazel C++ tests, detect machine architecture and only add --config=x86_64 on x86_64/amd64; on arm64/aarch64, do not enable this config.
  • After changing code, run clang-format on the changed files.
  • When invoking a method that returns Result, always use FORY_TRY unless in a control flow context.
  • Wrap error checks with FORY_PREDICT_FALSE for branch prediction optimization.
  • Continue on error for trivial errors; only return early for critical errors like buffer overflow.
  • Place private methods last in the class definition, immediately before the private fields.
# Build C++ library
bazel build //cpp/...

# Build Cython extensions (replace X.Y with your Python version, e.g., 3.10)
bazel build //:cp_fory_so --@rules_python//python/config_settings:python_version=X.Y

# Run tests
bazel test $(bazel query //cpp/...)

# Run serialization tests
bazel test $(bazel query //cpp/fory/serialization/...)

# Run specific test
bazel test //cpp/fory/util:buffer_test

# format c++ code
clang-format -i $file
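
The architecture rule for Bazel C++ tests can be sketched as a small helper. This is an illustration under the assumption that uname -m reports the machine architecture; the function name fory_bazel_arch_flag is hypothetical.

```bash
# Hypothetical helper: print the extra Bazel flag for the given architecture.
# Defaults to the architecture reported by `uname -m`.
fory_bazel_arch_flag() {
  arch="${1:-$(uname -m)}"
  case "$arch" in
    x86_64 | amd64) echo "--config=x86_64" ;;
    *) ;; # arm64/aarch64 and anything else: no extra config
  esac
}

# Usage (left unquoted on purpose so an empty result adds no argument):
#   bazel test $(fory_bazel_arch_flag) //cpp/fory/util:buffer_test
```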

Run C++ xlang tests:

cd java
mvn -T16 install -DskipTests
cd fory-core
FORY_CPP_JAVA_CI=1 ENABLE_FORY_DEBUG_OUTPUT=1 mvn -T16 test -Dtest=org.apache.fory.xlang.CPPXlangTest

Python Development

  • All commands must be executed within the python directory.
  • All changes to python must pass the code style check and tests.
  • When running tests, you can use the ENABLE_FORY_CYTHON_SERIALIZATION environment variable to enable or disable cython serialization.
  • When debugging protocol related issues, you should use ENABLE_FORY_CYTHON_SERIALIZATION=0 first to verify the behavior.
  • Fory python needs cpython 3.8+ installed although some modules such as fory-core use java8.
# clean build
rm -rf build dist .pytest_cache
bazel clean --expunge

# Code format
ruff format .
ruff check --fix .

# Install
pip install -v -e .

# Build native extension when cython code changed (replace X.Y with your Python version)
bazel build //:cp_fory_so --@rules_python//python/config_settings:python_version=X.Y --config=x86_64 # For x86_64
bazel build //:cp_fory_so --@rules_python//python/config_settings:python_version=X.Y --copt=-fsigned-char # For arm64 and aarch64

# Run tests without cython
ENABLE_FORY_CYTHON_SERIALIZATION=0 pytest -v -s .
# Run tests with cython
ENABLE_FORY_CYTHON_SERIALIZATION=1 pytest -v -s .
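
When a change may affect both implementations, it can help to run the same command once per mode. The wrapper below is a sketch only; run_in_both_fory_modes is a hypothetical name. It runs pure python mode first, in line with the guidance to verify protocol behavior with ENABLE_FORY_CYTHON_SERIALIZATION=0 before anything else.

```bash
# Hypothetical helper: run a command once per serialization mode.
# Stops at the first failing mode and returns its exit status.
run_in_both_fory_modes() {
  for mode in 0 1; do
    echo "== ENABLE_FORY_CYTHON_SERIALIZATION=$mode ==" >&2
    ENABLE_FORY_CYTHON_SERIALIZATION="$mode" "$@" || return $?
  done
}

# Usage:
#   run_in_both_fory_modes pytest -v -s .
```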

Run Python xlang tests:

cd java
mvn -T16 install -DskipTests
cd fory-core
# disable fory cython for faster debugging
FORY_PYTHON_JAVA_CI=1 ENABLE_FORY_CYTHON_SERIALIZATION=0 ENABLE_FORY_DEBUG_OUTPUT=1 mvn -T16 test -Dtest=org.apache.fory.xlang.PythonXlangTest
# enable fory cython
FORY_PYTHON_JAVA_CI=1 ENABLE_FORY_CYTHON_SERIALIZATION=1 ENABLE_FORY_DEBUG_OUTPUT=1 mvn -T16 test -Dtest=org.apache.fory.xlang.PythonXlangTest

Golang Development

  • All commands must be executed within the go/fory directory.
  • All changes to go must pass the format check and tests.
  • Go implementation focuses on reflection-based and codegen-based serialization.
  • Set FORY_PANIC_ON_ERROR=1 when debugging test errors to see full callstack.
  • Do not set FORY_PANIC_ON_ERROR=1 when running the full go test suite to check that all tests pass: some tests assert on Error content and fail if the error panics instead.
# Format code
go fmt ./...

# Run tests
go test -v ./...

# Run tests with race detection
go test -race -v ./...

# Build
go build

# Generate code (if using go:generate)
go generate ./...
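
Because FORY_PANIC_ON_ERROR=1 is meant for debugging individual failures only, a guard can catch a stale export before a full run. This is a sketch with a hypothetical function name; the same guard applies to full Rust test runs.

```bash
# Hypothetical guard: refuse to start a full test run while
# FORY_PANIC_ON_ERROR=1 is set in the environment.
require_no_panic_on_error() {
  if [ "${FORY_PANIC_ON_ERROR:-0}" = "1" ]; then
    echo "FORY_PANIC_ON_ERROR=1 is set; unset it before running the full suite" >&2
    return 1
  fi
}

# Usage:
#   require_no_panic_on_error && go test -v ./...
```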

Run Go xlang tests:

cd java
mvn -T16 install -DskipTests
cd fory-core
FORY_GO_JAVA_CI=1 ENABLE_FORY_DEBUG_OUTPUT=1 mvn test -Dtest=org.apache.fory.xlang.GoXlangTest

Rust Development

  • All cargo commands must be executed within the rust directory.
  • All changes to rust must pass the clippy check and tests.
  • You must set RUST_BACKTRACE=1 FORY_PANIC_ON_ERROR=1 when debugging rust tests to get backtrace.
  • You must add -- --nocapture to cargo test command when debugging tests.
  • Do not set FORY_PANIC_ON_ERROR=1 when running the full rust test suite to check that all tests pass: some tests assert on Error content and fail if the error panics instead.
# Check code
cargo check

# Build
cargo build

# Run linter for all services.
cargo clippy --all-targets --all-features -- -D warnings

# Run tests (requires test features)
cargo test --features tests

# run specific test
cargo test -p tests --test $test_file $test_method

# run specific test under subdirectory
cargo test --test mod $dir$::$test_file::$test_method

# debug specific test under subdirectory and get backtrace
RUST_BACKTRACE=1 FORY_PANIC_ON_ERROR=1 ENABLE_FORY_DEBUG_OUTPUT=1 cargo test --test mod $dir$::$test_file::$test_method -- --nocapture

# inspect generated code by fory derive macro
cargo expand --test mod $mod$::$file$ > expanded.rs

# Format code
cargo fmt

# Check formatting
cargo fmt --check

# Build documentation
cargo doc --lib --no-deps --all-features

# Run benchmarks
cd $project_dir/benchmarks/rust
cargo bench

Run Rust xlang tests:

cd java
mvn -T16 install -DskipTests
cd fory-core
RUST_BACKTRACE=1 FORY_PANIC_ON_ERROR=1 FORY_RUST_JAVA_CI=1 ENABLE_FORY_DEBUG_OUTPUT=1 mvn test -Dtest=org.apache.fory.xlang.RustXlangTest

Swift Development

  • All commands must be executed within the swift directory.
  • All changes to swift must pass lint and tests.
  • Swift lint uses swift/.swiftlint.yml.
  • Use ENABLE_FORY_DEBUG_OUTPUT=1 when debugging Swift tests.
# Build package
swift build

# Run tests
swift test

# Run tests with debug output
ENABLE_FORY_DEBUG_OUTPUT=1 swift test

# Lint check
swiftlint lint --config .swiftlint.yml

# Auto-fix lint issues where supported
swiftlint --fix --config .swiftlint.yml

Run Swift xlang tests:

cd swift
swift build -c release --disable-automatic-resolution --product ForyXlangTests
cd ../java
mvn -T16 install -DskipTests
cd fory-core
FORY_SWIFT_JAVA_CI=1 ENABLE_FORY_DEBUG_OUTPUT=1 mvn -T16 test -Dtest=org.apache.fory.xlang.SwiftXlangTest

JavaScript/TypeScript Development

  • All commands must be executed within the javascript directory.
  • Uses npm/yarn for package management.
# Install dependencies
npm install

# Run tests
node ./node_modules/.bin/jest --ci --reporters=default --reporters=jest-junit

# Lint code
git ls-files -- '*.ts' | xargs -P 5 node ./node_modules/.bin/eslint

Dart Development

  • All commands must be executed within the dart directory.
  • Uses pub for package management.
# First, generate necessary code
dart run build_runner build

# Run all tests
dart test

# Analyze code and apply fixes
dart analyze
dart fix --dry-run
dart fix --apply

Kotlin Development

  • All maven commands must be executed within the kotlin directory.
  • Kotlin implementation provides extra serializers for kotlin types.
  • Kotlin implementation is built on fory java, so install the java libraries first with cd ../java && mvn -T16 install -DskipTests. If the java code has not changed since the last install, you can skip this step.
# Build
mvn clean package

# Run tests
mvn test

Scala Development

  • All commands must be executed within the scala directory.
  • Scala implementation provides extra serializers for Scala types.
  • Scala implementation is built on fory java, so install the java libraries first with cd ../java && mvn -T16 install -DskipTests. If the java code has not changed since the last install, you can skip this step.
# Build with sbt
sbt compile

# Run tests
sbt test

# Format code
sbt scalafmt

Integration Tests

  • All commands must be executed within the integration_tests directory.
  • For java related integration tests, install the java libraries first with cd ../java && mvn -T16 install -DskipTests. If the java code has not changed since the last install, you can skip this step.
  • For mac, graalvm is installed at /Library/Java/JavaVirtualMachines/graalvm-xxx by default.
  • For integration_tests/idl_tests (mandatory prerequisites):
    • Always run cd ../java && mvn -T16 install -DskipTests before any idl_tests run if there were changes under java/ since the last install. If unsure, run it.
    • Always run cd ../python && pip install -v -e . (and rebuild Cython if needed) before any idl_tests run if there were changes to Cython-related code under python/. If unsure, run it.
  • Never manually edit code generated by the fory compiler from IDL files; invoke the fory compiler to regenerate it.
it_dir=$(pwd)
# Run graalvm tests
cd $it_dir/graalvm_tests && mvn -T16 -DskipTests=true -Pnative package && target/main

# Run JDK compatibility tests
cd $it_dir/jdk_compatibility_tests && mvn -T16 test

# Run JPMS tests
cd $it_dir/jpms_tests && mvn -T16 test

# Run Python benchmarks
cd $it_dir/cpython_benchmark && pip install -r requirements.txt && python benchmark.py

Documentation and Formatting

  • Markdown Formatting: When updating markdown documentation, use prettier --write $file to format.
  • API Documentation: When updating important public APIs, update documentation under docs/.
  • Protocol Specifications: docs/specification/** contains Fory protocol specifications. Read these documents carefully before making protocol changes.
  • User Guides: docs/guide/** contains user guides for different features and languages.
  • Repo-wide formatting: bash ci/format.sh --all runs format/lint across languages.

Repository Structure Understanding

Git Repository

Apache Fory is an open-source project hosted on GitHub at https://github.com/apache/fory. Contributors fork the repository and open pull requests to propose changes, so origin points to the contributor's fork rather than the official repository.

Key Directories

  • docs/: Documentation, specifications, and guides

    • docs/specification/: Protocol specifications (critical for understanding)
    • docs/guide/: User guides and development guides
    • docs/benchmarks/: Performance benchmarks documentation
  • benchmarks/: Benchmark suites and harnesses

  • examples/: Usage examples and sample code

  • compiler/: Code generation and compiler-related utilities

  • Language Implementations:

    • java/: Java implementation (maven-based, multi-module)
    • csharp/: C# implementation (.NET SDK + source generator)
    • python/: Python implementation (pip/setuptools + bazel)
    • cpp/: C++ implementation (bazel-based)
    • go/: Go implementation (go modules)
    • rust/: Rust implementation (cargo-based)
    • javascript/: JavaScript/TypeScript implementation (npm-based)
    • dart/: Dart implementation (pub-based)
    • kotlin/: Kotlin implementation (maven-based)
    • scala/: Scala implementation (sbt-based)
  • Testing and CI:

    • integration_tests/: Cross-language integration tests
    • .github/workflows/: GitHub Actions CI/CD workflows
    • ci/: CI scripts and configurations
  • Build Configuration:

    • BUILD, WORKSPACE: Bazel configuration
    • .bazelrc, .bazelversion: Bazel settings
    • MODULE.bazel: Bazel bzlmod dependency management
    • Various pom.xml, package.json, Cargo.toml, etc.
  • licenses/: Third-party license reports and metadata

Important Files

  • AGENTS.md: This file - AI coding guidance
  • CLAUDE.md: Claude Code specific instructions
  • CONTRIBUTING.md: Contribution guidelines
  • README.md: Project overview and quick start
  • .gitignore: Git ignore patterns (includes build dirs)
  • licenserc.toml: License header configuration
  • docs/guide/DEVELOPMENT.md: Environment setup and build notes
  • docs/cpp_debug.md: C++ debugging guide

Architecture Overview

Apache Fory is a blazingly-fast multi-language serialization framework that revolutionizes data exchange between systems and languages. By leveraging JIT compilation, code generation and zero-copy techniques, Fory delivers up to 170x faster performance compared to other serialization frameworks while being extremely easy to use.

Binary Protocols

Fory uses binary protocols for efficient serialization and deserialization. Fory designed and implemented multiple binary protocols for different scenarios:

  • xlang serialization format:
    • Serializes any object across languages automatically, with no IDL definition, schema compilation, or object to/from protocol conversion.
    • Supports optional shared and circular references, with no duplicate data or recursion errors.
    • Supports object polymorphism.
  • Row format: A cache-friendly binary random access format, supports skipping serialization and partial serialization, and can convert to column-format automatically.
  • Java serialization format: Highly-optimized and drop-in replacement for Java serialization.
  • Python serialization format: Highly-optimized and drop-in replacement for Python pickle, which is an extension built upon xlang serialization format.

docs/specification/** contains the specifications for the Fory protocol. Read those documents carefully and make sure you understand them before changing code or documentation.

Compiler Development (FDL/IDL)

  • Primary references: docs/compiler/index.md, docs/compiler/compiler-guide.md, docs/compiler/schema-idl.md, docs/compiler/type-system.md, docs/compiler/generated-code.md, docs/compiler/protobuf-idl.md, docs/compiler/flatbuffers-idl.md.
  • Location: compiler/ contains the Fory compiler, parser, IR, and code generators.
  • Install & CLI:
    • cd compiler && pip install -e .
    • foryc --help
    • foryc schema.fdl --lang <langs> --output <dir>
  • Generated code: Never edit generated files manually; update the .fdl/.proto/.fbs and re-run the compiler.
  • IDL tests: Use integration_tests/idl_tests/generate_idl.py for test codegen.
    • In integration_tests/idl_tests, keep package names aligned with the IDL filename.
  • Protocol changes: Update docs/specification/** and cross-language tests.
  • Language targets: FDL is the primary schema; protobuf/flatbuffers support should remain unchanged unless explicitly requested.

Core Runtime Structure

Fory serialization is implemented independently in each language to minimize object memory layout interoperability overhead, object allocation, and memory access cost, and thereby maximize performance. No code is shared between languages, except that fory python reuses code from fory c++.

Java

  • fory-core: Java library implementing the core object graph serialization

    • java/fory-core/src/main/java/org/apache/fory/Fory.java: main serialization entry point
    • java/fory-core/src/main/java/org/apache/fory/resolver/TypeResolver.java: type resolution and serializer dispatch
    • java/fory-core/src/main/java/org/apache/fory/resolver/RefResolver.java: class for resolving shared/circular references when ref tracking is enabled
    • java/fory-core/src/main/java/org/apache/fory/serializer: serializers for each supported type
    • java/fory-core/src/main/java/org/apache/fory/codegen: code generators, provide expression abstraction and compile expression tree to java code and byte code
    • java/fory-core/src/main/java/org/apache/fory/builder: build expression tree for serialization to generate serialization code
    • java/fory-core/src/main/java/org/apache/fory/reflect: reflection utilities
    • java/fory-core/src/main/java/org/apache/fory/type: java generics and type inference utilities
    • java/fory-core/src/main/java/org/apache/fory/util: utility classes
  • fory-format: Java library implementing the core row format encoding and decoding

    • java/fory-format/src/main/java/org/apache/fory/format/row: row format data structures
    • java/fory-format/src/main/java/org/apache/fory/format/encoder: generate row format encoder and decoder to encode/decode objects to/from row format
    • java/fory-format/src/main/java/org/apache/fory/format/type: type inference for row format
    • java/fory-format/src/main/java/org/apache/fory/format/vectorized: interoperation with apache arrow columnar format
  • fory-extensions: extension libraries for java, including:

    • Protobuf serializers for fory java native object graph protocol.
    • Meta compression based on zstd
  • fory-simd: SIMD-accelerated serialization and deserialization based on java vector API

    • java/fory-simd/src/main/java/org/apache/fory/util: SIMD utilities
    • java/fory-simd/src/main/java/org/apache/fory/serializer: SIMD accelerated serializers
  • fory-test-core: Core test utilities and data generators

  • testsuite: Complex test suite for issues reported by users and hard to reproduce using simple test cases

  • benchmark: Benchmark suite based on jmh

Bazel

bazel dir provides build support for fory C++ and Cython:

  • bazel/cython_library.bzl: pyx_library rule for building Cython extensions

Dependencies are managed via MODULE.bazel using bzlmod (Bazel 8+).

C++

  • cpp/fory/row: Row format data structures
  • cpp/fory/meta: Compile-time reflection utilities for extracting struct field information.
  • cpp/fory/encoder: Row format encoder and decoder
  • cpp/fory/util: Common utilities
    • cpp/fory/util/buffer.h: Buffer for reading and writing data
    • cpp/fory/util/bit_util.h: utilities for bit manipulation
    • cpp/fory/util/string_util.h: String utilities
    • cpp/fory/util/status.h: Status code for error handling

Python

Fory python has two implementations for the protocol:

  • Python mode: Pure python implementation based on xlang serialization format, used for debugging and testing only. This mode can be enabled by setting ENABLE_FORY_CYTHON_SERIALIZATION=0 environment variable.
  • Cython mode: Cython based implementation based on xlang serialization format, which is used by default and has better performance than pure python. This mode can be enabled by setting ENABLE_FORY_CYTHON_SERIALIZATION=1 environment variable.
  • Python mode and Cython mode share some code to reduce duplication.
  • Debug Struct Serialization: set ENABLE_FORY_DEBUG_OUTPUT=1 to enable detailed struct serialization/deserialization logs while debugging protocol behavior.

Code structure:

  • python/pyfory/serialization.pyx: Core serialization logic and entry point for cython mode based on xlang serialization format
  • python/pyfory/_fory.py: Serialization entry point for pure python mode based on xlang serialization format
  • python/pyfory/registry.py: Type registry, resolution and serializer dispatch for pure python mode, which is also used by cython mode. Cython mode uses a cache to reduce calls into this module.
  • python/pyfory/serializer.py: Serializers for non-internal types
  • python/pyfory/includes: Cython headers for c++ functions and classes.
  • python/pyfory/resolver.py: resolving shared/circular references when ref tracking is enabled in pure python mode
  • python/pyfory/format: Fory row format encoding and decoding, arrow columnar format interoperation
  • python/pyfory/buffer.pyx: Buffer for reading/writing data, string utilities. Used by serialization.pyx and python/pyfory/format at the same time.

Go

Fory go provides reflection-based and codegen-based serialization and deserialization.

  • go/fory/fory.go: serialization entry point
  • go/fory/resolver.go: resolving shared/circular references when ref tracking is enabled
  • go/fory/type.go: type system and type resolution, serializer dispatch
  • go/fory/slice.go: serializers for slice type
  • go/fory/map.go: serializers for map type
  • go/fory/set.go: serializers for set type
  • go/fory/struct.go: serializers for struct type
  • go/fory/string.go: serializers for string type
  • go/fory/buffer.go: Buffer for reading/writing data
  • go/fory/codegen: code generator invoked via go:generate to produce serialization code that speeds up serialization.
  • go/fory/meta: Meta string compression

Rust

Fory rust provides macro-based serialization and deserialization. Fory rust consists of:

  • fory: Main library entry point
    • rust/fory/src/lib.rs: main library entry point to export API to users
  • fory-core: Core library for serialization and deserialization
    • rust/fory-core/src/fory.rs: main serialization entry point
    • rust/fory-core/src/resolver/type_resolver.rs: type resolution and registration
    • rust/fory-core/src/resolver/metastring_resolver.rs: resolver for meta string
    • rust/fory-core/src/resolver/context.rs: context for reading/writing
    • rust/fory-core/src/buffer.rs: buffer for reading/writing data
    • rust/fory-core/src/meta: meta string compression, type meta encoding
    • rust/fory-core/src/serializer: serializers for each supported type
    • rust/fory-core/src/row: row format encoding and decoding
  • fory-derive: Rust macro-based codegen for serialization and deserialization
    • rust/fory-derive/src/object: macro for serializing/deserializing structs
    • rust/fory-derive/src/fory_row: macro for encoding/decoding row format

Integration Tests

integration_tests contains integration tests with following modules:

  • cpython_benchmark: benchmark suite for fory python
  • graalvm_tests: test suite for fory java on graalvm.
    • Note that fory uses codegen rather than reflection to support graalvm and does not use reflect-config.json for serialization; this is its core advantage over graalvm JDK serialization.
  • jdk_compatibility_tests: test suite for fory serialization compatibility between multiple JDK versions

Key Development Guidelines

Performance Guidelines

  • Performance First: Never introduce code that reduces performance without explicit justification
  • Benchmark Required After Perf Optimizations: For every code change expected to improve performance, run the relevant benchmark immediately after applying the change and report the measured results (command + before/after numbers) in your response/PR.
  • Performance Trace Log: For every performance-optimization round, append the hypothesis, code change, benchmark command, before/after numbers, and keep/revert decision to tasks/perf_optimization_rounds.md before moving to the next round.
  • Zero-Copy: Leverage zero-copy techniques when possible
  • JIT Compilation: Consider JIT compilation opportunities
  • Memory Layout: Optimize for cache-friendly memory access patterns
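
The trace-log rule can be supported by a helper that appends one entry per optimization round. This is a sketch only: the entry layout, the PERF_LOG override variable, and the function name log_perf_round are assumptions, since the required format of tasks/perf_optimization_rounds.md is not specified here.

```bash
# Hypothetical helper: append one optimization round to the trace log.
# Arguments: hypothesis, code change, benchmark command, before, after, decision.
# PERF_LOG overrides the log path (an assumption, handy for dry runs).
log_perf_round() {
  if [ "$#" -ne 6 ]; then
    echo "usage: log_perf_round HYPOTHESIS CHANGE COMMAND BEFORE AFTER keep|revert" >&2
    return 2
  fi
  log="${PERF_LOG:-tasks/perf_optimization_rounds.md}"
  mkdir -p "$(dirname "$log")"
  {
    printf '\n## Round recorded %s\n\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    printf -- '- Hypothesis: %s\n' "$1"
    printf -- '- Code change: %s\n' "$2"
    printf -- '- Benchmark command: %s\n' "$3"
    printf -- '- Before: %s\n' "$4"
    printf -- '- After: %s\n' "$5"
    printf -- '- Decision: %s\n' "$6"
  } >>"$log"
}

# Usage (all values are placeholders):
#   log_perf_round "hypothesis" "what changed" "benchmark command" \
#     "before number" "after number" keep
```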

Code Quality

  • Public APIs: Must be well-documented and easy to understand
  • Error Handling: Implement comprehensive error handling with meaningful messages
  • Type Safety: Use strong typing and generics appropriately
  • Null Safety: Handle null values appropriately for each language

Cross-Language Considerations

  • Protocol Compatibility: Ensure serialization compatibility across languages
  • Type Mapping: Understand type mapping between languages (see docs/specification/xlang_type_mapping.md)
  • Endianness: Handle byte order correctly for cross-platform compatibility
  • Version Compatibility: Maintain backward compatibility when possible

Security and Compatibility

  • Class registration: Keep class registration enabled unless explicitly requested; use custom class checkers or policies if disabling.
  • Schema evolution: Prefer schema-consistent mode unless compatibility is required; update compatibility tests when changing schema rules.

Testing Strategy

  • Unit Tests: Focus on internal behavior verification
  • Integration Tests: Use integration_tests/ for cross-language compatibility
  • Language alignment and protocol compatibility: Run org.apache.fory.xlang.CPPXlangTest, org.apache.fory.xlang.CSharpXlangTest, org.apache.fory.xlang.RustXlangTest, org.apache.fory.xlang.GoXlangTest, and org.apache.fory.xlang.PythonXlangTest when changing xlang or type mapping behavior
  • Performance Tests: Include benchmarks for performance-critical changes
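
The xlang test commands listed in the build sections all follow one pattern: a per-language CI flag plus the matching test class. The sketch below prints the command for a given language. The function name xlang_test_cmd is hypothetical, and it uses plain mvn test for every language; add -T16 where the listed commands use it.

```bash
# Hypothetical helper: print the Maven command for one xlang test.
# Run the printed command from java/fory-core after
# `mvn -T16 install -DskipTests` in the java directory.
xlang_test_cmd() {
  case "$1" in
    cpp)    flag=FORY_CPP_JAVA_CI;    class=CPPXlangTest ;;
    csharp) flag=FORY_CSHARP_JAVA_CI; class=CSharpXlangTest ;;
    go)     flag=FORY_GO_JAVA_CI;     class=GoXlangTest ;;
    python) flag=FORY_PYTHON_JAVA_CI; class=PythonXlangTest ;;
    rust)   flag=FORY_RUST_JAVA_CI;   class=RustXlangTest ;;
    swift)  flag=FORY_SWIFT_JAVA_CI;  class=SwiftXlangTest ;;
    *) echo "unknown language: $1" >&2; return 1 ;;
  esac
  echo "$flag=1 ENABLE_FORY_DEBUG_OUTPUT=1 mvn test -Dtest=org.apache.fory.xlang.$class"
}

# Usage: print the commands for the suites named above
#   for lang in cpp csharp rust go python; do xlang_test_cmd "$lang"; done
```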

Documentation Requirements

  • API Changes: Update relevant documentation in docs/
  • Protocol Changes: Update specifications in docs/specification/
  • Examples: Provide working examples for new features
  • Migration Guides: Document breaking changes and migration paths

Development Workflow

Before Making Changes

  1. Read Specifications: Review relevant docs in docs/specification/
  2. Understand Architecture: Study the language-specific implementation structure
  3. Check Existing Tests: Look at existing test patterns and coverage
  4. Review Related Issues: Check GitHub issues for context

Making Changes

  1. Follow Language Conventions: Respect each language's idioms and patterns
  2. Maintain Performance: Profile performance-critical changes
  3. Add Tests: Include appropriate tests for new functionality
  4. Update Documentation: Update docs for API changes
  5. Format Code: Use language-specific formatters before committing

Debugging Guidelines

Protocol Issues

  • Use Python Mode: Set ENABLE_FORY_CYTHON_SERIALIZATION=0 for debugging
  • Check Specifications: Refer to protocol specs in docs/specification/
  • Cross-Language Testing: Use integration tests to verify compatibility

Performance Issues

  • Profile First: Use appropriate profilers for each language
  • Memory Analysis: Check for memory leaks and allocation patterns

Build Issues

  • Clean Builds: Use language-specific clean commands
  • Dependency Issues: Check version compatibility
  • Bazel Issues: Use bazel clean --expunge for deep cleaning

Language-Specific Debugging

  • Java: Set FORY_CODE_DIR to dump generated code; ENABLE_FORY_GENERATED_CLASS_UNIQUE_ID=false keeps stable generated class names.
  • Python: Use cython --cplus -a pyfory/serialization.pyx for annotated output; FORY_DEBUG=true python setup.py build_ext --inplace for debug builds.
  • C++: See docs/cpp_debug.md; generate compile_commands.json with bazel run :refresh_compile_commands.
  • Crash debugging: For macOS core dump setup, follow CONTRIBUTING.md.
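
The Java settings above can be combined in a wrapper so a single test run dumps generated code with stable class names. This is a sketch: it assumes FORY_CODE_DIR takes a directory path, and the name with_fory_codegen_dump is hypothetical.

```bash
# Hypothetical wrapper: run a command with Fory's Java codegen debug settings.
# $1 is the dump directory (treating FORY_CODE_DIR as a directory path is an
# assumption); the remaining arguments are the command to run.
with_fory_codegen_dump() {
  dump_dir="$1"
  shift
  mkdir -p "$dump_dir" || return 1
  FORY_CODE_DIR="$dump_dir" \
    ENABLE_FORY_GENERATED_CLASS_UNIQUE_ID=false \
    ENABLE_FORY_DEBUG_OUTPUT=1 \
    "$@"
}

# Usage (from the java directory):
#   with_fory_codegen_dump /tmp/fory_generated \
#     mvn -T16 test -Dtest=org.apache.fory.TestClass#testMethod
```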

Profiling

  • C++: DTrace-based stack sampling is documented in CONTRIBUTING.md.

IDE Notes

  • IntelliJ IDEA: Java modules target different bytecode levels (Java 8/11). Use a JDK 11+ project SDK and disable --release if it blocks sun.misc.Unsafe access (see CONTRIBUTING.md).

CI/CD Understanding

GitHub Actions Workflows

  • ci.yml: Main CI workflow for all languages
  • build-native-*.yml: macOS/Windows python wheel build workflows
  • build-containerized-*.yml: Containerized python wheel build workflows for linux
  • lint.yml: Code formatting and linting
  • pr-lint.yml: PR-specific checks

Fixing GitHub CI Errors

Use the GitHub CLI (gh) to inspect and fix CI failures:

# List all checks for a PR and their status
gh pr checks <PR_NUMBER> --repo apache/fory

# View failed job logs (get job ID from pr checks output)
gh run view <RUN_ID> --repo apache/fory --job <JOB_ID> --log-failed

# View full job logs
gh run view <RUN_ID> --repo apache/fory --job <JOB_ID> --log

# Example workflow for fixing CI errors:
# 1. List checks to find failing jobs
gh pr checks 2942 --repo apache/fory

# 2. Get the failed job logs (RUN_ID and JOB_ID from step 1)
gh run view 19735911308 --repo apache/fory --job 56547673283 --log-failed

# 3. Fix the issues based on error messages
# 4. Commit and push fixes

Common CI failures and fixes:

  • Code Style Check: Run formatters (clang-format, prettier, spotless:apply, etc.)
  • Markdown Lint: Run prettier --write <file> for markdown files
  • C++ Build Errors: Check for missing dependencies or header includes
  • Test Failures: Run tests locally to reproduce and fix

PR and Benchmark Expectations

  • PR titles: Follow Conventional Commits; CI uses .github/workflows/pr-lint.yml to enforce naming.
  • Performance changes: Use the perf type and include benchmark data (see benchmarks/java/README.md).

Commit Message Format

Use conventional commits with language scope:

feat(java): add codegen support for xlang serialization
fix(rust): fix collection header when collection is empty
docs(python): add docs for xlang serialization
refactor(java): unify serialization exceptions hierarchy
perf(cpp): optimize buffer allocation in encoder
test(integration): add cross-language reference cycle tests
ci: update build matrix for latest JDK versions
chore(deps): update guava dependency to 32.0.0
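
A title can be checked locally before pushing. The pattern below is a rough sketch built only from the example types shown above; the authoritative rules live in .github/workflows/pr-lint.yml and may accept more types or scopes. The function name is_conventional_title is hypothetical.

```bash
# Hypothetical check: does a title look like "type(scope): summary"?
# The type list comes from the examples above and may be incomplete.
is_conventional_title() {
  printf '%s\n' "$1" |
    grep -Eq '^(feat|fix|docs|refactor|perf|test|ci|chore)(\([a-z0-9_-]+\))?: .+'
}

# Usage:
#   is_conventional_title "perf(cpp): optimize buffer allocation in encoder" \
#     && echo "title looks fine"
```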