[pull] main from inclusionAI:main#15

Merged
pull[bot] merged 2 commits into axistore80-coder:main from inclusionAI:main
Mar 24, 2026

Conversation


@pull pull bot commented Mar 24, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)


Adiactive and others added 2 commits March 24, 2026 15:58
VLM training (e.g. geometry3k with Qwen2.5-VL) fails during RPC
communication because PIL images and HuggingFace processors
(ProcessorMixin) are not JSON-serializable.

- Add SerializedPILImage: encodes PIL images as base64 PNG for
  rollout submit calls
- Add SerializedProcessor: archives processors via save_pretrained
  into a zip, mirroring the existing SerializedTokenizer pattern
- Wire both into serialize_value() and deserialize_value()
- Add round-trip tests for both types

Fixes: "Object of type JpegImageFile is not JSON serializable"
Fixes: "Object of type Qwen2_5_VLProcessor is not JSON serializable"
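The fix described above can be sketched with stdlib-only stand-ins (the real code would obtain PNG bytes via PIL's `Image.save(buf, format="PNG")` and rebuild the image with `Image.open`; the helper names below are hypothetical, not the actual `SerializedPILImage` API):

```python
import base64
import json

# Hypothetical sketch: wrap raw image bytes in a base64 envelope so the
# payload survives JSON serialization over RPC, then decode on the other
# side. Plain bytes stand in for PIL's PNG output here.

def serialize_image_bytes(png_bytes: bytes) -> dict:
    """Wrap binary image data in a JSON-safe envelope."""
    return {
        "__type__": "SerializedPILImage",
        "data": base64.b64encode(png_bytes).decode("ascii"),
    }

def deserialize_image_bytes(payload: dict) -> bytes:
    """Recover the original bytes from the envelope."""
    assert payload["__type__"] == "SerializedPILImage"
    return base64.b64decode(payload["data"])

# Round-trip through an actual JSON string, as the RPC layer would.
raw = b"\x89PNG\r\n\x1a\nfake-image-bytes"
wire = json.dumps(serialize_image_bytes(raw))
restored = deserialize_image_bytes(json.loads(wire))
assert restored == raw
```

The processor side follows the same envelope idea, except the bytes come from zipping a `save_pretrained` directory (mirroring the existing `SerializedTokenizer` pattern described in the commit).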
…1044)

* refactor(api): migrate allocation_mode to per-engine backend fields

Replace the centralized `allocation_mode` string with explicit `backend`
fields on `TrainEngineConfig` and `InferenceEngineConfig`.  Each engine
now owns its own backend+parallelism spec (e.g. `fsdp:d4`,
`sglang:d4t2`), eliminating implicit auto-backend selection and the
shared `AllocationMode` object.

Key changes:
- Add `backend` field to TrainEngineConfig and InferenceEngineConfig
- Add `ModelAllocation.from_str()` for single-component parsing
- Remove `AllocationMode` public export (replaced by `ModelAllocation`)
- Rename internal `AllocationMode` to `_AllocationMode` for SPMD
  launcher backward compatibility with FutureWarning
- Remove auto-backend selection — explicit backend prefix is now required
- Controllers (`TrainController`, `RolloutController`) parse `backend`
  directly instead of receiving `alloc_mode` from trainers
- `WeightUpdateMeta.alloc_mode` replaced by `gen_allocation`
  (single `ModelAllocation`)
- Add `RWTrainer` and `ArchonRWEngine` for reward model training
- Remove `get_model_update_meta()` helper (logic moved to trainers)
- Update all YAML configs, examples, docs (EN+ZH), and tests

BREAKING CHANGE: `AllocationMode` is removed from public API. Users must
migrate to per-engine `backend` fields. SPMD launchers emit deprecation
warnings.
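The per-engine spec strings above (`fsdp:d4`, `sglang:d4t2`) can be illustrated with a small parser. This is a hypothetical sketch of the `from_str` shape, not the actual `ModelAllocation` implementation; the accepted dimension letters and field names are assumptions:

```python
import re
from dataclasses import dataclass, field

# Hypothetical sketch: parse a per-engine backend spec such as
# "fsdp:d4" or "sglang:d4t2" into a backend name plus parallelism
# degrees (assumed here: d = data, t = tensor, p = pipeline).

_DIM_RE = re.compile(r"([dtp])(\d+)")

@dataclass
class ModelAllocationSketch:
    backend: str
    dims: dict = field(default_factory=dict)

    @classmethod
    def from_str(cls, spec: str) -> "ModelAllocationSketch":
        if ":" not in spec:
            # No auto-backend selection: an explicit prefix is required.
            raise ValueError(f"missing backend prefix in {spec!r}")
        backend, _, dim_part = spec.partition(":")
        dims = {k: int(v) for k, v in _DIM_RE.findall(dim_part)}
        return cls(backend=backend, dims=dims)

alloc = ModelAllocationSketch.from_str("sglang:d4t2")
assert alloc.backend == "sglang"
assert alloc.dims == {"d": 4, "t": 2}
```

Rejecting bare specs without a backend prefix matches the commit's removal of auto-backend selection: a spec like `d4` alone is now an error rather than an implicit choice.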

* chore(ci): fix backend specifier for vlm sft test

* fix: fix bare dims for actor backends

* chore(docs): fix reminder for bare allocation dims
@pull pull bot locked and limited conversation to collaborators Mar 24, 2026
@pull pull bot added the ⤵️ pull label Mar 24, 2026
@pull pull bot merged commit 93b572d into axistore80-coder:main Mar 24, 2026
