libsimlin: add JSON APIs for model variable retrieval #219

bpowers · 2026-01-28T16:34:05Z

Add three new functions to libsimlin for directly retrieving variable
details as JSON arrays: simlin_model_get_stocks_json,
simlin_model_get_flows_json, and simlin_model_get_auxs_json. These
functions allow callers to get stocks, flows, or auxiliaries without
serializing the entire project.

The new APIs follow the existing libsimlin memory ownership pattern -
the returned buffer is owned by the caller and must be freed with
simlin_free(). This provides a significant simplification for engine2
and pysimlin, which previously had to serialize the entire project to
JSON just to access variable information.

Changes include:

libsimlin: Three new FFI functions with proper documentation and tests
engine2: WASM bindings and Model class simplification using direct APIs
pysimlin: FFI bindings and Model class simplification using direct APIs

The simplification is particularly beneficial for large projects where
serializing the entire project is expensive when only variable info is
needed.

https://claude.ai/code/session_01PujQhTADskFsjJVY3GLuUQ

Add three new functions to libsimlin for directly retrieving variable details as JSON arrays: simlin_model_get_stocks_json, simlin_model_get_flows_json, and simlin_model_get_auxs_json. These functions allow callers to get stocks, flows, or auxiliaries without serializing the entire project. The new APIs follow the existing libsimlin memory ownership pattern - the returned buffer is owned by the caller and must be freed with simlin_free(). This provides a significant simplification for engine2 and pysimlin, which previously had to serialize the entire project to JSON just to access variable information. Changes include: - libsimlin: Three new FFI functions with proper documentation and tests - engine2: WASM bindings and Model class simplification using direct APIs - pysimlin: FFI bindings and Model class simplification using direct APIs The simplification is particularly beneficial for large projects where serializing the entire project is expensive when only variable info is needed. https://claude.ai/code/session_01PujQhTADskFsjJVY3GLuUQ

bpowers · 2026-01-28T16:34:35Z

@codex review

claude · 2026-01-28T16:35:53Z

Pull Request Review: libsimlin JSON APIs for Variable Retrieval

This PR introduces targeted JSON APIs for retrieving stocks, flows, and auxiliaries from models without requiring full project serialization. Overall, this is a well-designed performance optimization with good test coverage. Here are my findings:

✅ Strengths

Clear Performance Win: The motivation is sound - avoiding full project serialization when only variable information is needed will provide significant performance benefits for large projects.
Consistent API Design: All three functions (simlin_model_get_stocks_json, simlin_model_get_flows_json, simlin_model_get_auxs_json) follow identical patterns, making them predictable and maintainable.
Good Memory Ownership Model: The APIs correctly follow the established libsimlin pattern where returned buffers are caller-owned and must be freed with simlin_free().
Comprehensive Test Coverage: The Rust tests cover:
- Happy path (models with variables)
- Empty results (models without the requested variable type)
- Error cases (null model, null output pointers)
- Validation of returned JSON structure and content
Proper Documentation: Both the C header and Rust implementation have clear safety and ownership documentation.
Consistent Implementation Across Platforms: The changes are implemented consistently in engine2 (WASM/TypeScript) and pysimlin (Python), both simplifying their Model classes by eliminating the need for full project serialization.

🔍 Code Quality Observations

Rust Implementation (libsimlin/src/lib.rs)

Potential Issue - Lock Handling:

let project_locked = (*model_ref.project).project.lock().unwrap();

The code uses .unwrap() on a mutex lock, which will panic if the lock is poisoned. According to CLAUDE.md guidelines, .unwrap() is preferred over .unwrap_or_default() to catch assumption violations, but for a mutex lock in a public FFI function, this could crash the process. Consider:

Is this acceptable behavior for the FFI boundary?
Should poisoned locks be handled as errors returned to the caller?

Minor: Code Duplication
The three functions share 95% identical code structure. While this makes each function clear and independent, consider extracting a generic helper if this pattern needs to expand (e.g., for modules or other variable types):

fn get_variables_json<T, F>(
    model: *mut SimlinModel,
    out_buffer: *mut *mut u8,
    out_len: *mut usize,
    out_error: *mut *mut SimlinError,
    filter_map: F,
    error_context: &str,
) where
    F: Fn(&engine::datamodel::Variable) -> Option<T>,
    T: serde::Serialize,

However, given there are only 3 variable types and they're unlikely to expand, the current explicit approach is reasonable.

Memory Safety - Excellent:
The functions correctly:

Validate output pointers before use
Initialize output pointers to null/0 on entry
Use simlin_malloc and properly handle allocation failures
Use std::ptr::copy_nonoverlapping for safe memory copying

TypeScript Implementation (engine2)

Good Use of RAII Pattern:
The try/finally blocks ensure output pointers are always freed:

try {
  fn(model, outBufPtr, outLenPtr, outErrPtr);
  // ... error handling and result processing ...
  return data;
} finally {
  free(outBufPtr);
  free(outLenPtr);
  free(outErrPtr);
}

Proper Memory Management:
The code correctly:

Copies data from WASM using copyFromWasm() before freeing
Frees the buffer pointer returned by the FFI call
Uses TextDecoder for UTF-8 decoding

Minor - Repeated Code:
Similar to the Rust side, the three TypeScript wrapper functions are nearly identical. This is acceptable given:

Clear, simple functions
Type safety per function
Low likelihood of changes

Python Implementation (pysimlin)

Correct CFFI Usage:

output_ptr = ffi.new("uint8_t **")
output_len_ptr = ffi.new("uintptr_t *")
# ...
try:
    return bytes(ffi.buffer(output_ptr[0], output_len_ptr[0]))
finally:
    lib.simlin_free(output_ptr[0])

Properly uses CFFI's buffer protocol and ensures cleanup in finally block.

Good Error Handling:
Uses the existing check_out_error() helper for consistent error translation to Python exceptions.

🎯 Model Class Simplification

The changes successfully simplify the Model classes in both engine2 and pysimlin:

Before:

model = self._get_model_json()  # Serializes entire project
self._cached_stocks = ... for s in model.stocks ...

After:

json_bytes = get_stocks_json(self._ptr)  # Direct variable access
stocks_json = json.loads(json_bytes.decode("utf-8"))

This is cleaner and more efficient, especially for workflows that only query variable information.

🔒 Security

No Security Concerns Identified:

Proper bounds checking on buffer operations
Safe UTF-8 handling
No unsafe casting or pointer arithmetic beyond standard FFI patterns
Memory allocation failures handled correctly

⚡ Performance

Expected Benefits:

Reduced serialization cost: For large projects, serializing only the needed variables (stocks, flows, or auxs) rather than the entire project will significantly reduce CPU and memory usage.
Lower memory pressure: Smaller JSON payloads mean less allocation and GC pressure.
Better cache locality: Operations that only need one variable type won't thrash the cache with unneeded data.

Potential Concern:
Each call still requires:

JSON serialization (via serde_json::to_vec)
UTF-8 decoding
JSON parsing on the client side

For extremely high-frequency calls, a more direct structured API might be faster, but for typical usage patterns this should be fine.

📝 Testing

Test Quality: Excellent

The tests verify:

✅ Correct JSON structure and content
✅ Empty results for models without variables
✅ Null pointer error handling
✅ Memory ownership (tests free returned buffers)

Missing Test Coverage:
Consider adding tests for:

Large models: Verify performance characteristics with models containing hundreds/thousands of variables
Unicode in variable names: Ensure UTF-8 encoding handles non-ASCII characters correctly
Concurrent access: If the model can be accessed from multiple threads, verify thread safety

🎨 Style & Conventions

Commit Message: The commit follows the repository style perfectly:

Component prefix: libsimlin:
Lowercase imperative: add JSON APIs for model variable retrieval
Good body with context and reasoning
Appropriate length

Code Style: All code follows established patterns in the codebase. No style issues.

🤔 Questions/Suggestions

Should these APIs be extended to modules? If users want to query variables by module, would a similar API be useful?
Benchmark results? It would be valuable to add a comment documenting the performance improvement for a representative large model (e.g., "For the World3 model with 500+ variables, this reduces query time from 15ms to 2ms").
API Completeness: Consider if there are other common query patterns that would benefit from similar optimizations (e.g., getting all variables of a specific dimension, or variables matching a name pattern).

✅ Verdict

LGTM with minor suggestions

This PR is well-implemented, properly tested, and provides a clear performance benefit. The code quality is high, follows repository conventions, and introduces no security concerns. The only substantive question is about mutex lock handling in FFI functions - the current behavior (panic on poisoned lock) may be acceptable, but worth confirming.

Recommended before merge:

Confirm the mutex .unwrap() panic behavior is acceptable for FFI
Consider adding a comment documenting expected performance wins
Consider adding tests for Unicode variable names

Nice work! 🚀

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a4bdac1488

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-01-28T16:37:07Z

src/libsimlin/src/lib.rs

+    // Extract stocks from the datamodel and convert to JSON types
+    let stocks: Vec<engine::json::Stock> = datamodel_model
+        .variables
+        .iter()
+        .filter_map(|var| match var {
+            engine::datamodel::Variable::Stock(stock) => {
+                Some(engine::json::Stock::from(stock.clone()))
+            }


Preserve deterministic variable ordering

The new JSON APIs emit variables in the datamodel’s insertion order, but the existing JSON serialization path sorts stocks/flows/auxiliaries by canonical identifier for determinism (json::Model does sort_by_key(canonicalize) in src/simlin-engine/src/json.rs). Since the JS/Python Model.* accessors previously used project serialization, callers implicitly got a stable, canonical order. With this change, the same model can return a different order depending on import/edit history, which can break deterministic outputs and UI/tests that assume sorted lists. Consider sorting the collected vectors before serializing so these APIs match the prior behavior.

Useful? React with 👍 / 👎.

codecov · 2026-01-28T16:41:21Z

Codecov Report

❌ Patch coverage is 71.28713% with 29 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.79%. Comparing base (e3daec4) to head (a4bdac1).
⚠️ Report is 2 commits behind head on main.

Files with missing lines	Patch %	Lines
src/libsimlin/src/lib.rs	71.28%	29 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #219      +/-   ##
==========================================
- Coverage   76.82%   76.79%   -0.03%     
==========================================
  Files          69       69              
  Lines       20523    20624     +101     
==========================================
+ Hits        15766    15839      +73     
- Misses       4757     4785      +28

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

chatgpt-codex-connector bot reviewed Jan 28, 2026

View reviewed changes

bpowers closed this Jan 29, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

libsimlin: add JSON APIs for model variable retrieval #219

libsimlin: add JSON APIs for model variable retrieval #219

Uh oh!

bpowers commented Jan 28, 2026

Uh oh!

bpowers commented Jan 28, 2026

Uh oh!

claude bot commented Jan 28, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Jan 28, 2026

Uh oh!

codecov bot commented Jan 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

libsimlin: add JSON APIs for model variable retrieval #219

libsimlin: add JSON APIs for model variable retrieval #219

Uh oh!

Conversation

bpowers commented Jan 28, 2026

Uh oh!

bpowers commented Jan 28, 2026

Uh oh!

claude bot commented Jan 28, 2026

Pull Request Review: libsimlin JSON APIs for Variable Retrieval

✅ Strengths

🔍 Code Quality Observations

Rust Implementation (libsimlin/src/lib.rs)

TypeScript Implementation (engine2)

Python Implementation (pysimlin)

🎯 Model Class Simplification

🔒 Security

⚡ Performance

📝 Testing

🎨 Style & Conventions

🤔 Questions/Suggestions

✅ Verdict

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Jan 28, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants