feat: allow specifying actuator in csv sample store #455

michael-johnston · 2026-01-22T15:46:23Z

Adds the ability to specify an actuator when importing data from a CSV file. This allows data obtained elsewhere (possibly from ado itself) to appear as ado data, so it can be used as memoized data in a space that uses that experiment.

The actuator/experiment combination is validated against the existing actuators and, if validation passes, the data is imported.
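
As an illustration of the validation described above, here is a minimal sketch. It is not the code in this PR: `ActuatorRegistry`, its `catalog` layout, and the identifiers used are hypothetical stand-ins. The only behaviour taken from the PR is that the combination is checked against known actuators and that validation is skipped when the actuator is `replay`.

```python
from dataclasses import dataclass, field

REPLAY_ACTUATOR = "replay"  # assumed identifier for externally obtained data


@dataclass
class ActuatorRegistry:
    """Hypothetical registry mapping actuator ids to their experiment ids."""

    catalog: dict[str, set[str]] = field(default_factory=dict)

    def validate(self, actuator_id: str, experiment_id: str) -> None:
        # Replayed (external) data has no registered experiment to check.
        if actuator_id == REPLAY_ACTUATOR:
            return
        if actuator_id not in self.catalog:
            raise ValueError(f"Unknown actuator: {actuator_id!r}")
        if experiment_id not in self.catalog[actuator_id]:
            raise ValueError(
                f"Actuator {actuator_id!r} has no experiment {experiment_id!r}"
            )


registry = ActuatorRegistry({"my_actuator": {"benchmark_v1"}})
registry.validate("my_actuator", "benchmark_v1")  # known combination: passes
registry.validate("replay", "any_experiment")     # replay: validation skipped
```

Raising on an unknown combination before import means a typo in the actuator or experiment name is caught before any rows are stored under it.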

Notes:

- Changes the schema of CSVSampleStoreDescriptor to simplify the code. An upgrade path and warnings are provided.
- Column names are no longer converted to lower case automatically. Users must do this in the YAML if desired.
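
For anyone who relied on the old automatic lower-casing, a small helper can generate the explicit mapping to paste into the YAML. This is only a sketch: `lowercase_property_map` is a hypothetical helper, and it assumes the dictionary form of the property maps (property name as key, CSV column as value) shown in the docs suggestion in this PR.

```python
import csv
import io


def lowercase_property_map(csv_text: str, columns: list[str]) -> dict[str, str]:
    """Build a {property_name: csv_column} map with lower-cased property names."""
    header = next(csv.reader(io.StringIO(csv_text)))
    missing = [c for c in columns if c not in header]
    if missing:
        raise KeyError(f"Columns not found in CSV header: {missing}")
    # Keep the original column name as the value so the CSV lookup still works.
    return {column.lower(): column for column in columns}


sample = "ID,CPU_Value,Memory_GB\ne1,4,16\n"
print(lowercase_property_map(sample, ["CPU_Value", "Memory_GB"]))
# {'cpu_value': 'CPU_Value', 'memory_gb': 'Memory_GB'}
```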

DRL-NextGen · 2026-01-22T16:02:17Z

Checks Summary

Last run: 2026-01-26T20:34:03.488Z

Code Risk Analyzer vulnerability scan found 2 vulnerabilities:

| Severity | Identifier | Package | Details | Fix |
| --- | --- | --- | --- | --- |
| ◻ Unknown | CVE-2025-53000 | nbconvert | nbconvert has an uncontrolled search path that leads to unauthorized code execution on Windows. GHSA-xm59-rqc7-hhvf. nbconvert:7.16.6->ado-core:1.3.3 | >7.16.6 |
| ◻ Unknown | CVE-2026-0994 | protobuf | protobuf affected by a JSON recursion depth bypass. GHSA-7gcm-g887-7qv7. protobuf:6.33.4->ado-core:1.3.3,protobuf:6.33.4,vllm:0.14.1 | >6.33.4 |

Mend Unified Agent vulnerability scan found 3 vulnerabilities:

| Severity | Identifier | Package | Details | Fix |
| --- | --- | --- | --- | --- |
| ❗ Critical | CVE-2025-56005 | ply-3.11-py2.py3-none-any.whl | An undocumented and unsafe feature in the PLY (Python Lex-Yacc) library 3.11 allows Remote Code Execution (RCE) via the "picklefile" parameter in the "yacc()" function. This parameter accepts a ".pkl" file that is deserialized with "pickle.load()" without validation. Because "pickle" allows execution of embedded code via `__reduce__()`, an attacker can achieve code execution by passing a malicious pickle file. The parameter is not mentioned in official documentation or the GitHub repository, yet it is active in the PyPI version. This introduces a stealthy backdoor and persistence risk. | Not Available |
| 🔺 High | CVE-2025-53000 | nbconvert-7.16.6-py3-none-any.whl | The nbconvert tool, jupyter nbconvert, converts Jupyter notebooks to various other formats via Jinja templates. Versions of nbconvert up to and including 7.16.6 on Windows have a vulnerability in which converting a notebook containing SVG output to a PDF results in unauthorized code execution. Specifically, a third party can create a "inkscape.bat" file that defines a Windows batch script, capable of arbitrary code execution. When a user runs "jupyter nbconvert --to pdf" on a notebook containing SVG output to a PDF on a Windows platform from this directory, the "inkscape.bat" file is run unexpectedly. As of time of publication, no known patches exist. | Not Available |
| 🔺 High | CVE-2026-0994 | protobuf-6.33.4-cp39-abi3-manylinux2014_x86_64.whl | A denial-of-service (DoS) vulnerability exists in google.protobuf.json_format.ParseDict() in Python, where the max_recursion_depth limit can be bypassed when parsing nested google.protobuf.Any messages. Due to missing recursion depth accounting inside the internal Any-handling logic, an attacker can supply deeply nested Any structures that bypass the intended recursion limit, eventually exhausting Python’s recursion stack and causing a RecursionError. | Not Available |

orchestrator/core/samplestore/base.py

orchestrator/core/samplestore/csv.py

website/docs/resources/sample-stores.md

Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com> Signed-off-by: Michael Johnston <66301584+michael-johnston@users.noreply.github.com>

- Moved all logic related to experiments into ExperimentDescriptors
- Included constitutive properties
- Two subclasses: one for External (Replay) experiments, one for Internal
- This greatly simplifies CSVSampleStoreDescriptor
- This also simplifies CSVSampleStore
- Provide a warning and upgrade path for old formats.

Signed-off-by: Michael Johnston <66301584+michael-johnston@users.noreply.github.com>

SampleStoreReference parameters should be a dict but tests assumed it was a pydantic model

Code updated. Rereview required

michael-johnston · 2026-01-26T11:00:27Z

@AlessandroPomponio What is the process to resolve the vulnerability scan issues? I can't see anything reported in the pipeline.

AlessandroPomponio · 2026-01-26T13:22:40Z

website/docs/resources/sample-stores.md

+The key here is that you **must** define which columns in the CSV are observed properties
+and which are constitutive properties.
+If you want to use the column names directly as observed/constitutive property names
+you can pass a list to the relevant field.
+If you want to define new observed/constitutive property names for each column you
+can pass a dictionary.


This is what AI suggests rewriting this to, to make it clearer. I like this one better, with the example next to the explanation. Feel free to edit.

You must specify which CSV columns contain observed properties (measurements/results) and which contain constitutive properties (input parameters/configurations).

**Two ways to map columns:**

1. **Use CSV column names as-is** - Pass a list:

   ```yaml
   constitutivePropertyMap:
     - cpu_value
     - memory_gb
   ```

   This uses `cpu_value` and `memory_gb` as both the column names AND property names.

2. **Rename columns** - Pass a dictionary:

   ```yaml
   observedPropertyMap:
     wallClockRuntime: 'wall-clock runtime'
     throughput: 'requests_per_sec'
   ```

   This reads from CSV columns `wall-clock runtime` and `requests_per_sec`, but names them `wallClockRuntime` and `throughput` in the experiment.
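
To make the list-versus-dictionary distinction concrete, here is a minimal Python sketch of how the two forms could be reduced to one lookup table. `normalize_property_map` and `extract_properties` are hypothetical helpers written for this comment, not functions from the ado code base; the column and property names are the ones from the suggestion above.

```python
from typing import Union

PropertyMap = Union[list[str], dict[str, str]]


def normalize_property_map(property_map: PropertyMap) -> dict[str, str]:
    """Return {property_name: csv_column} for either accepted form."""
    if isinstance(property_map, dict):
        return dict(property_map)
    # List form: the column name doubles as the property name.
    return {name: name for name in property_map}


def extract_properties(row: dict[str, str], property_map: PropertyMap) -> dict[str, str]:
    """Pick the mapped columns out of one CSV row, keyed by property name."""
    mapping = normalize_property_map(property_map)
    return {prop: row[col] for prop, col in mapping.items()}


row = {
    "cpu_value": "4",
    "memory_gb": "16",
    "wall-clock runtime": "12.5",
    "requests_per_sec": "310",
}
constitutive = extract_properties(row, ["cpu_value", "memory_gb"])
observed = extract_properties(
    row, {"wallClockRuntime": "wall-clock runtime", "throughput": "requests_per_sec"}
)
print(constitutive)  # {'cpu_value': '4', 'memory_gb': '16'}
print(observed)      # {'wallClockRuntime': '12.5', 'throughput': '310'}
```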

AlessandroPomponio · 2026-01-26T13:27:38Z

orchestrator/core/samplestore/csv.py

+            idColumn: Column containing entity identifiers
+            generatorIdentifier: Optional identifier for the entity generator
+            experimentIdentifier: Experiment identifier
+            actuatorIdentifier: Actuator identifier (defaults to 'replay')


This defaults to None in the signature
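
One way to keep the docstring and the signature consistent is to say explicitly that `None` is resolved to `'replay'`. The sketch below assumes that this resolution is the intended behaviour, which the diff shown here does not confirm; `resolve_actuator` and `DEFAULT_ACTUATOR` are hypothetical names.

```python
from typing import Optional

DEFAULT_ACTUATOR = "replay"  # assumed fallback, taken from the docstring under review


def resolve_actuator(actuatorIdentifier: Optional[str] = None) -> str:
    """Return the actuator identifier to use.

    Args:
        actuatorIdentifier: Actuator identifier. If None, 'replay' is used.
    """
    # The signature default stays None; the fallback is applied here.
    return DEFAULT_ACTUATOR if actuatorIdentifier is None else actuatorIdentifier


print(resolve_actuator())               # replay
print(resolve_actuator("my_actuator"))  # my_actuator
```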

michael-johnston added 8 commits January 22, 2026 13:11

feat(core): allow passing actuatorIdentifier to ExperimentDescription

6ed3705

feat(core): validate experiment and actuators in CSVSampleStoreDescri…

7beeb94

…ption

feat(core): enable specifying actuator and validation in from_csv

ae057f9

fix(core): skip experiment validation if actuator is replay

1fd6c7d

test(samplestore): testing actuator field

952feb8

Merge remote-tracking branch 'origin/main' into maj_csv_source_extension

36148d8

chore(core): use Defaultable

67d09ca

docs(website): importing data

756f0da

AlessandroPomponio previously requested changes Jan 22, 2026

Apply suggestions from code review

0b42c7e

Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com> Signed-off-by: Michael Johnston <66301584+michael-johnston@users.noreply.github.com>

michael-johnston marked this pull request as draft January 23, 2026 11:02

michael-johnston and others added 8 commits January 23, 2026 14:39

refactor(core): Update example

ed49c22

chore(docs): remove old doc

3fe7474

docs(core): Add info on exception

849a035

chore(core): No need to get global registry inside registry instance

aee09c6

docs(website): update

c70f6c7

chore(core): field descriptions

b922010

Merge branch 'main' into maj_csv_source_extension

3bf1ca3

Signed-off-by: Michael Johnston <66301584+michael-johnston@users.noreply.github.com>

michael-johnston requested a review from AlessandroPomponio January 23, 2026 15:18

michael-johnston marked this pull request as ready for review January 23, 2026 15:19

michael-johnston added 8 commits January 26, 2026 09:47

chore(tests): remove stray field

88778c5

refactor(core): Update function

c028482

refactor(core): Update function

8ee2505

chore(core): additional check

44d516e

chore(tests): Update fixutre init

d8c487a

fix(tests): Assuming model instead of dict

e64e978

SampleStoreReference parameters should be a dict but tests assumed it was a pydantic model

Merge remote-tracking branch 'origin/maj_csv_source_extension' into m…

d308b8d

…aj_csv_source_extension

test(samplestore): csv sample store validation

8435ff4

chore(black): formatting

4d9cb73

AlessandroPomponio requested changes Jan 26, 2026

Apply suggestions from code review

a03580a

Co-authored-by: Alessandro Pomponio <10339005+AlessandroPomponio@users.noreply.github.com> Signed-off-by: Michael Johnston <66301584+michael-johnston@users.noreply.github.com>


4 participants
