-
Notifications
You must be signed in to change notification settings - Fork 3
Consistency check script #114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
CIGbalance
wants to merge
4
commits into
main
Choose a base branch
from
feat/ga_consistency
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,31 @@ | ||
| # OPL YAML utils | ||
|
|
||
| This folder contains utility scripts for working with the YAML format to describe problems in context of OPL. They are mainly intended to be run automatically via GitHub Actions to make collaboration easier. | ||
|
|
||
| The intended way of adding a new problem to the repository is thus as follows: | ||
|
|
||
| * Change the [new_problem.yaml](new_problem.yaml) template file to fit the new problem. | ||
| * Create a PR with the changes (for example with a fork). | ||
|
|
||
| What happens in the background then is: | ||
|
|
||
| * On PR creation and commits to the PR, the [validate_yaml.py](validate_yaml.py) script is run to check that the YAML file is valid and consistent. It is expecting the changes to be in the [new_problem.yaml](new_problem.yaml) file. | ||
| * Then the PR should be reviewed manually. | ||
| * When the PR is merged into the main branch, a second script runs (which doesn't exist yet), that adds the content of [new_problem.yaml](new_problem.yaml) to the [problems.yaml](../problems.yaml) file, and reverts the changes to the new_problem.yaml. | ||
|
|
||
| :alert: Note that the GitHubActions do not exist yet either, this is a WIP. | ||
|
|
||
| ## validate_yaml.py | ||
|
|
||
| This script checks the new content for the following: | ||
|
|
||
| * The YAML syntax is valid and is in expected format | ||
| * The required fields are present. | ||
| * Specific fields are unique across the new set of problems (e.g. name) | ||
|
|
||
| :alert: Execute from root of the repository. Tested with python 3.12 | ||
|
|
||
| ```bash | ||
| pip install -r utils/requirements.txt | ||
| python utils/validate_yaml.py utils/new_problem.yaml | ||
| ``` |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| - name: template | ||
| suite/generator/single: suite | ||
| objectives: '1' | ||
| dimensionality: scalable | ||
| variable type: continuous | ||
| constraints: 'no' | ||
| dynamic: 'no' | ||
| noise: 'no' | ||
| multimodal: 'yes' | ||
| multi-fidelity: 'no' | ||
| reference: '' | ||
| implementation: '' | ||
| source (real-world/artificial): '' | ||
| textual description: 'This is a dummy template' |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| pyyaml | ||
| pandas |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,136 @@ | ||
| import yaml | ||
|
|
||
| import sys | ||
| from pathlib import Path | ||
|
|
||
| # Add parent directory to sys.path | ||
| parent = Path(__file__).resolve().parent.parent | ||
| sys.path.insert(0, str(parent)) | ||
|
|
||
| # Now you can import normally | ||
| from yaml_to_html import default_columns as REQUIRED_FIELDS | ||
|
|
||
| OPTIONAL_FIELDS = ["multimodal"] | ||
| UNIQUE_FIELDS = ["name"] | ||
| NON_EMPTY_FIELDS = ["name"] | ||
| UNIQUE_WARNING_FIELDS = ["reference", "implementation"] | ||
| PROBLEMS_FILE = "problems.yaml" | ||
|
|
||
|
|
||
| def read_data(filepath): | ||
| try: | ||
| with open(filepath, "r") as f: | ||
| data = yaml.safe_load(f) | ||
| return 0, data | ||
| except FileNotFoundError: | ||
| print(f"::error::File not found: {filepath}") | ||
| return 1, None | ||
| except yaml.YAMLError as e: | ||
| print(f"::error::YAML syntax error: {e}") | ||
| return 1, None | ||
|
|
||
|
|
||
| def check_format(data): | ||
| num_problems = len(data) | ||
| if len(data) < 1: | ||
| print("::error::YAML file should contain at least one top level entry.") | ||
| return False | ||
| print(f"::notice::YAML file contains {num_problems} top-level entries.") | ||
| unique_fields = [] | ||
| for i, entry in enumerate(data): | ||
| if not isinstance(entry, dict): | ||
| print(f"::error::Entry {i} is not a dictionary.") | ||
| return False | ||
| unique_fields.append({k: v for k, v in entry.items() if k in UNIQUE_FIELDS}) | ||
| for k in UNIQUE_FIELDS: | ||
| values = [entry[k] for entry in unique_fields] | ||
| if len(values) != len(set(values)): | ||
| print(f"::error::Field '{k}' must be unique across all entries.") | ||
| return False | ||
| return True | ||
|
|
||
|
|
||
| def check_fields(data): | ||
| missing = [field for field in REQUIRED_FIELDS if field not in data] | ||
| if missing: | ||
| print(f"::error::Missing required fields: {', '.join(missing)}") | ||
| return False | ||
| new_fields = [ | ||
| field for field in data if field not in REQUIRED_FIELDS + OPTIONAL_FIELDS | ||
| ] | ||
| if new_fields: | ||
| print(f"::warning::New field added: {', '.join(new_fields)}") | ||
| # Check that the name is not still template | ||
| if data.get("name") == "template": | ||
| print( | ||
| "::error::Please change the 'name' field from 'template' to a unique name." | ||
| ) | ||
| return False | ||
| # Check non-empty fields | ||
| empty_fields = [ | ||
| field | ||
| for field in NON_EMPTY_FIELDS | ||
| if data.get(field, None) is None or data.get(field, "").strip() == "" | ||
| ] | ||
| if empty_fields: | ||
| print( | ||
| f"::error::The following fields cannot be empty: {', '.join(empty_fields)}" | ||
| ) | ||
| return False | ||
| return True | ||
|
|
||
|
|
||
| def check_novelty(data): | ||
| # Load existing problems | ||
| read_status, existing_data = read_data(PROBLEMS_FILE) | ||
| if read_status != 0: | ||
| print("::error::Could not read existing problems for novelty check.") | ||
| return False | ||
| assert existing_data is not None | ||
| for field in UNIQUE_FIELDS + UNIQUE_WARNING_FIELDS: | ||
| # skip empty fields | ||
| if not data.get(field): | ||
| continue | ||
| existing_values = { | ||
| entry.get(field) for entry in existing_data if isinstance(entry, dict) | ||
| } | ||
| if data.get(field) in existing_values: | ||
| if field in UNIQUE_WARNING_FIELDS: | ||
| print( | ||
| f"::warning::Field '{field}' with value '{data.get(field)}' already exists. Consider choosing a unique value." | ||
| ) | ||
| continue | ||
| elif field in UNIQUE_FIELDS: | ||
| print( | ||
| f"::error::Field '{field}' with value '{data.get(field)}' already exists. Please choose a unique value." | ||
| ) | ||
| return False | ||
| return True | ||
|
|
||
|
|
||
| def validate_yaml(filepath): | ||
| status, data = read_data(filepath) | ||
| if status != 0: | ||
| sys.exit(1) | ||
| if not check_format(data): | ||
| sys.exit(1) | ||
| assert data is not None | ||
|
|
||
| for i, new_data in enumerate(data): # Iterate through each top-level entry | ||
| # Check required and unique fields | ||
| if not check_fields(new_data) or not check_novelty(new_data): | ||
| print(f"::error::Validation failed for entry {i+1}.") | ||
| sys.exit(1) | ||
|
|
||
| # YAML is valid if we reach this point | ||
| print("YAML syntax is valid.") | ||
| sys.exit(0) | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| if len(sys.argv) < 2: | ||
| print("::error::Usage: python validate_yaml.py <yourfile.yaml>") | ||
| sys.exit(1) | ||
|
|
||
| filepath = sys.argv[1] | ||
| validate_yaml(filepath) |
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not a now issue, but when the new template is ready, there may be
REQUIRED_FIELDSthat are subfields to other fields. E.g., something like this:These would not pass the current checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks for pointing that out, I didn't think of that!