Refactor merge.py and add tests for it #165

Open
naved001 wants to merge 5 commits into CCI-MOC:main from naved001:refactor-merge-py

Conversation

@naved001
Collaborator

This refactors merge.py a bit by moving some things into functions. It also adds some basic tests, and this time I switched to pytest.

I ended up working on this because I realized I was adding more stuff in https://github.com/CCI-MOC/openshift-usage-scripts/pull/164/files and there were no tests.

Collaborator

@QuanMPhm QuanMPhm left a comment

I have a few small questions before I approve

Comment on lines +111 to +112
if cluster_name is None:
    cluster_name = metrics_from_file.get("cluster_name")

Collaborator

Is there any concern that cluster_name is not being checked? What if the provided files are from different clusters? It seems this behavior was in the code prior to this refactoring, but I wanted to ask just in case.

Collaborator Author

I don't think we checked that cluster_name could be different in different files, so this behavior is unchanged. But it doesn't hurt to add that additional check. I'll add that in a different PR.
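
For reference, the follow-up check could look roughly like the sketch below. It is only an illustration: `resolve_cluster_name` is a hypothetical helper, not something in merge.py, and skipping files that have no `cluster_name` key is an assumption about the desired behavior.

```python
import json
from typing import Iterable, Optional


def resolve_cluster_name(files: Iterable[str]) -> Optional[str]:
    """Return the cluster_name shared by all files, or raise if they disagree.

    Hypothetical helper for illustration; not part of merge.py.
    """
    cluster_name: Optional[str] = None
    for path in files:
        with open(path, "r") as jsonfile:
            metrics_from_file = json.load(jsonfile)
        name_in_file = metrics_from_file.get("cluster_name")
        if name_in_file is None:
            # Assumption: a file without a cluster_name is tolerated.
            continue
        if cluster_name is None:
            # The first file that names a cluster sets the expectation.
            cluster_name = name_in_file
        elif name_in_file != cluster_name:
            raise ValueError(
                f"{path} is from cluster {name_in_file!r}, "
                f"expected {cluster_name!r}"
            )
    return cluster_name
```

Raising on the first mismatch keeps a mixed set of files from being silently merged under whichever name happened to be read first.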

gpu_a100sxm4=rates_data.get_value_at(
cpu=rates_data.get_value_at("CPU SU Rate", report_month, Decimal), # type: ignore
gpu_a100=rates_data.get_value_at("GPUA100 SU Rate", report_month, Decimal), # type: ignore
gpu_a100sxm4=rates_data.get_value_at( # type: ignore

Collaborator

Are you using a linter in addition to the ruff that we use in CI? I didn't get any pre-commit errors when removing these comments.

Collaborator Author

I think it was VS Code yelling at me, so I added these, but I am going to remove them.

Comment on lines +87 to +95
with open(file, "r") as jsonfile:
    metrics_from_file = json.load(jsonfile)
    cpu_request_metrics = metrics_from_file["cpu_metrics"]
    memory_request_metrics = metrics_from_file["memory_metrics"]
    gpu_request_metrics = metrics_from_file.get("gpu_metrics", None)
    processor.merge_metrics("cpu_request", cpu_request_metrics)
    processor.merge_metrics("memory_request", memory_request_metrics)
    if gpu_request_metrics is not None:
        processor.merge_metrics("gpu_request", gpu_request_metrics)

Collaborator

Minor suggestion, but I think this is more concise:

Suggested change
for resource in ["cpu_metrics", "memory_metrics", "gpu_metrics"]:
    if resource == "gpu_metrics":
        if gpu_request_metrics := metrics_from_file.get(resource):
            processor.merge_metrics(resource, gpu_request_metrics)
    else:
        request_metrics = metrics_from_file[resource]
        processor.merge_metrics(resource, request_metrics)

If cpu_metrics and memory_metrics are always present, would it be fine to make the loop even simpler?

for resource in ["cpu_metrics", "memory_metrics", "gpu_metrics"]:
    if request_metrics := metrics_from_file.get(resource):
        processor.merge_metrics(resource, request_metrics)

Collaborator Author

good suggestion

Collaborator Author

Uh, while this is a good suggestion, it'll break things because of the old, clumsy naming. The files store the data under cpu_metrics, but the processor calls it cpu_request, so our neat little loop won't work. I am going to leave this as is.

Collaborator

Ah, sorry, the strings looked similar and I thought they were the same.
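
If the loop is ever revisited, one way to keep it while respecting the two naming schemes is an explicit mapping from file key to processor metric name. The sketch below is a suggestion only: `merge_file_metrics`, the two constants, and `RecordingProcessor` (a stand-in that just records calls) are hypothetical names. It keeps the existing behavior where CPU and memory are required and GPU is merged whenever the key is present, even if its value is empty.

```python
from typing import Any, Dict, List, Tuple

# File key -> name the processor expects; pairs taken from the snippet under review.
FILE_KEY_TO_METRIC_NAME = {
    "cpu_metrics": "cpu_request",
    "memory_metrics": "memory_request",
    "gpu_metrics": "gpu_request",
}
# Keys that may be absent from a file.
OPTIONAL_FILE_KEYS = {"gpu_metrics"}


class RecordingProcessor:
    """Hypothetical stand-in for the real processor; it only records calls."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Any]] = []

    def merge_metrics(self, metric_name: str, metrics: Any) -> None:
        self.calls.append((metric_name, metrics))


def merge_file_metrics(processor: Any, metrics_from_file: Dict[str, Any]) -> None:
    """Merge one parsed file into the processor, translating key names."""
    for file_key, metric_name in FILE_KEY_TO_METRIC_NAME.items():
        if file_key in OPTIONAL_FILE_KEYS:
            metrics = metrics_from_file.get(file_key)
            if metrics is None:
                continue  # optional data is simply skipped
        else:
            # Required data: a missing key raises KeyError, as before.
            metrics = metrics_from_file[file_key]
        processor.merge_metrics(metric_name, metrics)
```

Using `is None` rather than the walrus truthiness test matters here: an empty `gpu_metrics` dict is still passed to the processor, which matches the current code.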

    return processor


def load_metadata(files: List[str]) -> MetricsMetadata:

Collaborator Author

I think I could load data and metadata in a single loop instead of loading the files twice, and for that reason I don't like what I've done. I am going to refactor it again later.
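
A rough idea of what the single-pass version might look like, as a sketch under assumptions: `load_metrics_and_metadata` and the `merge_one` callback are hypothetical, and the `MetricsMetadata` shown here has only a `cluster_name` field, which may not match the real dataclass in this PR.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class MetricsMetadata:
    # Hypothetical field; the real dataclass in this PR may carry more.
    cluster_name: Optional[str] = None


def load_metrics_and_metadata(
    files: List[str],
    merge_one: Callable[[Dict[str, Any]], None],
) -> MetricsMetadata:
    """Open each file once, collecting metadata and merging data in one loop."""
    metadata = MetricsMetadata()
    for path in files:
        with open(path, "r") as jsonfile:
            metrics_from_file = json.load(jsonfile)
        # Metadata: the first file that names a cluster wins.
        if metadata.cluster_name is None:
            metadata.cluster_name = metrics_from_file.get("cluster_name")
        # Data: hand the same parsed payload to the merge step.
        merge_one(metrics_from_file)
    return metadata
```

Passing the merge step in as a callback keeps file I/O in one place and leaves the processor logic testable on plain dicts.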

@naved001 naved001 marked this pull request as draft November 19, 2025 21:59
naved001 added 4 commits March 2, 2026 15:16
Use MetricsMetadata dataclass to pass metrics metrics data.
Move tasks out of main into their own functions.

Slightly rearrange the order of operations in main.
And switch to pytests as the test runner.
@naved001 naved001 force-pushed the refactor-merge-py branch from 717ddb2 to 38e6c9f Compare March 3, 2026 17:38
@naved001 naved001 marked this pull request as ready for review March 3, 2026 18:59
Collaborator Author

naved001 commented Mar 3, 2026

@knikolla @QuanMPhm I think this is ready for review.

2 participants