Dynamic allocation #73

cmelone · 2024-07-31T20:00:53Z

We would like to use historical resource utilization data to predict future usage and set container memory and CPU requests in Kubernetes accordingly, with the goal of improving utilization.

To avoid negative impacts on the CI, we are starting with requests only.

Because we are only setting resource requests, not limits, jobs will still be able to use more CPU/RAM than we predict. The next phase of experimentation will focus on instituting limits.

The ability to predict usage and assign requests has been implemented in the gantry web API, but a few steps remain to complete this goal:
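As a rough illustration of the idea (not gantry's actual code), the sketch below averages past usage for a job and formats the result as Kubernetes quantities. The function name, the `cpu_mean`/`mem_mean` field names, and the floor values are hypothetical; the `KUBERNETES_CPU_REQUEST` and `KUBERNETES_MEMORY_REQUEST` keys assume the values reach the cluster through GitLab Runner's Kubernetes executor overwrite variables.

```python
from statistics import mean

MIB = 1024**2


def predict_requests(history, cpu_floor=0.25, mem_floor=256 * MIB):
    """Turn past usage samples into request values.

    history: list of dicts with hypothetical keys
        'cpu_mean' (cores) and 'mem_mean' (bytes), one per past build.
    Returns an empty dict when there is no history, so the caller can
    fall back to its static defaults.
    """
    if not history:
        return {}

    # Simple average of past usage, clamped to a floor so tiny jobs
    # still get a sane request. A real predictor may weight samples.
    cpu = max(mean(h["cpu_mean"] for h in history), cpu_floor)
    mem = max(mean(h["mem_mean"] for h in history), mem_floor)

    return {
        # Kubernetes CPU quantity in millicores, e.g. "2000m"
        "KUBERNETES_CPU_REQUEST": f"{round(cpu * 1000)}m",
        # Kubernetes memory quantity in mebibytes, e.g. "3072Mi"
        "KUBERNETES_MEMORY_REQUEST": f"{round(mem / MIB)}Mi",
    }
```

Only requests are produced here, so a job that exceeds the prediction is not killed; it just competes for whatever spare capacity the node has.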

- Merge CI scriptable config into spack
- Perform tests to ensure the system will function as expected
- Enable gantry in spack
- Monitor results to see if resources are indeed being optimized and fix any issues
- Implement assigning resource limits after designing an accurate prediction algorithm

cmelone · 2024-10-29T14:58:20Z

I've now deployed gantry to the CI staging cluster and will be testing how well it works. Things to experiment with:

setting requests=limits
- this would guarantee that jobs cannot escape their "cocoon" of resources, but it would also mean leaving mem_max - mem_mean on the table (memory reserved for the peak but idle on average), in exchange for more stability in the CI
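A minimal sketch of what requests=limits could look like, assuming the limit is sized to the observed memory peak. The function names are hypothetical; the returned dict mirrors the `requests`/`limits` layout of a Kubernetes container `resources` stanza.

```python
import math

MIB = 1024**2


def pinned_resources(cpu_cores, mem_max_bytes):
    """Build a resources stanza where requests equal limits."""
    spec = {
        "cpu": f"{round(cpu_cores * 1000)}m",
        # Round memory up so the limit never falls below the observed peak.
        "memory": f"{math.ceil(mem_max_bytes / MIB)}Mi",
    }
    # Separate copies so later edits to one side do not leak into the other.
    return {"requests": dict(spec), "limits": dict(spec)}


def reserved_headroom(mem_max_bytes, mem_mean_bytes):
    """Memory reserved for the peak but unused on average (mem_max - mem_mean)."""
    return mem_max_bytes - mem_mean_bytes
```

For example, a job peaking at 3 GiB with a 2 GiB mean would hold 1 GiB that no other pod can be scheduled against. When requests equal limits for both CPU and memory on every container, Kubernetes places the pod in the Guaranteed QoS class, which is where the stability gain comes from.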

how to evaluate: compare the average duration of each spec before and after dynamic allocation to see how performance has been affected. Possibly compare cost as well?
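The before/after comparison could be sketched as follows. This assumes build records are available as `(spec, duration_seconds)` pairs; that record shape and the function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def duration_change(before, after):
    """Relative change in mean build duration per spec.

    before, after: iterables of (spec, duration_seconds) pairs collected
    before and after dynamic allocation was enabled.
    Returns {spec: (after_mean - before_mean) / before_mean} for specs
    present in both periods. Positive values mean builds got slower.
    """

    def averages(rows):
        by_spec = defaultdict(list)
        for spec, duration in rows:
            by_spec[spec].append(duration)
        return {spec: mean(values) for spec, values in by_spec.items()}

    b, a = averages(before), averages(after)
    # Only specs seen in both periods can be compared.
    return {
        spec: (a[spec] - b[spec]) / b[spec]
        for spec in b.keys() & a.keys()
        if b[spec] > 0
    }
```

A plain mean is sensitive to outliers such as a single stalled build, so a median or a minimum sample count per spec may be worth considering before drawing conclusions.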

cmelone added the "feature" label (New feature or request) Jul 31, 2024

cmelone self-assigned this Jul 31, 2024

cmelone mentioned this issue Jul 31, 2024

Project roadmap #71

Open

5 tasks
