Conversation
|
I feel like this makes the code less understandable to me. Look at all these new functions which take some of their state from Why can't all these new methods be functions and not have access to any kind of |
|
I agree that it looks a bit awkward for the functions to have so many parameters. My initial idea for structuring it this way was thinking ahead to a future refactor where strategy becomes a Parsl plugin. I thought it would be easier to move the functions around if they were more decoupled from object state. For the remaining parameters, the main reason I avoided putting them in self is that their state changes on every strategy execution. One alternative could be to introduce something like an ExecutorState object that encapsulates this evolving state, with an update_state() function returning the refreshed snapshot. I think that would be better. My main motivation overall was that, while writing the unit tests, it was quite difficult to reason about the possible strategy cases, so I tried to make it more explicit. I’ll make these changes and then you can let me know what you think. |
|
Thanks for taking a look at this, @Arthur-Prince ! I'll aim to get some time to review it this weekend |
|
Did you look? what do you think? |
|
Thanks for following up. Sorry, I haven't got to it yet |
WardLT
left a comment
There was a problem hiding this comment.
I think it's good to go, but do have another request if you have bandwidth and because you just reviewed this code's internals:
Could you document what the options are and how they differ in the elasticity documentation? A brief section after Parallelism would be excellent.
Does that work for you? If not, I'll just merge as-is.
| logger.debug('%s Executor %d active tasks, %d active slots, and %d/%d running/pending blocks', | ||
| prefix, self.executors[label].active_tasks, self.executors[label].active_slots, running, pending) | ||
|
|
||
| def _case_1_no_tasks(self, executor: BlockProviderExecutor, prefix: str,) -> None: |
There was a problem hiding this comment.
Would you mind adding docstring to these functions? Something brief is fine.
|
Also, I do agree that the class doesn't make much sense, but that might be a large change and the plugins will need some more thought. Adjusting the functions and - ideally - documenting expected behavior moves us towards that direction. |
I can do that, but it might take me about a week.
Sure, I'll add them.
Yes, that was also the direction we were planning on PR #4075, but in smaller and easier-to-review steps. |
|
I forgot to mention one behavior change related to case 1. Previously, scale-in would only happen when the executor idle duration In practice this edge case was previously masked by the |
WardLT
left a comment
There was a problem hiding this comment.
I'd like to propose a different direction for refactoring: getting rid of the logic where we implicitly define the strategy_type being passed between function by picking which of the wrapper functions (e.g., _strategy_simple) we choice, and instead making strategy_type a class attribute.
Setting strategy_type as a attribute will have the setting be controlled in the same way as any of the settings (like max_idletime).
That would also make strategize a formal class method able to be reflected in the API docs.
| capacity and requests scaling in by that amount, while respecting the | ||
| executor's minimum block limit. | ||
| """ | ||
| executor_state = self.executors[executor.label] |
There was a problem hiding this comment.
I'm with @benclifford on making executor state an argument. The first line of each of these functions is always to lookup the state. So, accessing the state should be pulled out of these classes and used elsewhere.
| else: | ||
| logger.debug("%s Not requesting any blocks, because at maxblocks already", prefix) | ||
|
|
||
| def _case_4b_more_slots_than_tasks(self, executor: BlockProviderExecutor, prefix: str, strategy_type: str) -> None: |
There was a problem hiding this comment.
If this PR's a step towards #4075, let's make it very clear what the arguments to these functions are. Each are slightly different. Some w/ and w/o the strategy_type.
| class Strategy: | ||
| """Scaling strategy. | ||
|
|
||
| As a workflow dag is processed by Parsl, new tasks are added and completed |
There was a problem hiding this comment.
Regarding refactoring the docs, all of this seems unrelated to what a "Strategy" class does and how to use it.
|
I'll try to clarify the direction I had in mind. My current understanding of the long-term goal is that Strategy should eventually become a plugin-like component. In that model I was imagining something like:
One complication is that today
Because of that, my idea was that those parameters would eventually be passed to the constructors of each concrete strategy class instead of being handled centrally. PR #4075 proposes migrating the strategy from the DFK to the So for now, the goal of this PR is only to clean up the existing code without changing behavior. The intention is to make it easier to later introduce the plugin-style strategy classes ( @benclifford mentioned that smaller PRs are easier to review, which is why some changes that I think should eventually happen are not included yet. For example:
self.strategies = {
None: self._strategy_init_only,
"none": self._strategy_init_only,
"simple": self._strategy_simple,
"htex_auto_scale": self._strategy_htex_auto_scale,
}since those functions would become separate strategy classes. There are also a few issues that I plan to address later:
The next PR I was planning is to remove the logic where Strategy owns the list of executors, and instead have After that is done, it should become clearer how the
I agree this makes sense. My initial plan was to do that when implementing the PR that removes the executor list from
Currently The reason is that only HTEX supports a "smart" scale-in, where the block that has been idle the longest (based on
Those comments were mostly useful for helping me understand how the class works internally. I agree they are not particularly strong as API documentation. The only actual mistake in the example is that it is missing It might be clearer if I first submit a PR that moves toward having one executor per strategy object, and then perform the deeper refactoring of the class afterwards. If you prefer, I can follow that order instead. For this PR specifically, I think the improvements I can make are:
All while keeping behavior unchanged. |
Description
This PR refactors
_general_strategyto reduce technical debt and improve readability.Each scaling case (Case 1, 2, 4a, 4b) was extracted into its own helper function, and the initialization logic was consolidated into a dedicated
init_strategyhelper to remove duplication.This is a structural refactor only. The goal is to make the strategy easier to understand. I believe this will also simplify future work if the strategy becomes a parsl module.
Changed Behaviour
No behavior changes. Scaling decisions and thresholds remain exactly the same.
Fixes
N/A
Type of change