Deployment scale refactor #4212
* also start separating total instance count from current scale target
expect(up_scaler.finished_scaling?).to be false
end

# TODO: Currently, deployments transition to COMPLETED even if there are some instances that have not started and become routable
I was able to reproduce this by adding the following to Dora's healthcheck endpoint:
if ENV['CF_INSTANCE_INDEX'].to_i == 3
  status 500
  return
end
Given 6 instances and a max-in-flight of 2, the deployment completes before that instance has started or become routable. I will fix this in a future PR.
@@ -1,4 +1,5 @@
require 'prometheus/client'
require 'prometheus/client/data_stores/direct_file_store'
This was so I could use the lightweight_spec_helper for some tests.
41df631 to 8de2953
end

def desired_non_deploying_instances
  [target_total_instance_count - @routable_instance_count, 0].max
nit: Should we be consistent about direct access vs. the attribute reader here?
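To make the nit concrete, here is a minimal sketch of the same calculation using attribute readers for both operands. The class name `ScaleDownSketch` and its constructor are hypothetical stand-ins for illustration, not the class under review; only the `[a - b, 0].max` clamp comes from the diff.

```ruby
# Hypothetical stand-in for the scaler in this PR; not the real class.
class ScaleDownSketch
  attr_reader :target_total_instance_count, :routable_instance_count

  def initialize(target_total_instance_count:, routable_instance_count:)
    @target_total_instance_count = target_total_instance_count
    @routable_instance_count = routable_instance_count
  end

  # Number of old (non-deploying) instances still wanted.
  # Both operands go through readers; the result is clamped at zero.
  def desired_non_deploying_instances
    [target_total_instance_count - routable_instance_count, 0].max
  end
end
```

With a target of 6 and 2 routable new instances this yields 4, and it never goes negative if the routable count overshoots the target.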
def scale_down_old_processes
  instances_to_reduce = non_deploying_web_processes.map(&:instances).sum - desired_non_deploying_instances
  return true if up_scaler.finished_scaling?
question: Can we return true here if the downscaler is still doing its thing?
Yes, I think so. In rolling deployments, the updater will call Finalize after this returns true, deleting the old web processes anyway. In other words, once all instances are scaled up for the new web process, we can just wrap things up.
That's the logic I preserved in the refactor, anyway. I noted a few bugs I found with it. We'll also have to think about whether that still makes sense for Canary deployments, where this will return true on each step. I suppose the question is: does a Canary deployment transition to PAUSED when all Canary instances are brought up, or do we also have to wait for old instances to be brought down?
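The early return discussed above can be sketched roughly as follows. `OldProcess` and `scale_down_step` are hypothetical names for illustration; only the surplus arithmetic and the "return true once the up-scaler has finished" rule are taken from the diff, and the rest is an assumption.

```ruby
# Hypothetical stand-in for an old (non-deploying) web process.
OldProcess = Struct.new(:instances)

# Illustrative only: reports whether the step counts as done and how many
# old instances were still surplus at that moment.
def scale_down_step(old_processes, desired_non_deploying_instances, up_scaler_finished:)
  surplus = old_processes.sum(&:instances) - desired_non_deploying_instances
  # Mirrors `return true if up_scaler.finished_scaling?`: the step is
  # treated as done even when old instances remain, because finalization
  # is expected to delete the old web processes afterwards.
  return { done: true, surplus: surplus } if up_scaler_finished

  { done: false, surplus: surplus }
end
```

For example, with old processes holding 4 and 2 instances and a desired count of 3, the surplus is 3 in both cases, but the step is only reported as done when the up-scaler has finished.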
@@ -105,6 +106,17 @@ def crashed_instances_for_app(process)
  raise CloudController::Errors::InstancesUnavailable.new(e)
end

def instance_count_summary(process)
praise: Big fan of pushing this out of the updater and into the reporter.
* also move logic into instance_reporters
8de2953 to dce0cbb
Changes in cloud_controller_ng:
* Deployment scale refactor
  PR: cloudfoundry/cloud_controller_ng#4212
  Author: Seth Boyles <[email protected]>
Thanks for contributing to cloud_controller_ng. To speed up the process of reviewing your pull request please provide us with:
* A short explanation of the proposed change:
* An explanation of the use cases your change solves
* Links to any other associated PRs
* I have reviewed the contributing guide
* I have viewed, signed, and submitted the Contributor License Agreement
* I have made this pull request to the main branch
* I have run all the unit tests using bundle exec rake
* I have run CF Acceptance Tests