Don't set scale to zero as default when creating an Endpoint #3062
Hello!
Pull Request overview
Details
As discussed internally, this PR sets the `scale_to_zero_timeout` to `None` by default. It's not entirely clear to me from the discussion whether `min_replica` / `max_replica` are also intended to be set to `None`, though - we can still make those changes.

Also, when someone specifies the `scale_to_zero_timeout`, we can help them out by overriding the `min_replica` to 0, but we can also do nothing, as users will get an error anyway saying that the `minReplica` should not be 1.

I've also added an example of how to create an endpoint that scales to zero, and I quickly tested that running that example with the new `scale_to_zero_timeout` & `min_replica` defaults indeed creates an endpoint that doesn't scale to zero.

cc @Wauplin @Vaibhavs10 @ErikKaum
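
To make the two options above easier to compare, here is a minimal sketch of the defaulting logic. It is not the code in this PR: `build_scaling_payload` and its `override_min_replica` flag are hypothetical names, the replica defaults of 1 are an assumption, and the `maxReplica` / `scaleToZeroTimeout` payload keys are assumed by analogy with the `minReplica` key from the error message.

```python
from typing import Any, Dict, Optional


def build_scaling_payload(
    min_replica: int = 1,  # assumed default, not confirmed by the discussion
    max_replica: int = 1,  # assumed default, not confirmed by the discussion
    scale_to_zero_timeout: Optional[int] = None,
    override_min_replica: bool = False,  # hypothetical flag for the "help them out" option
) -> Dict[str, Any]:
    """Hypothetical helper (not part of huggingface_hub) that builds the scaling config."""
    # Option A: when a timeout is given, force min_replica to 0 on the user's behalf.
    if scale_to_zero_timeout is not None and override_min_replica:
        min_replica = 0

    scaling: Dict[str, Any] = {"minReplica": min_replica, "maxReplica": max_replica}

    # With the new default of None, the timeout key is left out entirely,
    # so the endpoint is not asked to scale to zero.
    if scale_to_zero_timeout is not None:
        scaling["scaleToZeroTimeout"] = scale_to_zero_timeout
    return scaling


if __name__ == "__main__":
    print(build_scaling_payload())
    print(build_scaling_payload(scale_to_zero_timeout=15, override_min_replica=True))
```

With `override_min_replica=False` the helper passes `min_replica` through unchanged, which corresponds to the "do nothing and let the API return the error" option.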