KEP #5309: first draft of Self-Orchestrating Pod KEP #5351
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: SergeyKanzhelev The full list of commands accepted by this bot can be found here.
the specified container in the Pod.
- Declare the communication protocol between the kubelet and a container
  that is versioned and extensible.
- Declare enough primitives to satisfy two scenarios:
Is the goal of this KEP limited to terminating or restarting containers? If so, wouldn't sharing the PID namespace be sufficient to achieve that?
- Reduced API Server Load: Workloads manage their own supporting containers
  without frequent API Server interactions.
- Fine-Grained Workload Control: Pods can create and terminate sub-containers
The use cases below don't talk about creating containers, only about terminating and restarting them.
Creation becomes much trickier, I think. Do we need to include it?
and lifecycle management. However, certain advanced use cases require
self-managed, dynamic pod orchestration within a node while minimizing direct
API Server interactions. This KEP proposes Self-Orchestrating Pods (SOPs), a
mechanism that allows a pod to create, manage, and terminate its
The use cases below don't talk about "create".
The KEP introduces the communication channel between kubelet and a container in
a Pod, which may be extended to a lot of other scenarios in future.
I do not fully understand this KEP, maybe because it requires some context. Why is a channel with the kubelet required if the Pod is self-orchestrated? Is it the process in the pod that instructs the kubelet to create a container in its own pod? Why doesn't it run new processes instead of containers?
- Reduced API Server Load: Workloads manage their own supporting containers
  without frequent API Server interactions.
I find it odd that the workload can work in isolation, without having to connect back to the API server to get state.
- Declare enough primitives to satisfy two scenarios:
  - Sidecar to be able to terminate the main container in the Pod effectively
    stopping the Job execution.
  - Sidecar to be able to restart the main container and receive a signal that it was
    restarted. This will allow in-place restart of a single Pod of a large training job to
    restart the job from the last checkpoint.
Can't this be done today? A process can kill another process within the container, right?
## Proposal

The overall idea will be to expose the gRPC endpoint from the container
and declare it in the container spec. Kubelet will connect to this endpoint
Is this guaranteed? Do these pods use pod networking, or only host networking? What happens with more complex runtimes like Kata or gVisor?
I see the "Error handling" section also touches on this.
ports:
- containerPort: 50051
podManagement:
  port: 50051
everyone will be able to connect to this port, right?
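For readers following the `podManagement` stanza quoted above, here is a minimal Python sketch of how a consumer might resolve the management port from a container spec. The function name `management_port`, the container name, and the rule that the port must also be listed under `ports` are assumptions made for illustration; the KEP draft does not state them.

```python
def management_port(container: dict) -> int:
    """Return the port declared under podManagement, or raise ValueError."""
    stanza = container.get("podManagement")
    if stanza is None:
        raise ValueError("container does not declare podManagement")
    port = stanza["port"]
    declared = {p["containerPort"] for p in container.get("ports", [])}
    # Assumption (not from the KEP): the management port must also be
    # listed under ports, as it is in the quoted snippet.
    if port not in declared:
        raise ValueError(f"port {port} is not listed under ports")
    return port


# Mirrors the quoted YAML; the container name is hypothetical.
sidecar = {
    "name": "orchestrator",
    "ports": [{"containerPort": 50051}],
    "podManagement": {"port": 50051},
}
port = management_port(sidecar)
```

The sketch only resolves the port. It does nothing to restrict who may connect to it, which is the concern raised in the comment above.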
// Response for command stream.
message CommandResponse {
  oneof commandResponse {
    TerminateContainerCommandResponse terminate_container_response = 1;
stderr + stdout?
1. The sidecar container orchestrates the job. Job is a heavy process requiring
   special GPU hardware connected with other Pod.
2. The sidecar receives the signal that the job should be abruptly terminated
   and started from the beginning.
3. Instead of terminating the whole Pod, sidecar issues a command to kubelet to
   restart a specific container.
4. Kubelet will report back when the container is restarted.
5. Sidecar may need to keep other sidecar containers running or have them also
   be restarted, depending on the function of that sidecar container. Ordering
   of requests to the kubelet to restart things will be a sidecar decision.
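Steps 3 to 5 above can be sketched as a small in-memory simulation. This is not the proposed gRPC API: `FakeKubelet`, `restart_container`, `restart_job`, the report fields, and the container names are all hypothetical, and stopping at the first failed restart is an assumption about what a sidecar might choose to do.

```python
from dataclasses import dataclass, field


@dataclass
class FakeKubelet:
    """In-memory stand-in for the kubelet side of the channel (hypothetical)."""
    restart_counts: dict = field(default_factory=dict)

    def restart_container(self, name: str) -> dict:
        # Step 3: restart only the named container, not the whole Pod.
        if name not in self.restart_counts:
            return {"container": name, "restarted": False,
                    "error": "unknown container"}
        self.restart_counts[name] += 1
        # Step 4: report back once the container has been restarted.
        return {"container": name, "restarted": True, "error": ""}


def restart_job(kubelet: FakeKubelet, order: list) -> list:
    """Step 5: the sidecar picks the order; stop at the first failure."""
    reports = []
    for name in order:
        report = kubelet.restart_container(name)
        reports.append(report)
        if not report["restarted"]:
            break
    return reports


# The orchestrating sidecar keeps itself running and restarts the
# other two containers in an order of its own choosing.
kubelet = FakeKubelet(
    restart_counts={"trainer": 0, "log-shipper": 0, "orchestrator": 0}
)
reports = restart_job(kubelet, ["log-shipper", "trainer"])
```

Keeping the ordering in `restart_job` rather than in the kubelet stand-in mirrors the statement that ordering is a sidecar decision.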
We run containers in Pods today. Why isn't it an alternative for the Pod to run Docker-in-Docker or something similar and handle the entire container lifecycle itself, instead of bringing this back to the kubelet?
Self-orchestrating Pod is a proposed new concept. This is a first draft of the KEP
/sig node