Support JAX Runtimes #2442
Comments
/remove-label lifecycle/needs-triage |
Thank you for creating this @Electronic-Waste! We might require two KEPs for every Runtime, since it requires:
|
@andreyvelich SGTM. I'll split it into two separate issues. |
Does "API" mean CRD? SDK API? |
If we need to create a new |
SGTM |
@andreyvelich @tenzen-y Shall we create two GSoC projects for supporting JAX/TF Runtimes? They need two separate KEPs. |
I am not sure we have a sufficient number of slots, since we have already proposed 12 projects. |
@andreyvelich This is my first time mentoring students in GSoC, and I am already the primary mentor for 2 projects and a backup mentor for some others. I'm not sure whether I could handle 3 projects... Anyway, let's discuss it in the upcoming WG Training/AutoML Call. |
What you would like to be added?
Part of: #2170
As planned in the Kubeflow Trainer V2 API, we should support a JAX runtime after we implement the PyTorch runtime.
The work includes:
ClusterTrainingRuntime for JAX (single-node, multi-node)
/area runtime
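For context on the single-node vs. multi-node distinction, here is a minimal sketch of what a training script might do inside such a runtime. It is not part of the proposal or the Trainer API. The environment variable names (`NUM_NODES`, `NODE_RANK`, `COORDINATOR_HOST`, `COORDINATOR_PORT`) and the default port are hypothetical placeholders; the only real API used is `jax.distributed.initialize`, which takes a coordinator address, the number of processes, and this process's ID.

```python
import os
from typing import Mapping, Optional


def jax_distributed_args(env: Mapping[str, str]) -> Optional[dict]:
    """Build keyword arguments for jax.distributed.initialize from env vars.

    All variable names here are hypothetical; a real runtime would define
    its own. Returns None for the single-node case, where no distributed
    initialization is needed.
    """
    num_nodes = int(env.get("NUM_NODES", "1"))
    if num_nodes <= 1:
        return None
    host = env["COORDINATOR_HOST"]
    port = env.get("COORDINATOR_PORT", "1234")  # placeholder default port
    return {
        "coordinator_address": f"{host}:{port}",
        "num_processes": num_nodes,
        "process_id": int(env["NODE_RANK"]),
    }


if __name__ == "__main__":
    args = jax_distributed_args(os.environ)
    if args is not None:
        # Imported lazily so the helper above can be tested without JAX.
        import jax

        jax.distributed.initialize(**args)
```

In the single-node case the helper returns `None` and the script skips initialization entirely. In the multi-node case every node must agree on the coordinator address and the process count, which is the kind of wiring a runtime would presumably need to provide.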
/cc @kubeflow/wg-training-leads @saileshd1402 @astefanutti @juliusvonkohout @franciscojavierarceo @varodrig @rareddy @thesuperzapper @seanlaii @deepanker13 @helenxie-bit @Doris-xm @truc0 @mahdikhashan
Why is this needed?
This is planned in KEP-2170.
Love this feature?
Give it a 👍 We prioritize the features with most 👍