Google Summer of Code 2025 - Please Submit Your Project Ideas #809
Comments
@rareddy are you thinking of having this content on the website? Otherwise I'd recommend creating the issue in the community repository. |
/transfer community |
@rareddy, I would like to be a mentor and am interested in the spark-operator project. I will share my ideas/project proposals soon. |
@rareddy Hi, I was a GSoC student last year, and I am also a new maintainer in Kubeflow. Having benefited a lot from the GSoC project, I would like to apply to be a mentor for training-operator and katib this year. I'm sure it will help more new contributors like me get engaged in Kubeflow and the world of open source. I don't have project ideas yet, but I'll put them here once they are ready. cc @kubeflow/wg-automl-leads @kubeflow/wg-training-leads |
@rareddy here are my rough ideas
Project Idea 1
Project Title: multi-tenancy/security, Kubeflow repository Migration, and CI/CD Enhancement
Detailed Description:
Expected Outcomes:
Skills Required/Preferred:
Possible Mentors:
Component: Kubeflow
Difficulty: Medium
Expected Size of the Project: 175/350 hours
Project Idea 2
Project Title: Maintain Kserve Models Web Application such that we have a proper UI for model serving.
Detailed Description:
Expected Outcomes:
Skills Required/Preferred:
Possible Mentors:
Component: Kserve UI
Difficulty: Medium
Expected Size of the Project: 175+ hours |
@Ramesh I updated project idea one above for multi-tenancy, and seaweedfs
|
I'm a new contributor to the project, trying to join the meetings and contribute to components. I'd also like to apply as a mentee. Over the past few weeks @Electronic-Waste has been very helpful, and I would appreciate the opportunity to have them as my mentor, as I believe I can learn a lot from their experience. |
@kimwnasptd what do you think about kubeflow/manifests#2676 (comment)
Project Idea 3
Project Title: Make our service mesh rootless by default and provide secure model inference by default.
Detailed Description:
Expected Outcomes:
Skills Required/Preferred:
Possible Mentors:
Component: Kubeflow
Difficulty: small |
@juliusvonkohout I'm super interested in this! Although I'd suggest we be a bit careful when defining the scope, so that we don't end up shooting ourselves in the foot by committing to something that might not be fully in our hands to implement. Some gray areas I have in mind that worry me a bit:
In any case though, happy to be a mentor for that one and we can figure out a scope that could fit this effort. I'll also be looking at it with the Canonical team, so I'll have cycles to help on this as well. |
I changed it to |
Interested in mentoring
Ramesh Oswal git-Id: rameshoswal |
@RameshOswal one of the requirements I am placing on mentors is that you work with an existing lead on a component to come up with a project that you can mentor. So please propose a project idea. |
Anyone who is looking for a junior developer and has enough time to mentor, please reach out. I am eager to develop my skills and gain experience. Feel free to connect with me on LinkedIn. Thank you! |
@Girma35 you will probably reach us better via the Kubeflow Slack channels https://www.kubeflow.org/docs/about/community/. There are also plenty of "good first issues" marked here https://github.com/kubeflow/manifests/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22 and you might want to dive into some of the project ideas listed in this thread. |
@rareddy here's our proposal
Title: Enhancing Kubeflow Data Science Experience with Jupyter-Centric Tools
Description & Outcomes
This project is dedicated to elevating the data science workflow within Kubeflow by creating a suite of tools and extensions that deliver a seamless, Jupyter-native user experience. The work begins with reviving and modernizing the existing Kale tool – ensuring compatibility with the latest Jupyter and Kubeflow Pipelines (KFP v2) standards – and then, optionally and depending on candidate progress, extends those capabilities by introducing new functionality. These enhancements include interactive pipeline visualizations and additional integrations, all designed to simplify and streamline the development user experience within Kubeflow.
Phase 1: Reviving and Modernizing Kale
Phase 2: (optional) Extending Capabilities with Jupyter-Centric Enhancements
Phase 3: Consolidation and Final Documentation
Additional Notes:
Skills Required
Mentors:
Component: Kubeflow IDE / Kubeflow Jupyter Extension (IDE Working Group)
Difficulty: Medium
GitHub Issues References:
Expected Size of the Project: 350 Hours |
Project title/description: Enhancing Kubeflow with Batch Processing Gateway for Scalable and Efficient Multi-Cluster Spark Job Management
Detailed description of the project: Managing Apache Spark jobs in a multi-cluster Kubernetes environment presents several challenges, including inefficient workload distribution, performance bottlenecks, and debugging complexities. Kubeflow users working with large-scale data processing and machine learning workflows require a streamlined, cloud-native experience that optimizes Spark job execution, monitoring, and debugging.
Key Objectives:
Mentors expect a GSoC project execution plan, broken down by phase, with the weekly milestones the student wants to achieve.
Expected Outcomes:
Skills Required/Preferred: Curiosity to learn, design, and develop; having the skill sets below is a plus:
Possible Mentors:
Component:
Difficulty:
GitHub Issue:
Expected Project Size: 300+ hours
By undertaking this project, contributors will play a crucial role in enhancing the performance, efficiency, and usability of Spark applications within the Kubeflow ecosystem, thereby addressing current challenges in multi-cluster job management.
Why This Project?
By integrating Batch Processing Gateway with Kubeflow Notebooks, this project provides a cloud-native, scalable, and user-friendly solution for Spark job execution and debugging. It will significantly improve performance, efficiency, and developer experience, enabling ML practitioners and data engineers to focus on experimentation and optimization without struggling with job management complexities. Batch Processing Gateway can help in enterprise environments, where managing Apache Spark jobs across multiple Kubernetes clusters presents challenges such as inefficient job distribution, performance bottlenecks, and complex workload balancing. Integrating the Batch Processing Gateway (BPG) into Kubeflow aims to address these issues by providing a centralized mechanism for submitting, monitoring, and managing Spark applications across various clusters. |
@rareddy I'd love to mentor, especially around manifests. |
@rareddy Here are draft ideas from Training and AutoML WG.
Project 1
Title: Enable GPU Testing for LLM Blueprints in Kubeflow Trainer
Tracking issue: kubeflow/trainer#2432
Skills required: GitHub Actions, Kubernetes, PyTorch, Python
Difficulty: hard
Length: 350 hrs
Mentors: TBD
Component: Kubeflow Trainer
Project 2
Title: Support JAX and/or TensorFlow Training Runtimes in Kubeflow Trainer
Tracking issue: TBD
Skills required: Go, Kubernetes, JAX, TensorFlow
Difficulty: medium
Length: 350 hrs
Mentors: TBD
Component: Kubeflow Trainer
Project 3
Title: Support Kubernetes Sidecars for Katib Metrics Collectors
Detailed Description
Katib implements Pull-based Metrics Collector as a sidecar container to collect training metrics from the Trials once training is complete. However, the Pull-based Metrics Collector has some problems. For example, the Trial will fail if the training container is finished before Metrics Collector is started.
Expected Outcomes
Successfully integrate Katib Metrics Collectors with Kubernetes Sidecars
Provide some unit tests and e2e tests
Skills Required/Preferred: Kubernetes, Go, YAML, Python
Possible Mentors: @Electronic-Waste, TBD
Component: Kubeflow Katib
Difficulty: Medium
Expected Size of the Project: 350 hours
cc @Electronic-Waste @kubeflow/wg-training-leads |
@rareddy Here are more possible ideas for Training and AutoML WG
Project 4
Title: Export Fine-Tuned LLM to Model Registry
Tracking issue: TBD
Detailed Description: Trainer has implemented initializers for model and dataset, and will support model exporter in the future. By supporting the model registry as one of the destinations of the exporter, Trainer will integrate with Kubeflow ecosystem more deeply.
Skills required: Kubernetes, Go, YAML, Python
Difficulty: hard
Length: 350 hrs
Mentors: TBD
Component: Kubeflow Trainer / Model Registry
Project 5
Title: Add Volcano scheduler support in Trainer
Tracking issue: TBD
Detailed Description: Currently, Trainer does not support Volcano for scheduling. Since Volcano is a widely adopted scheduler for AI workloads, it could provide Trainer with more AI-specific scheduling capabilities if we integrate Volcano into Trainer.
Skills required: Kubernetes, Go, Volcano
Difficulty: hard
Length: 350 hrs
Mentors: TBD
Component: Kubeflow Trainer
cc @kubeflow/wg-training-leads @kubeflow/wg-data-leads |
Yes, it could mean upstreaming the various Helm chart projects into Kubeflow/manifests. That alone would be a 175+ hour project. |
@juliusvonkohout what if we keep it to Ambient, to not make it too broad, but focus it on updating the components owned by Kubeflow to work with Ambient Mesh:
?? |
I will try to incorporate that into the PR, but we should keep istio-cni as the simpler target as well. How far we get is another question. We can also extend later if needed. |
Here is the PR for anyone interested: kubeflow/website#3991 |
I think we should remove the helm chart project for a few reasons:
NOTE: Because I know people love the idea of a helm chart, I want to highlight that anyone is welcome to create a Kubeflow distribution, and if people like your project they will use it. This is just like how Kubernetes has many distributions which approach the problem in different ways. |
@anencore94 It sounds like that project is better suited to the KServe organization? Are you guys participating in GSoC? |
I thought KServe was under the Kubeflow org even though the GitHub organization is separate. Isn't it anymore? KServe is listed as part of the Kubeflow components, as I can see here: https://github.com/kubeflow/kubeflow?tab=readme-ov-file#kubeflow-components , https://www.kubeflow.org/docs/started/architecture/#kubeflow-ecosystem @thesuperzapper |
@anencore94 KServe currently has its own governance, and is separately owned by the LFAI, rather than the CNCF. There are discussions about them coming back under Kubeflow, but I'm not sure what the status of that is. That diagram is technically not correct; they are an "external add-on" (see sidebar). There has always been lots of debate around what to call them, but from a governance perspective, KServe is not controlled by the Kubeflow Steering Committee (KSC), so it is not officially "part" of Kubeflow. |
Thanks for the clarification, @thesuperzapper! I understand that KServe operates independently under LFAI governance. Since my proposal mainly focused on integrating KServe with MLflow, I believe it might be more appropriate to submit it under the KServe GSoC (if they are participating officially). I'll hide my previous comment. Thanks. @rareddy
|
KServe is not participating in GSoC. We are working on moving from LF AI & Data to CNCF. See cncf/toc#1367 |
@thesuperzapper Let's avoid any confusion here please. I've already discussed this with @yuzisun and @johnugeorge in the past. We've also been discussing the removal of the "external add-ons" concept from the Kubeflow website to prevent confusion when users ask, As @anencore94 correctly pointed out, we put KServe in this diagram: https://www.kubeflow.org/docs/started/introduction/#kubeflow-overview-diagram and also in the Kubeflow Architecture diagram as part of Kubeflow Components: https://www.kubeflow.org/docs/started/architecture/#kubeflow-ecosystem And I don't see any reason to change it unless the broader Kubeflow/KServe community or @kubeflow/kubeflow-steering-committee @juliusvonkohout @franciscojavierarceo have a different opinion here. In the case of GSoC, I am fine with not including KServe if we don't want to. |
Hello Kubeflow Community,
Kubeflow has officially applied to be part of Google Summer of Code 2025. We hope to be selected as one of the participating organizations this year. We had a very successful GSoC 2024. I thank all the Mentors and Students for making it successful.
I am looking for
Signing up Mentors. If you would like to be a mentor this season, please reach out to me on CNCF Slack or leave your Name and git ID here. I will invite all the 2024 mentors by default, so please use this form only if you are a new mentor.
Project Ideas for Kubeflow components. If you submit an idea, you must be a mentor (Kubeflow member and/or committer); if you are a student submitting a proposal, it must be approved by one of the component leads or a committer on that component.
Make sure your Project Ideas include ALL of the required information below:
Leave a single comment here for each proposal. As we approve them, we will collate all the projects at https://www.kubeflow.org/events/gsoc-2025/. If you have questions, please come and ask in CNCF Slack #kubeflow-gsoc-participants.
All mentors MUST be assigned to one or more projects.
The deadline to submit all the ideas is Feb 15th, 2025. So, please take some time and submit them. Thank you all for your support.