Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add internal-cert-controller disable flag #2426

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Garvit-77
Copy link

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, in Fixes #<issue number>, #<issue number>, ... format, will close the issue(s) when PR gets merged):
Fixes #2049

Signed-off-by: Garvit-77 <[email protected]>
controllerv2 "github.com/kubeflow/training-operator/pkg/controller.v2"
runtime "github.com/kubeflow/training-operator/pkg/runtime.v2"
runtimecore "github.com/kubeflow/training-operator/pkg/runtime.v2/core"
webhookv2 "github.com/kubeflow/training-operator/pkg/webhook.v2"
)

const (
webhookConfigurationName = "validator.training-operator-v2.kubeflow.org"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a small confusion should we use webhookConfigurationName instead of namespace(as of V1) ?

WebhookConfigurationName: webhookConfigurationName,
}, certsReady); err != nil {
setupLog.Error(err, "unable to set up cert rotation")
if err := createCertificate(mgr.GetClient(), namespace, webhookSecretName, webhookServiceName); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want this to be configurable, so cert-manager is an option, not a requirement?

My understanding is we still want it possible to install Kubeflow trainer without cert-manager.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cert-manager is optional. We should not create resources automatically in manager.
We just add Certifications by manifests, not by automatic mechanism like current impl.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tenzen-y can you please brief if any changes are required in current PR ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Prepare the flag whether or not the controller manager launch internal cert controller.
  2. Add CertManager installation manifests like https://github.com/kubeflow/katib/tree/master/manifests/v1beta1/installs/katib-cert-manager

Signed-off-by: Garvit-77 <[email protected]>
@google-oss-prow google-oss-prow bot added size/L and removed size/M labels Feb 12, 2025
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign tenzen-y for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Garvit-77
Copy link
Author

@tenzen-y
I have made the changes accordingly so please review them,
thank you

WebhookConfigurationName: webhookConfigurationName,
}, certsReady); err != nil {
setupLog.Error(err, "unable to set up cert rotation")
if err := createCertificate(mgr.GetClient(), namespace, webhookSecretName, webhookServiceName); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned the previous comment, we should not remove and replace the existing CertManagement mechanism. We just provide the command flag whether or not manager uses internal-certmanagement mechanism.

  1. Add a flag --enable-internal-cert (default true)
  2. If the flag is false, manager does not perform cert.ManageCerts.
  3. Add installation manifests with CertManager

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tenzen-y yes i will revert the code back to cert generation by controller

Comment on lines +158 to +185
// Cert generation Using cert-manager
func createCertificate(client client.Client, namespace, secretName, serviceName string) error {
cert := &cmapi.Certificate{
ObjectMeta: metav1.ObjectMeta{
Name: secretName,
Namespace: namespace,
},
Spec: cmapi.CertificateSpec{
SecretName: secretName,
IssuerRef: cmmeta.ObjectReference{
Name: "letsencrypt-prod",
Kind: "ClusterIssuer",
},
CommonName: serviceName,
DNSNames: []string{serviceName},
Usages: []cmapi.KeyUsage{
cmapi.UsageServerAuth,
cmapi.UsageClientAuth,
},
},
}

if err := client.Create(context.Background(), cert); err != nil {
return fmt.Errorf("failed to create certificate: %v", err)
}
return nil
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Cert generation Using cert-manager
func createCertificate(client client.Client, namespace, secretName, serviceName string) error {
cert := &cmapi.Certificate{
ObjectMeta: metav1.ObjectMeta{
Name: secretName,
Namespace: namespace,
},
Spec: cmapi.CertificateSpec{
SecretName: secretName,
IssuerRef: cmmeta.ObjectReference{
Name: "letsencrypt-prod",
Kind: "ClusterIssuer",
},
CommonName: serviceName,
DNSNames: []string{serviceName},
Usages: []cmapi.KeyUsage{
cmapi.UsageServerAuth,
cmapi.UsageClientAuth,
},
},
}
if err := client.Create(context.Background(), cert); err != nil {
return fmt.Errorf("failed to create certificate: %v", err)
}
return nil
}

WebhookConfigurationName: webhookConfigurationName,
}, certsReady); err != nil {
setupLog.Error(err, "unable to set up cert rotation")
if err := createCertificate(mgr.GetClient(), namespace, webhookSecretName, webhookServiceName); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if err := createCertificate(mgr.GetClient(), namespace, webhookSecretName, webhookServiceName); err != nil {

Copy link
Member

@andreyvelich andreyvelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering do we really need to have cert-manager installation ?
Can we re-use OPA cert-manager rotator to give user flexibility to control certificate or cert-manager will give us additional features ?
cc @kubeflow/release-team @kubeflow/wg-manifests-leads

@tenzen-y
Copy link
Member

I am wondering do we really need to have cert-manager installation ? Can we re-use OPA cert-manager rotator to give user flexibility to control certificate or cert-manager will give us additional features ? cc @kubeflow/release-team @kubeflow/wg-manifests-leads

I do not have a strong request for CertManager support. One thing is productionizing trainer since they sometimes want to use the certified certifications (OPA internal-certs generates self-signed certs).

As an alternative and minimized solution, we could consider supporting only disabling the OPA internal-cert manager and then giving them to specify arbitrary certifications.

@andreyvelich
Copy link
Member

OPA internal-certs generates self-signed certs

Does OPA support only self-signed certs @tenzen-y ?

As an alternative and minimized solution, we could consider supporting only disabling the OPA internal-cert manager and then giving them to specify arbitrary certifications.

Yeah, this is what I was thinking as well. Like do we really need to explain cluster admins on how to configure cert manager to generate certs for Kubeflow Trainer webhook?

@andreyvelich
Copy link
Member

FYI, as I can see even with Cert Manager right now we use self-signed certs by default:
https://github.com/kubeflow/manifests/blob/master/apps/katib/upstream/installs/katib-cert-manager/certificate.yaml#L22

@tenzen-y
Copy link
Member

OPA internal-certs generates self-signed certs

Does OPA support only self-signed certs @tenzen-y ?

Yes, that is OPA internal-certs objective.

As an alternative and minimized solution, we could consider supporting only disabling the OPA internal-cert manager and then giving them to specify arbitrary certifications.

Yeah, this is what I was thinking as well. Like do we really need to explain cluster admins on how to configure cert manager to generate certs for Kubeflow Trainer webhook?

Documentation might be better. @astefanutti Do you see any use cases where customers want to use certified certifications for admission webhook controllers?

@astefanutti
Copy link
Contributor

Do you see any use cases where customers want to use certified certifications for admission webhook controllers?

No, I haven't personally seen any customer requests / requirements to have full-fledged certificate management / PKI for admission webhooks.

cert-manager is more driven by the need to integrate with trusted certificate authority like let's encrypt for external ingress / gateway, not for in-cluster communication between control plane components.

Now I can understand that if cert-manager is already deployed, it can be used to manage internal certificates as well.
But in that case, I'd also lean toward adding a CLI flag to disable cert-controller and document an external solution should be provided, such as cert-manager for example.

@andreyvelich
Copy link
Member

cc @kubeflow/release-team @kubeflow/kubeflow-steering-committee @juliusvonkohout @franciscojavierarceo
It looks like we don't really need to have cert-manager as a hard dependency for Kubeflow projects.
So we can simplify Kubeflow control plane complexity: kubeflow/manifests#2451
cc @brsolomon-deloitte @jbottum

@tenzen-y
Copy link
Member

Do you see any use cases where customers want to use certified certifications for admission webhook controllers?

No, I haven't personally seen any customer requests / requirements to have full-fledged certificate management / PKI for admission webhooks.

cert-manager is more driven by the need to integrate with trusted certificate authority like let's encrypt for external ingress / gateway, not for in-cluster communication between control plane components.

Now I can understand that if cert-manager is already deployed, it can be used to manage internal certificates as well. But in that case, I'd also lean toward adding a CLI flag to disable cert-controller and document an external solution should be provided, such as cert-manager for example.

Thank you for getting back feedback! In that case, it would be better to add only flag to disable internal cert controller and then enhance our documentations how to use arbitrary certificates for the admission webhook controllers.

@tenzen-y
Copy link
Member

Let me summarize for @Garvit-77: we decided not to support CertManger. So, if you are ok, could you convert this PR to just add a flag to disable and enable the internal cert controller?

@juliusvonkohout
Copy link
Member

cc @kubeflow/release-team @kubeflow/kubeflow-steering-committee @juliusvonkohout @franciscojavierarceo It looks like we don't really need to have cert-manager as a hard dependency for Kubeflow projects. So we can simplify Kubeflow control plane complexity: kubeflow/manifests#2451 cc @brsolomon-deloitte @jbottum

There is also KFP and maybe others. But yes, we can remove it as soon as no one is using it anymore.

@Garvit-77
Copy link
Author

@tenzen-y Surely , i did understood the conversation except about the OPA
and I would be align with the community
So just making the changes for a Flag in the Updated PR

@andreyvelich
Copy link
Member

There is also KFP and maybe others. But yes, we can remove it as soon as no one is using it anymore.

If KFP uses it in the same way that other Kubeflow projects is using, we can easily remove this dependency.
@kubeflow/wg-pipeline-leads @rimolive @anishasthana @HumairAK @hbelmiro @mprahl do you know how cert-manager is currently used in KFP ?

@juliusvonkohout
Copy link
Member

There is also KFP and maybe others. But yes, we can remove it as soon as no one is using it anymore.

If KFP uses it in the same way that other Kubeflow projects is using, we can easily remove this dependency. @kubeflow/wg-pipeline-leads @rimolive @anishasthana @HumairAK @hbelmiro @mprahl do you know how cert-manager is currently used in KFP ?

Signed certificates for webhooks are not a bad thing.

When to Use cert-manager:

  1. Automatic Certificate Management: If your webhooks require TLS and you want to automate the issuance and renewal of certificates, cert-manager can simplify this process significantly.
  2. Security: Using TLS for webhooks enhances security by encrypting the traffic between your services.
  3. Dynamic Environments: In environments where services are frequently created and destroyed, cert-manager can help manage certificates dynamically.

@juliusvonkohout
Copy link
Member

cc @thesuperzapper

@tenzen-y
Copy link
Member

/retitle add internal-cert-controller disable flag

@google-oss-prow google-oss-prow bot changed the title certManager add internal-cert-controller disable flag Feb 17, 2025
@tenzen-y
Copy link
Member

/retitle Add internal-cert-controller disable flag

@google-oss-prow google-oss-prow bot changed the title add internal-cert-controller disable flag Add internal-cert-controller disable flag Feb 17, 2025
@tenzen-y
Copy link
Member

tenzen-y commented Feb 17, 2025

@juliusvonkohout @andreyvelich Should we prepare the dedicated issue to discuss CertManager entirely KF?
PR contributors might be confusing with that.

@andreyvelich
Copy link
Member

I agree with @tenzen-y, I will create dedicated issue in kubeflow/manifests to discuss removal for cert-manager from the Kubeflow Manifests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support CertManager for the Webhook cert generation
5 participants