Skip to content

feat: multi cluster support for the listener #240

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 28 commits into
base: main
Choose a base branch
from

Conversation

vertex451
Copy link
Member

@vertex451 vertex451 commented May 30, 2025

Resolves #149

Changes

Gateway

  1. Refactored gateway/manager and took out its parts into separate packages:
  2. At some point I realized that we have map for GraphQL schemas and map for resolvers/runtime-clients and they are coupled due to relation to a single cluster.
    And I migrated to a domain driven design - introduced targetcluster package that represents a single k8s cluster. So now, instead of having several maps, we have one map of clusters, where each single cluster holds its related schema, resolvers and runtime clients.

Listener

  1. Refactored code to divide between direct KCP kubeconfig and clusteraccess using Strategy pattern
  2. Used golang-commons lifecycle package for Reconcilers

vertex451 added 5 commits May 30, 2025 17:39
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
vertex451 added 19 commits June 9, 2025 19:38
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
Copy link
Member

@aaronschweig aaronschweig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments in the PR.

In general I think the PR is already in quite a good state. What I can see though is a diversion from the usual pattern in which we write operators.

A few things to note:

  • to adhere to best practices the usage of our lifecycle package in the golang-commons library is encouraged
    Artem: Done
  • Keep the Reconcilers straight. I am not sold completely on the introduction of the strategies pattern. It feels a bit too much, for something that is determined once on startup. I could understand the introduction of such a pattern a bit more, if we would have on reconciler potentially working on multiple different patterns/inputs. But since we have a KCP reconciler, which is dedicated to performing specific KCP tasks, a ClusterAccess resource reconciler which is able to reconcile ClusterAccesses and non of them are managed by a single reconciler (which would not be ideal anyway) the strategies feel a bit like an overengineered solution for a simple configuration switch in the startup period of the operator
    Artem: Replaced strategy patter with conditional approach
  • I do think it is a good idea, to abstract a cluster layer, that handles a bit the logic of getting a client and the necessary information to gateway to a cluster
  • I didn't look into the tests yet, since you mentioned there were still WIP
  • A small wish from mine: Don't code it like Java. Whilst I sometimes see the benefits of introducing design patterns, I want to avoid to have too many indirections or abstractions where not absolutely necessary. Within the framework of controller-runtime and the combination with the golang-commons library there already are a lot of useful abstractions that help us build mantainable operators, so maybe looking into them, also the pattern of subroutines (term used to encapsulate a specific portion of reconcilation logic) could help when trying to find the right way to abstract/implement the logic needed.
    Artem: Appreciate your feedback, will keep in mind the idea of following the idiomatic Golang way

Comment on lines 80 to 95
tmpFile, err := os.CreateTemp("", "kubeconfig-*.yaml")
if err != nil {
return nil, fmt.Errorf("failed to create temporary kubeconfig file: %w", err)
}
defer os.Remove(tmpFile.Name())

if _, err := tmpFile.Write(kubeconfigData); err != nil {
tmpFile.Close()
return nil, fmt.Errorf("failed to write kubeconfig to temporary file: %w", err)
}
tmpFile.Close()

// Build config from kubeconfig
config, err := clientcmd.BuildConfigFromFlags("", tmpFile.Name())
if err != nil {
return nil, fmt.Errorf("failed to build config from kubeconfig: %w", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there should be another option to parse the necessary information from the auth metadata. If it is structured, there probably is the correct struct already hidden somewhere, without the need to use a temporary file

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed temporary file usage.

if tc.metadata != nil && tc.metadata.Path != "" {
path = tc.metadata.Path
}
return fmt.Sprintf("http://localhost:%s/%s/graphql", appCfg.Gateway.Port, path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is only valid for locally running services. If this GetEndpoint is needed for something else than logging, this is not the correct approach

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now it logs new ednpoint with localhost only in case of local_development=true.
Otherwise it logs /workspace/graphql

}

if metadata.Auth.Kubeconfig == "" {
return nil, fmt.Errorf("kubeconfig data is empty")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no interpolation made, so errors.New is sufficient

Comment on lines 3 to 21
// ClusterMetadata represents cluster connection information embedded in schema files
// Simplified for standard Kubernetes clusters with kubeconfig authentication
type ClusterMetadata struct {
Host string `json:"host"`
Path string `json:"path"`
Auth *AuthMetadata `json:"auth,omitempty"`
}

// AuthMetadata contains kubeconfig authentication for standard Kubernetes clusters
type AuthMetadata struct {
Type string `json:"type"` // Only "kubeconfig" is supported
Kubeconfig string `json:"kubeconfig"` // base64 encoded kubeconfig
}

// FileData represents the data extracted from a schema file
type FileData struct {
Definitions map[string]interface{} `json:"definitions"`
Metadata *ClusterMetadata `json:"x-cluster-metadata"`
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not the biggest fan of type only files - it makes most sense to colocate them to their usage if possible

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, I don't like the separate type files - please colocate to their usage if possible

Comment on lines 121 to 126
case CRDRegisteredNoObjects:
log.Info().Msg("Using ClusterAccess reconciler - waiting for ClusterAccess objects to be created")
return newClusterAccessReconciler(opts, discoveryInterface, preReconcileFunc, log)
case CRDRegisteredWithObjects:
log.Info().Msg("Using ClusterAccess reconciler")
return newClusterAccessReconciler(opts, discoveryInterface, preReconcileFunc, log)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does seem a bit unnessesary to differentiate between no cluster access resources available and if there are some present, especially since there is no special handling of these cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplified

Comment on lines 137 to 140
if err := k8sClient.List(ctx, clusterAccessList); err != nil {
log.Info().Err(err).Msg("ClusterAccess CRD not registered")
return CRDNotRegistered
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there could be a multitude of failures leading to this error messages, all but one indicating a maybe totally different failure than a missing CRD.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made logging and error's return value more specific

Comment on lines +116 to +318
if in.SecretRef != nil {
in, out := &in.SecretRef, &out.SecretRef
*out = new(SecretRef)
**out = **in
}
if in.ConfigMapRef != nil {
in, out := &in.ConfigMapRef, &out.ConfigMapRef
*out = new(ConfigMapRef)
**out = **in
}
}

func (in *CAConfig) DeepCopy() *CAConfig {
if in == nil {
return nil
}
out := new(CAConfig)
in.DeepCopyInto(out)
return out
}

func (in *AuthConfig) DeepCopyInto(out *AuthConfig) {
*out = *in
if in.SecretRef != nil {
in, out := &in.SecretRef, &out.SecretRef
*out = new(SecretRef)
**out = **in
}
if in.KubeconfigSecretRef != nil {
in, out := &in.KubeconfigSecretRef, &out.KubeconfigSecretRef
*out = new(KubeconfigSecretRef)
**out = **in
}
if in.ClientCertificateRef != nil {
in, out := &in.ClientCertificateRef, &out.ClientCertificateRef
*out = new(ClientCertificateRef)
**out = **in
}
}

func (in *AuthConfig) DeepCopy() *AuthConfig {
if in == nil {
return nil
}
out := new(AuthConfig)
in.DeepCopyInto(out)
return out
}

func (in *SecretRef) DeepCopyInto(out *SecretRef) {
*out = *in
}

func (in *SecretRef) DeepCopy() *SecretRef {
if in == nil {
return nil
}
out := new(SecretRef)
in.DeepCopyInto(out)
return out
}

func (in *ConfigMapRef) DeepCopyInto(out *ConfigMapRef) {
*out = *in
}

func (in *ConfigMapRef) DeepCopy() *ConfigMapRef {
if in == nil {
return nil
}
out := new(ConfigMapRef)
in.DeepCopyInto(out)
return out
}

func (in *KubeconfigSecretRef) DeepCopyInto(out *KubeconfigSecretRef) {
*out = *in
}

func (in *KubeconfigSecretRef) DeepCopy() *KubeconfigSecretRef {
if in == nil {
return nil
}
out := new(KubeconfigSecretRef)
in.DeepCopyInto(out)
return out
}

func (in *ClientCertificateRef) DeepCopyInto(out *ClientCertificateRef) {
*out = *in
}

func (in *ClientCertificateRef) DeepCopy() *ClientCertificateRef {
if in == nil {
return nil
}
out := new(ClientCertificateRef)
in.DeepCopyInto(out)
return out
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These normally get generated in a separate file

Comment on lines +74 to +83
apiBinding := &kcpapis.APIBinding{}
if err := r.opts.Client.Get(ctx, req.NamespacedName, apiBinding); err != nil {
// If the resource is not found, it might have been deleted
if client.IgnoreNotFound(err) == nil {
r.log.Info().Str("apiBinding", req.Name).Msg("APIBinding resource not found, might have been deleted")
return ctrl.Result{}, nil
}
r.log.Error().Err(err).Str("apiBinding", req.Name).Msg("failed to fetch APIBinding resource")
return ctrl.Result{}, err
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is done by the shared library

Comment on lines +98 to +107
clusterAccess := &gatewayv1alpha1.ClusterAccess{}
if err := r.opts.Client.Get(ctx, req.NamespacedName, clusterAccess); err != nil {
if client.IgnoreNotFound(err) == nil {
r.log.Info().Str("clusterAccess", req.Name).Msg("ClusterAccess resource not found, might have been deleted")
return ctrl.Result{}, nil
}
r.log.Error().Err(err).Str("clusterAccess", req.Name).Msg("failed to fetch ClusterAccess resource")
return ctrl.Result{}, err
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done by the shared library

On-behalf-of: @SAP [email protected]
Signed-off-by: Artem Shcherbatiuk <[email protected]>
Copy link

@pteich pteich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current structure works for me. I can understand what you want to achieve and can follow along. But the reconciler factory for me feels like an unnecessary abstraction. Besides of that I left some additional comments here and there.

"github.com/openmfp/golang-commons/logger"
"github.com/pkg/errors"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yet another error package? what is the specific use for it?

@@ -48,21 +47,14 @@ var gatewayCmd = &cobra.Command{

ctrl.SetLogger(zap.New(zap.UseDevMode(true)))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not part of the PR but as we had this discussion before this could be:

ctrl.SetLogger(log.Logr())

)

// CreateReconciler creates a reconciler based on the configuration
func CreateReconciler(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tbh I am not really sure about the factory. Especially as the only logic here is the switch based on the app config. For me it feels like an unnecessary abstraction as the user of the factory already knows what he want get back based on the config. So the different create functions could be public and the switch could be part of the caller code. This way at least for me it would be way more obvious what type of reconciler you got back.

return nil, err
}

return standard.NewReconciler(opts, restCfg, ioHandler, schemaResolver, discoveryInterface, restMapper, log)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would name the package singlecluster instead of "standard" to align with the purpose and create function name.

func NewClusterRegistry(
log *logger.Logger,
appCfg appConfig.Config,
roundTripperFactory func(*rest.Config) http.RoundTripper,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should use your already defined RoundTripperFactory type.

clusters map[string]*TargetCluster
log *logger.Logger
appCfg appConfig.Config
roundTripperFactory func(*rest.Config) http.RoundTripper
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should use your already defined RoundTripperFactory type.

}

// RoundTripperFactory creates HTTP round trippers for authentication
type RoundTripperFactory func(*rest.Config) http.RoundTripper
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you defined it but forgot to use it :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Architecture Proposal for Multi Cluster Listener
3 participants