Skip to content

Conversation

@hkmdxlftjf
Copy link

@hkmdxlftjf hkmdxlftjf commented Nov 3, 2025

feat: #2026

Hello,

I’ve extended the grafana.spec to include an httpRoute definition, for example:

spec:
  security:
    admin_password: secret
    admin_user: root
  httpRoute:
    spec:
      parentRefs:
      - group: gateway.networking.k8s.io
        kind: Gateway
        name: eg
      rules:
      - matches:
        - path:
            type: PathPrefix
            value: /

This configuration creates an HTTPRoute resource that binds the Grafana instance to the specified Gateway.

Additionally, the httpRoute spec looks like this:

spec:
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: eg
  rules:
  - backendRefs:
    - group: ""
      kind: Service
      name: grafana-service
      namespace: default
      port: 3000
      weight: 1
    matches:
    - path:
        type: PathPrefix
        value: /

With this in place, the HTTPRoute will route traffic (via the Gateway) to the Grafana service on port 3000.

If there’s anything in the implementation that seems unreasonable or incorrect, please let me know.

Thank you!
Best regards,
Sure

@CLAassistant
Copy link

CLAassistant commented Nov 3, 2025

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot added the documentation Issues relating to documentation, missing, non-clear etc. label Nov 3, 2025
@hkmdxlftjf
Copy link
Author

Hi @theSuess, could you please take a look when you have time? 🙏


// labelsSatisfyMatchExpressions checks if a given label set satisfies
// a list of Kubernetes label selector requirements.
func labelsSatisfyMatchExpressions(labels map[string]string, matchExpressions []metav1.LabelSelectorRequirement) bool {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain this change? Specifically, why it was added here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because there can be multiple listeners in the gateway , and each listener may ignore certain route rules through allowedRoutes — such as by namespace or labelSelector. When preferIngress is set to true, we need to return an available URL. The labelsSatisfyMatchExpressions check ensures that the current listener accepts this HTTPRoute, guaranteeing its availability. Additionally, this piece of code was inspired by the implementation here: https://github.com/grafana/grafana-operator/blob/master/controllers/controller_shared.go#L189

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm missing something, the code is identical to the one that already exists (except for comments). So, the question is why would you want to make a copy instead of calling the existing function instead? Also, the existing one is fully covered by tests, the copy is not.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I’ll remove this and reuse the existing one.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I move the labelsSatisfyMatchExpressions function to controllers/model/utils.go
? Directly using controllers.LabelsSatisfyMatchExpressions would cause a circular dependency.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you, please, still answer my question on why it was added in the first place? (and why only one comment was different) - It'll help me to understand the reasoning behind your PR better.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it because I wasn’t sure whether it was okay to modify the existing files, so I created a separate copy instead.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Essentially, I'm trying to make sure that we do not accept any unreasonable changes introduced by an LLM that you might be using.

As for moving the function, I'll comment on that separately.


// getMatchListener tries to find a Gateway listener that matches the HTTPRoute’s
// namespace and hostname constraints, according to the Gateway API rules.
func (r *HTTPRouteReconciler) getMatchListener(ctx context.Context, httpRoute *v2.HTTPRoute, gw *v2.Gateway) *v2.Listener {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain why would you want to deal with gateways?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn’t quite understand your question.

If you mean why the parameter uses gateway instead of gatewaySpec, it’s because in the Gateway API From field
, when From is set to Same, we need to check whether the HTTPRoute and the Gateway are in the same namespace.

If you mean why we need to query the Gateway, it’s because the HTTPRoute alone doesn’t provide enough information to construct an accessible URL.

Copy link
Collaborator

@weisdd weisdd Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's up for a discussion amongst maintainers, but I feel like it's not worth implementing as it requires too much complexity and wider cluster permissions for properly detecting protocol (HTTP/HTTPS).

If this functionality gets accepted, the function will need to be 100% covered by tests and also refactored, otherwise it would be too fragile and complex to maintain. I think the function that you re-used (labelsSatisfyMatchExpressions) presents a good example of how to pack complex logic into a simple form and how to test it afterwards, so you might want to use it as a reference.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this functionality is necessary. Querying the Gateway is not just about detecting the protocol (HTTP/HTTPS) — it’s also important when an HTTPRoute doesn’t specify a hostname.

For example:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  labels:
    app.kubernetes.io/managed-by: grafana-operator
    dashboards: grafana
  name: grafana-httproute
  namespace: default
spec:
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: eg
  rules:
  - backendRefs:
    - group: ""
      kind: Service
      name: grafana-service
      namespace: default
      port: 3000
      weight: 1
    matches:
    - path:
        type: PathPrefix
        value: /

In this case, the Gateway looks like this:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: eg
  namespace: default
spec:
  gatewayClassName: eg
  listeners:
  - allowedRoutes:
      namespaces:
        from: Same
    name: http
    port: 8080
status:
  addresses:
  - type: IPAddress
    value: 10.10.1.200

The route is accessible directly via http://10.10.1.200:8080 without requiring any hostname. However, this IP address can only be retrieved from the Gateway resource.

❯ curl http://10.10.1.200:8080 -v
*   Trying 10.10.1.200:8080...
* Connected to 10.10.1.200 (10.10.1.200) port 8080
> GET / HTTP/1.1
> Host: 10.10.1.200:8080
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 302 Found
< cache-control: no-store
< content-type: text/html; charset=utf-8
< location: /login
< x-content-type-options: nosniff
< x-frame-options: deny
< x-xss-protection: 1; mode=block
< date: Fri, 07 Nov 2025 12:47:20 GMT
< content-length: 29
<
<a href="/login">Found</a>.

* Connection #0 to host 10.10.1.200 left intact

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a valid scenario, but I was rather thinking that the whole support for operator <-> grafana interaction through an HTTPRoute should not be implemented as it overcomplicates code for little gains. - If a Grafana instance is in the same cluster, why wouldn't someone allow the operator to communicate with Grafana directly?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if the operator and the instance are not in the same cluster, the operator won’t be able to function properly. I think that’s also the reason why Ingress is handled this way:
https://github.com/grafana/grafana-operator/blob/master/controllers/reconcilers/grafana/ingress_reconciler.go#L83

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an instance is in another cluster, it's a so-called external Grafana. In this case, services, configmaps, ingresses, and many other objects are not reconciled.

Ingresses are special in a sense that they are easy to implement (all information is contained within a single object). And one of the key reasons to support them was to be able to reach an "internal" grafana instance (external instances weren't supported till v5) from the operator running locally (outside of cluster, just in local dev environment - no need to build an image).

Copy link
Author

@hkmdxlftjf hkmdxlftjf Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If operator must be restricted to deployment within a cluster, then I have no other reason.

}

// Assign AdminURL if ingress is preferred
if cr.PreferIngress() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ingress and HTTPRoute are separate resources, so I think there's a conflicting logic here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think PreferIngress is meant to define how the Grafana Operator communicates with the Grafana instance. In my view, Ingress and HTTPRoute serve the same purpose — they just differ in implementation. Therefore, adding another field like preferHTTPRoute might be redundant.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if both Ingress and HTTPRoute are defined, there would be a conflicting back-and-forth change between Ingress and HTTPRoute reconcilers, right?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I just noticed the same issue. It’s actually related to OperatorStageHTTPRoute.
I plan to remove this stage and make HTTPRoute take priority when both HTTPRoute and Ingress are defined.

func (r *IngressReconciler) Reconcile(ctx context.Context, cr *v1beta1.Grafana, vars *v1beta1.OperatorReconcileVars, scheme *runtime.Scheme) (v1beta1.OperatorStageStatus, error) {
	log := logf.FromContext(ctx).WithName("IngressReconciler")

	// On OpenShift, fallback to Ingress when spec.route is undefined
	if r.isOpenShift && cr.Spec.Route != nil {
		log.Info("reconciling route")
		return r.reconcileRoute(ctx, cr, vars, scheme)
	}
	if cr.Spec.HTTPRoute != nil {
		log.Info("reconciling http route")
		return r.reconcileHTTPRoute(ctx, cr, vars, scheme)
	}

	log.Info("reconciling ingress")

	return r.reconcileIngress(ctx, cr, vars, scheme)
}


// getMatchListener tries to find a Gateway listener that matches the HTTPRoute’s
// namespace and hostname constraints, according to the Gateway API rules.
func (r *HTTPRouteReconciler) getMatchListener(ctx context.Context, httpRoute *v2.HTTPRoute, gw *v2.Gateway) *v2.Listener {
Copy link
Collaborator

@weisdd weisdd Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's up for a discussion amongst maintainers, but I feel like it's not worth implementing as it requires too much complexity and wider cluster permissions for properly detecting protocol (HTTP/HTTPS).

If this functionality gets accepted, the function will need to be 100% covered by tests and also refactored, otherwise it would be too fragile and complex to maintain. I think the function that you re-used (labelsSatisfyMatchExpressions) presents a good example of how to pack complex logic into a simple form and how to test it afterwards, so you might want to use it as a reference.

@hkmdxlftjf
Copy link
Author

I have removed the OperatorStageHTTPRoute and added unit tests for getMatchListener. #If decide not to keep getMatchListener, I will remove it.

Copy link
Collaborator

@weisdd weisdd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Things that also need to be covered in the PR:

  • the operator must be able to function when the Gateway API CRD(s) is not present;
  • e2e-tests need to be extended to cover cases where the CRD IS NOT present and when it IS present;
  • controller-runtime needs to be configured to cache the HTTPRoute objects (check this for reference).

@hkmdxlftjf
Copy link
Author

Things that also need to be covered in the PR:

  • the operator must be able to function when the Gateway API CRD(s) is not present;
  • e2e-tests need to be extended to cover cases where the CRD IS NOT present and when it IS present;
  • controller-runtime needs to be configured to cache the HTTPRoute objects (check this for reference).

Hi, I’ve implemented these changes based on your suggestions.

In the e2e tests, I added case 17, which covers deploying Grafana when the CRD is not present — this works correctly as long as grafana.spec.httpRoute == nil.
I also added a test for deploying Grafana with the CRD present (but without any actual Gateway implementation), which also works as expected.
Additionally, caching has been configured for the HTTPRoute objects.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Issues relating to documentation, missing, non-clear etc.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants