
feat(controllers): optionally do not cache resources created without CommonLabels #1818


Merged
10 commits merged into grafana:master on Mar 11, 2025

Conversation

Baarsgaard
Collaborator

@Baarsgaard Baarsgaard commented Jan 11, 2025

I read a blog post on operator memory pitfalls that mentions Owns() being a footgun; Owns() is used in the grafana_reconciler SetupWithManager.

TLDR: By declaring Owns() or using Get/List, you tell the controller-runtime to watch and cache all instances of the client.Object, which on large clusters could mean a lot of ConfigMaps, Secrets, and Deployments in the Grafana-Operator's case.
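For context, a minimal sketch of the Owns() pattern in question, assuming controller-runtime's builder API (illustrative, not the operator's exact SetupWithManager; grafanav1beta1 is assumed to alias the operator's API package):

// Each Owns() registers a watch, and by default controller-runtime backs that
// watch with an informer that lists and caches every object of the type cluster-wide.
func (r *GrafanaReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&grafanav1beta1.Grafana{}).
		Owns(&appsv1.Deployment{}). // caches all Deployments in the cluster
		Owns(&corev1.ConfigMap{}).  // all ConfigMaps
		Owns(&corev1.Secret{}).     // all Secrets
		Complete(r)
}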

I suspected this was the problem behind the pprof profiles uploaded in #1622, and verified it by following the steps outlined below.

The post linked to an Operator SDK trick for configuring the client.Object cache with labels.

mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
	Cache: cache.Options{
		ByObject: map[client.Object]cache.ByObject{
			&corev1.Secret{}: {
				Label: labels.SelectorFromSet(labels.Set{"app": "app-name"}),
			},
		},
	},
})
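With a ByObject label selector like this, the informer's list and watch requests are filtered server-side, so non-matching Secrets never enter the cache at all rather than being filtered after the fact.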

I remembered that #1661 added common labels to all resources created by the operator, which makes it possible to scope the cache like this and reduce memory consumption.

Verifying cache issues:

  1. Start a local kind cluster with some default resources (oneliner):

     make start-kind && \
     kind export kubeconfig --name kind-grafana && \
     make ko-build-kind && \
     IMG=ko.local/grafana/grafana-operator make deploy && \
     kubectl patch deploy -n grafana-operator-system grafana-operator-controller-manager-v5 --type='json' -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/imagePullPolicy", "value":"IfNotPresent"}]'

  2. Get a baseline heap reading:

     kubectl port-forward -n grafana-operator-system deploy/grafana-operator-controller-manager-v5 8888 &
     go tool pprof -top -nodecount 20 http://localhost:8888/debug/pprof/heap

  3. Create a 384 KiB test file: fallocate -l 393216 large_file

  4. Create a couple hundred ConfigMaps:

     for i in {0..200}; do kubectl create cm test-cm-$i --from-file=./large_file; done

  5. Get an updated heap reading:

     go tool pprof -top -nodecount 20 http://localhost:8888/debug/pprof/heap
# Output on master branch
File: v5
Type: inuse_space
Time: Jan 11, 2025 at 8:34pm (CET)
Showing nodes accounting for 54.72MB, 100% of 54.72MB total
Showing top 20 nodes out of 107
      flat  flat%   sum%        cum   cum%
   46.91MB 85.72% 85.72%    46.91MB 85.72%  k8s.io/api/core/v1.(*ConfigMap).Unmarshal # <--- this one
       2MB  3.66% 89.38%        2MB  3.66%  runtime.malg
       1MB  1.83% 91.20%        1MB  1.83%  encoding/json.typeFields
    0.75MB  1.37% 92.58%     0.75MB  1.37%  go.uber.org/zap/zapcore.newCounters
    0.54MB  0.99% 93.56%     0.54MB  0.99%  github.com/gogo/protobuf/proto.RegisterType
    0.52MB  0.94% 94.51%     0.52MB  0.94%  k8s.io/apimachinery/pkg/watch.(*Broadcaster).Watch.func1
    0.50MB  0.92% 95.43%     0.50MB  0.92%  unicode.map.init.1
    0.50MB  0.92% 96.34%     0.50MB  0.92%  k8s.io/apimachinery/pkg/runtime.(*Scheme).AddKnownTypeWithName
    0.50MB  0.91% 97.26%     0.50MB  0.91%  github.com/go-openapi/swag.(*indexOfInitialisms).sorted.func1
    0.50MB  0.91% 98.17%     0.50MB  0.91%  go.mongodb.org/mongo-driver/bson/bsoncodec.(*kindDecoderCache).Clone
....

Current progress

Watching and caching is now limited to operator-controlled resources of the following Kinds:

  • Deployment
  • Ingress
  • Service
  • ServiceAccount
  • PersistentVolumeClaim
  • Route if IsOpenShift

This is done with the existing CommonLabels selector introduced in #1661:
app.kubernetes.io/managed-by: "grafana-operator"
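
A minimal sketch of that scoping with controller-runtime's cache options (illustrative; the PR's actual wiring differs in detail):

sel := labels.SelectorFromSet(labels.Set{
	"app.kubernetes.io/managed-by": "grafana-operator",
})
byLabel := cache.ByObject{Label: sel}

cacheOptions := cache.Options{
	ByObject: map[client.Object]cache.ByObject{
		&appsv1.Deployment{}:            byLabel,
		&networkingv1.Ingress{}:         byLabel,
		&corev1.Service{}:               byLabel,
		&corev1.ServiceAccount{}:        byLabel,
		&corev1.PersistentVolumeClaim{}: byLabel,
	},
}
// Route is added to the map only when running on OpenShift.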

Memory consumption in an empty kind cluster after ~1 minute [1]:

Change                                | Heap (kb) | % reduction [2] | Note
------------------------------------- | --------- | --------------- | -----------
None (master)                         | 8579.22   | 0%              |
Limited to resources listed above     | 5743.40   | -33%            |
ConfigMap and Secret caching disabled | 3585.48   | -58%            | New default

An option to cache ConfigMaps and Secrets has been added.

Footnotes

  [1]: Heap will increase over time as the operator stabilizes.

  [2]: The reduction is by no means representative of real deployments.
       For clusters mixing the Grafana-Operator with other workloads in cluster-scoped mode, the reduction is likely significantly higher.
       Even if the Grafana-Operator were the only Deployment in a cluster, this would still reduce memory, as it won't cache itself 😉

@Baarsgaard Baarsgaard changed the title feat(internal): Ignore deployments/Configmaps missing CommonLabels WIP: Ignore deployments/Configmaps missing CommonLabels Jan 11, 2025
@Baarsgaard Baarsgaard force-pushed the reduce_cache_size branch 2 times, most recently from 389e8d6 to e4ed220 Compare January 11, 2025 23:56
@Baarsgaard Baarsgaard changed the title WIP: Ignore deployments/Configmaps missing CommonLabels Fix: Do not cache native resources created without CommonLabels Jan 12, 2025
@Baarsgaard Baarsgaard force-pushed the reduce_cache_size branch 2 times, most recently from 4848451 to 9a9bb49 Compare January 20, 2025 16:47
@Baarsgaard Baarsgaard force-pushed the reduce_cache_size branch 2 times, most recently from d1f1f0b to 78841fb Compare January 21, 2025 19:14
@Baarsgaard Baarsgaard marked this pull request as ready for review January 24, 2025 12:46
@Baarsgaard
Collaborator Author

I marked this ready, but forgot that it is blocked by #1833.

@Baarsgaard Baarsgaard force-pushed the reduce_cache_size branch 2 times, most recently from 81e6afc to fc47228 Compare January 29, 2025 16:39
@theSuess theSuess added this to the v5.17.0 milestone Feb 4, 2025
@Baarsgaard Baarsgaard force-pushed the reduce_cache_size branch 2 times, most recently from 903567b to 43b2a3e Compare February 22, 2025 21:39
@Baarsgaard
Collaborator Author

Baarsgaard commented Feb 22, 2025

Rebased on master and worked it into the additions from #1832.
Should be ready for review now!

@Baarsgaard
Collaborator Author

Baarsgaard commented Feb 24, 2025

I've added an experimental feature toggle in the form of the EXPERIMENTAL_ENABLE_CACHE_LABEL_LIMITS environment variable.

This ended up a bit complex to validate, but I think I managed to cover most of it.
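
For reference before the walkthrough, the opt-in amounts to gating the cache options on that variable; a simplified sketch, not the exact main.go code:

// Hedged sketch: when the experimental env var is set, the manager uses
// the label-scoped cache options sketched earlier; otherwise everything
// is cached as before.
if os.Getenv("EXPERIMENTAL_ENABLE_CACHE_LABEL_LIMITS") != "" {
	controllerOptions.Cache = cacheOptions
}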

Ensure that the PR works and can be enabled

  1. Get memory baseline

    kubectl port-forward -n grafana-operator-system deploy/grafana-operator-controller-manager-v5 8888 &
    go tool pprof -top -nodecount 20 http://localhost:8888/debug/pprof/heap
  2. Provoke memory climb

    fallocate -l 393216 /tmp/large_file # Create a large file for testing
    for i in {0..200}; do kubectl create cm test-cm-$i -n test --from-file=/tmp/large_file; done
    go tool pprof -top -nodecount 20 http://localhost:8888/debug/pprof/heap
  3. Enable cache limits and see memory decrease

    kubectl patch deploy -n grafana-operator-system grafana-operator-controller-manager-v5  --type='json' -p='[{"op": "add", "path": "/spec/template/spec/containers/0/env/-", "value": {"name": "EXPERIMENTAL_ENABLE_CACHE_LABEL_LIMITS", "value": "1"}}]'
    go tool pprof -top -nodecount 20 http://localhost:8888/debug/pprof/heap

Continue by testing that it does not break the watch label selector (sharding)

  1. Enable sharding, memory should reset as ConfigMaps are not labeled

    kubectl patch deploy -n grafana-operator-system grafana-operator-controller-manager-v5  --type='json' -p='[{"op": "add", "path": "/spec/template/spec/containers/0/env/-", "value": {"name": "WATCH_LABEL_SELECTORS", "value": "manual=test"}}]'
    go tool pprof -top -nodecount 20 http://localhost:8888/debug/pprof/heap
  2. Label ConfigMaps and see memory jump

    for i in {0..200}; do kubectl label cm -n test test-cm-$i manual=test & done
    go tool pprof -top -nodecount 20 http://localhost:8888/debug/pprof/heap
  3. Create Sharded and unsharded grafana instances

    ---
    apiVersion: grafana.integreatly.org/v1beta1
    kind: Grafana
    metadata:
      name: grafana-normal
    spec:
      config:
        log:
          mode: "console"
        auth:
          disable_login_form: "false"
        security:
          admin_user: root
          admin_password: secret
    ---
    apiVersion: grafana.integreatly.org/v1beta1
    kind: Grafana
    metadata:
      name: grafana-shard
      labels:
        manual: test
    spec:
      config:
        log:
          mode: "console"
        auth:
          disable_login_form: "false"
        security:
          admin_user: root
          admin_password: secret
  4. Verify that only one instance has a reconciled status


    kubectl get grafanas -A -o yaml
  5. Manually remove only the EXPERIMENTAL_ENABLE_CACHE_LABEL_LIMITS env var on the Deployment and see memory increase while it's sharded

    kubectl edit deploy -n grafana-operator-system grafana-operator-controller-manager-v5
    go tool pprof -top -nodecount 20 http://localhost:8888/debug/pprof/heap

Every edit of the Deployment requires stopping and re-opening the port-forward before running pprof; this is left out for brevity.

@Baarsgaard Baarsgaard requested a review from weisdd March 6, 2025 18:40
main.go Outdated
setupLog.Error(err, fmt.Sprintf("unable to parse %s", watchLabelSelectorsEnvVar))
os.Exit(1) //nolint
// Allow users to enable the above cache limits before a full rollout
if enableCacheLabelLimits == "" {
@weisdd weisdd (Collaborator) commented Mar 9, 2025

  1. The name and idea behind this variable hint that it must be a boolean-like value, so we should not do empty-string comparisons.
  2. The code here states that it would enable caching limits, whereas it actually lifts those limits, meaning everything will be cached.
  3. I think a simpler way to implement all of this would be something like this:
	cacheOptions := cache.Options{
		ByObject: map[client.Object]cache.ByObject{
			&v1.Deployment{}:                cacheByObject,
			&corev1.Service{}:               cacheByObject,
			&corev1.ServiceAccount{}:        cacheByObject,
			&networkingv1.Ingress{}:         cacheByObject,
			&corev1.PersistentVolumeClaim{}: cacheByObject,
			&corev1.ConfigMap{}:             cacheByObject,
			&corev1.Secret{}:                cacheByObject,
		},
	}

	// TODO: Curious what would happen in vanilla k8s if we don't have this check for OpenShift
	if isOpenShift {
		cacheOptions.ByObject[&routev1.Route{}] = cacheByObject
	}

	// I like this name more, it's self-explanatory. By the way, I don't think we have to prefix
	// environment variables with EXPERIMENTAL_, we just need to clarify that in docs / helm / code comments
	if cacheOnlyLabeledResources == "true" {
		controllerOptions.Cache = cacheOptions
	}

@weisdd weisdd (Collaborator) added:

One more point:
Should Secret / ConfigMap caching be enabled, do we want it to apply only to labeled resources? Should we have a separate configuration option that tweaks that behaviour, or is it better to not use cacheByObject for them at all? Or should we even go as far as prometheus-operator, which filters by Secret type instead? I'm not sure what's the best way to go, just highlighting a concern.
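
For illustration, a hedged sketch of that prometheus-operator-style alternative (the concrete selector string and error handling are assumptions for the example, not code from either project):

// Hypothetical: restrict which Secrets are cached via a field selector on
// the Secret type, e.g. excluding Helm release secrets.
fieldSel, err := fields.ParseSelector("type!=helm.sh/release.v1")
if err != nil {
	setupLog.Error(err, "unable to parse Secret field selector")
	os.Exit(1)
}
cacheOptions.ByObject[&corev1.Secret{}] = cache.ByObject{Field: fieldSel}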

@weisdd
Collaborator

weisdd commented Mar 9, 2025

@Baarsgaard I've just modified some of the comments that were added a few minutes ago, so please refer to the latest versions. Thx!

@theSuess theSuess force-pushed the reduce_cache_size branch from 2df48b8 to 330af1f Compare March 11, 2025 11:39
@theSuess
Collaborator

Refactored this to use an env var with different levels. Also moved the code around a bit to make the opt-in nature more apparent and easier to review.

@weisdd @Baarsgaard if you have a minute, I'd appreciate a re-review as I'm now obviously biased that this is good to merge 😅

@theSuess theSuess requested a review from weisdd March 11, 2025 11:40
@theSuess theSuess force-pushed the reduce_cache_size branch from 330af1f to 06da208 Compare March 11, 2025 11:41
@weisdd weisdd (Collaborator) left a comment

Naming and code structure are clear now, everything looks good to me :)

@theSuess theSuess added this pull request to the merge queue Mar 11, 2025
Merged via the queue into grafana:master with commit 06be4b3 Mar 11, 2025
15 checks passed
@weisdd weisdd changed the title Fix: Do not cache native resources created without CommonLabels feat(controllers): optionally do not cache resources created without CommonLabels Mar 11, 2025
@weisdd weisdd added the feature this PR introduces a new feature label Mar 11, 2025
@Baarsgaard Baarsgaard deleted the reduce_cache_size branch March 11, 2025 18:38
Labels
feature this PR introduces a new feature
3 participants