Commit ecd4b2b
Allow for scoping TA to watch namespace
It's currently not possible to deploy the TA without cluster-wide permissions. This change introduces a new env variable to the TA, WATCH_NAMESPACE, which allows for specifying which namespaces to watch. This approach is similar to how the opentelemetry-operator can be scoped to watch a single namespace. This does mean that cluster-wide resources like node metrics (cAdvisor) are no longer accessible, but this is acceptable since we only want the TA to know about targets that exist in specific namespaces.

Fixes: #3086

Signed-off-by: Charlie Le <[email protected]>
1 parent 9f152fb commit ecd4b2b
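
A minimal sketch of the new knob in use, based on the README example added below; this is a fragment of an `OpenTelemetryCollector` CR, and the namespace name `foo` is illustrative:

```yaml
# Sketch: scope the target allocator to one namespace (illustrative values).
targetAllocator:
  enabled: true
  prometheusCR:
    enabled: true
  env:
    - name: WATCH_NAMESPACE
      value: "foo"  # or a comma-separated list, e.g. "foo,bar"; "" watches all namespaces
```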

File tree

9 files changed: +399 −10 lines changed

.chloggen/namespace-ta.yaml

+18
@@ -0,0 +1,18 @@
+# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
+change_type: enhancement
+
+# The name of the component, or a single word describing the area of concern, (e.g. collector, target allocator, auto-instrumentation, opamp, github action)
+component: target allocator
+
+# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
+note: |
+  Add support for `WATCH_NAMESPACE` environment variable in the target allocator.
+
+# One or more tracking issues related to the change
+issues: [3086]
+
+# (Optional) One or more lines of additional information to render under the primary note.
+# These lines will be padded with 2 spaces and then inserted directly into the document.
+# Use pipe (|) for multiline entries.
+subtext: |
+  This variable can be set to an empty string to watch all namespaces, or to a comma-separated list of namespaces to watch.
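
A hedged illustration of the two modes the subtext describes (the namespace names are made up):

```yaml
env:
  - name: WATCH_NAMESPACE
    value: ""               # empty string: watch all namespaces
  # - name: WATCH_NAMESPACE
  #   value: "team-a,team-b"  # comma-separated list: watch only these namespaces
```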

cmd/otel-allocator/README.md

+84-8
@@ -180,9 +180,11 @@ Upstream documentation here: [PrometheusReceiver](https://github.com/open-teleme
 
 ### RBAC
 
-Before the TargetAllocator can start scraping, you need to set up Kubernetes RBAC (role-based access controls) resources. This means that you need to have a `ServiceAccount` and corresponding cluster roles so that the TargetAllocator has access to all of the necessary resources to pull metrics from.
+Before the TargetAllocator can start scraping, you need to set up Kubernetes RBAC (role-based access controls) resources. This means that you need to have a `ServiceAccount` and corresponding ClusterRoles/Roles so that the TargetAllocator has access to all the necessary resources to pull metrics from.
 
-You can create your own `ServiceAccount`, and reference it in `spec.targetAllocator.serviceAccount` in your `OpenTelemetryCollector` CR. You’ll then need to configure the `ClusterRole` and `ClusterRoleBinding` for this `ServiceAccount`, as per below.
+You can create your own `ServiceAccount`, and reference it in `spec.targetAllocator.serviceAccount` in your `OpenTelemetryCollector` CR. You’ll then need to configure the `ClusterRole` and `ClusterRoleBinding` or `Role` and `RoleBinding` for this `ServiceAccount`, as per below.
+
+#### Cluster-scoped RBAC
 
 ```yaml
 targetAllocator:
@@ -193,11 +195,11 @@ You can create your own `ServiceAccount`, and reference it in `spec.targetAlloca
 ```
 
 > 🚨 **Note**: The Collector part of this same CR *also* has a serviceAccount key which only affects the collector and *not*
-the TargetAllocator.
+> the TargetAllocator.
 
-If you omit the `ServiceAccount` name, the TargetAllocator creates a `ServiceAccount` for you. The `ServiceAccount`’s default name is a concatenation of the Collector name and the `-targetallocator` suffix. By default, this `ServiceAccount` has no defined policy, so you’ll need to create your own `ClusterRole` and `ClusterRoleBinding` for it, as per below.
+If you omit the `ServiceAccount` name, the TargetAllocator creates a `ServiceAccount` for you. The `ServiceAccount`’s default name is a concatenation of the Collector name and the `-targetallocator` suffix. By default, this `ServiceAccount` has no defined policy, so you’ll need to create your own `ClusterRole` and `ClusterRoleBinding` or `Role` and `RoleBinding` for it, as per below.
 
-The role below will provide the minimum access required for the Target Allocator to query all the targets it needs based on any Prometheus configurations:
+The ClusterRole below will provide the minimum access required for the Target Allocator to query all the targets it needs based on any Prometheus configurations:
 
 ```yaml
 apiVersion: rbac.authorization.k8s.io/v1
@@ -231,7 +233,7 @@ rules:
     verbs: ["get"]
 ```
 
-If you enable the the `prometheusCR` (set `spec.targetAllocator.prometheusCR.enabled` to `true`) in the `OpenTelemetryCollector` CR, you will also need to define the following roles. These give the TargetAllocator access to the `PodMonitor` and `ServiceMonitor` CRs. It also gives namespace access to the `PodMonitor` and `ServiceMonitor`.
+If you enable the `prometheusCR` (set `spec.targetAllocator.prometheusCR.enabled` to `true`) in the `OpenTelemetryCollector` CR, you will also need to define the following ClusterRoles. These give the TargetAllocator access to the `PodMonitor` and `ServiceMonitor` CRs. It also gives namespace access to the `PodMonitor` and `ServiceMonitor`.
 
 ```yaml
 apiVersion: rbac.authorization.k8s.io/v1
@@ -252,8 +254,83 @@ rules:
     verbs: ["get", "list", "watch"]
 ```
 
-> ✨ The above roles can be combined into a single role.
+> ✨ The above ClusterRoles can be combined into a single ClusterRole.
+
+#### Namespace-scoped RBAC
+
+If you want to have the TargetAllocator watch a specific namespace, you can set the WATCH_NAMESPACE environment variable
+in the TargetAllocator's deployment. This is useful if you want to restrict the TargetAllocator to only watch Prometheus
+CRs in a specific namespace, and not have cluster-wide access.
+
+```yaml
+targetAllocator:
+  enabled: true
+  serviceAccount: opentelemetry-targetallocator-sa
+  prometheusCR:
+    enabled: true
+  env:
+    - name: WATCH_NAMESPACE
+      value: "foo"
+```
+
+In this case, you will need to create a Role and RoleBinding instead of a ClusterRole and ClusterRoleBinding. The Role
+and RoleBinding should be created in the namespace specified in the WATCH_NAMESPACE environment variable.
 
+```yaml
+apiVersion: rbac.authorization.k8s.io/v1
+kind: Role
+metadata:
+  name: opentelemetry-targetallocator-role
+rules:
+- apiGroups:
+  - ""
+  resources:
+  - pods
+  - services
+  - endpoints
+  - configmaps
+  - secrets
+  - namespaces
+  verbs:
+  - get
+  - watch
+  - list
+- apiGroups:
+  - apps
+  resources:
+  - statefulsets
+  verbs:
+  - get
+  - watch
+  - list
+- apiGroups:
+  - discovery.k8s.io
+  resources:
+  - endpointslices
+  verbs:
+  - get
+  - watch
+  - list
+- apiGroups:
+  - networking.k8s.io
+  resources:
+  - ingresses
+  verbs:
+  - get
+  - watch
+  - list
+- apiGroups:
+  - monitoring.coreos.com
+  resources:
+  - servicemonitors
+  - podmonitors
+  - scrapeconfigs
+  - probes
+  verbs:
+  - get
+  - watch
+  - list
+```
 
 ### Service / Pod monitor endpoint credentials
 
@@ -409,4 +486,3 @@ Shards the received targets based on the discovered Collector instances
 
 ### Collector
 Client to watch for deployed Collector instances which will then provided to the Allocator.
-
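
The README addition above shows the Role but stops short of the matching RoleBinding. A minimal sketch of one, assuming the `opentelemetry-targetallocator-sa` ServiceAccount and Role name from the README text; the binding's name and the `foo` namespace are illustrative:

```yaml
# Sketch: bind the namespace-scoped Role to the target allocator's ServiceAccount.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: opentelemetry-targetallocator-rolebinding  # hypothetical name
  namespace: foo  # must match the WATCH_NAMESPACE value
subjects:
- kind: ServiceAccount
  name: opentelemetry-targetallocator-sa
  namespace: foo
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: opentelemetry-targetallocator-role
```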

cmd/otel-allocator/internal/watcher/promOperator.go

+17-2
@@ -8,6 +8,7 @@ import (
 	"fmt"
 	"log/slog"
 	"os"
+	"strings"
 	"time"
 
 	"github.com/blang/semver/v4"
@@ -53,7 +54,21 @@ func NewPrometheusCRWatcher(ctx context.Context, logger logr.Logger, cfg allocat
 		return nil, err
 	}
 
-	factory := informers.NewMonitoringInformerFactories(map[string]struct{}{v1.NamespaceAll: {}}, map[string]struct{}{}, mClient, allocatorconfig.DefaultResyncTime, nil) //TODO decide what strategy to use regarding namespaces
+	// Check env var for WATCH_NAMESPACE and use it if its set, else use v1.NamespaceAll
+	// This is to allow the operator to watch only a specific namespace
+	watchNamespace, found := os.LookupEnv("WATCH_NAMESPACE")
+	allowList := map[string]struct{}{}
+	if found {
+		logger.Info("watching namespace(s)", "namespaces", watchNamespace)
+		for _, ns := range strings.Split(watchNamespace, ",") {
+			allowList[ns] = struct{}{}
+		}
+	} else {
+		allowList = map[string]struct{}{v1.NamespaceAll: {}}
+		logger.Info("the env var WATCH_NAMESPACE isn't set, watching all namespaces")
+	}
+
+	factory := informers.NewMonitoringInformerFactories(allowList, map[string]struct{}{}, mClient, allocatorconfig.DefaultResyncTime, nil) //TODO decide what strategy to use regarding namespaces
 
 	monitoringInformers, err := getInformers(factory)
 	if err != nil {
@@ -99,7 +114,7 @@ func NewPrometheusCRWatcher(ctx context.Context, logger logr.Logger, cfg allocat
 		logger.Error(err, "Retrying namespace informer creation in promOperator CRD watcher")
 		return true
 	}, func() error {
-		nsMonInf, err = getNamespaceInformer(ctx, map[string]struct{}{v1.NamespaceAll: {}}, promLogger, clientset, operatorMetrics)
+		nsMonInf, err = getNamespaceInformer(ctx, allowList, promLogger, clientset, operatorMetrics)
 		return err
 	})
 	if getNamespaceInformerErr != nil {
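
To make the new lookup concrete, here is a small standalone sketch of the same allow-list logic; it is an illustration, not the watcher's actual API. One subtlety: an explicitly empty WATCH_NAMESPACE splits to a single empty string, and the empty string is exactly `v1.NamespaceAll`, which is how the changelog's "empty string watches all namespaces" behavior falls out.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// namespaceAllowList mirrors the logic added to NewPrometheusCRWatcher:
// unset -> watch all namespaces; set -> split the value on commas.
// A set-but-empty value splits to [""], and "" equals v1.NamespaceAll,
// so WATCH_NAMESPACE="" also watches all namespaces.
func namespaceAllowList() map[string]struct{} {
	const namespaceAll = "" // mirrors k8s.io/api/core/v1.NamespaceAll
	watchNamespace, found := os.LookupEnv("WATCH_NAMESPACE")
	if !found {
		return map[string]struct{}{namespaceAll: {}}
	}
	allowList := map[string]struct{}{}
	for _, ns := range strings.Split(watchNamespace, ",") {
		allowList[ns] = struct{}{}
	}
	return allowList
}

func main() {
	os.Setenv("WATCH_NAMESPACE", "team-a,team-b")
	fmt.Println(namespaceAllowList()) // map[team-a:{} team-b:{}]
}
```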
assert-jobs-succeeded.yaml

@@ -0,0 +1,20 @@
+apiVersion: batch/v1
+kind: Job
+metadata:
+  name: check-metrics
+status:
+  succeeded: 1
+---
+apiVersion: batch/v1
+kind: Job
+metadata:
+  name: check-ta-jobs
+status:
+  succeeded: 1
+---
+apiVersion: batch/v1
+kind: Job
+metadata:
+  name: check-ta-scrape-configs
+status:
+  succeeded: 1

assert-workloads-ready.yaml

@@ -0,0 +1,15 @@
+apiVersion: apps/v1
+kind: StatefulSet
+metadata:
+  name: prometheus-cr-collector
+status:
+  readyReplicas: 1
+  replicas: 1
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: cr-targetallocator
+status:
+  readyReplicas: 1
+  replicas: 1

chainsaw-test.yaml

@@ -0,0 +1,21 @@
+# yaml-language-server: $schema=https://raw.githubusercontent.com/kyverno/chainsaw/main/.schemas/json/test-chainsaw-v1alpha1.json
+apiVersion: chainsaw.kyverno.io/v1alpha1
+kind: Test
+metadata:
+  name: targetallocator-namespace
+spec:
+  steps:
+  - try:
+    - apply:
+        file: resources/rbac.yaml
+    - apply:
+        file: resources/otelcol.yaml
+    - assert:
+        file: assert-workloads-ready.yaml
+    - apply:
+        file: resources/jobs.yaml
+    - assert:
+        file: assert-jobs-succeeded.yaml
+    catch:
+    - podLogs:
+        selector: app.kubernetes.io/managed-by=opentelemetry-operator
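
Assuming the Kyverno chainsaw CLI is installed, a scenario like this is typically run by pointing chainsaw at the test's directory, with something like `chainsaw test --test-dir .` from the scenario's folder.
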
resources/jobs.yaml

@@ -0,0 +1,52 @@
+apiVersion: batch/v1
+kind: Job
+metadata:
+  name: check-metrics
+spec:
+  template:
+    spec:
+      restartPolicy: OnFailure
+      containers:
+        - name: check-metrics
+          image: curlimages/curl
+          args:
+            - /bin/sh
+            - -c
+            - |
+              for i in $(seq 30); do
+                if curl -m 1 -s http://prometheus-cr-collector:9090/metrics | grep "otelcol"; then exit 0; fi
+                sleep 5
+              done
+              exit 1
+---
+apiVersion: batch/v1
+kind: Job
+metadata:
+  name: check-ta-jobs
+spec:
+  template:
+    spec:
+      restartPolicy: OnFailure
+      containers:
+        - name: check-metrics
+          image: curlimages/curl
+          args:
+            - /bin/sh
+            - -c
+            - curl -s http://cr-targetallocator/scrape_configs | grep "prometheus-cr"
+---
+apiVersion: batch/v1
+kind: Job
+metadata:
+  name: check-ta-scrape-configs
+spec:
+  template:
+    spec:
+      restartPolicy: OnFailure
+      containers:
+        - name: check-metrics
+          image: curlimages/curl
+          args:
+            - /bin/sh
+            - -c
+            - curl -s http://cr-targetallocator/jobs | grep "prometheus-cr"

resources/otelcol.yaml

@@ -0,0 +1,57 @@
+apiVersion: opentelemetry.io/v1alpha1
+kind: TargetAllocator
+metadata:
+  name: cr
+spec:
+  args:
+    "zap-log-level": "debug"
+  prometheusCR:
+    enabled: true
+    scrapeInterval: 1s
+    scrapeConfigSelector: {}
+    probeSelector: {}
+    serviceMonitorSelector: {}
+    podMonitorSelector: {}
+  observability:
+    metrics:
+      disablePrometheusAnnotations: true
+      enableMetrics: true
+  env:
+    - name: WATCH_NAMESPACE
+      value: "($namespace)"
+  serviceAccount: ta
+---
+apiVersion: opentelemetry.io/v1beta1
+kind: OpenTelemetryCollector
+metadata:
+  name: prometheus-cr
+  labels:
+    opentelemetry.io/target-allocator: cr
+spec:
+  observability:
+    metrics:
+      disablePrometheusAnnotations: true
+      enableMetrics: true
+  config:
+    receivers:
+      prometheus:
+        config:
+          scrape_configs: []
+
+    processors:
+
+    exporters:
+      prometheus:
+        endpoint: 0.0.0.0:9090
+    service:
+      pipelines:
+        metrics:
+          receivers: [prometheus]
+          exporters: [prometheus]
+      telemetry:
+        logs:
+          level: "DEBUG"
+          development: true
+          encoding: "json"
+  mode: statefulset
+  serviceAccount: collector
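
Worth noting for readers unfamiliar with chainsaw: `($namespace)` above is a chainsaw binding that resolves to the ephemeral namespace the test runs in, so the TargetAllocator ends up scoped to exactly the namespace that holds the test's Prometheus CRs.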
