@tehbooom
Member

Adding Elastic Package Registry (EPR) to ECK has been highly requested.

EPR does not reference any other resource, since it requires neither a license nor another application.

The following was implemented for EPR:

  • Defaults to TLS
  • Sets the default container image to docker.elastic.co/package-registry/distribution
  • Users can set their own images
  • Users can update the config following the reference
  • Kibana can reference EPR the same way it references Elasticsearch and Enterprise Search
  • If Kibana references EPR and TLS is enabled, the controller populates xpack.fleet.registryUrl and sets the environment variable NODE_EXTRA_CA_CERTS to the path of the mounted EPR CA
  • If a user provides their own NODE_EXTRA_CA_CERTS with a mount, the controller combines the certificates by appending the EPR CA to the user-specified CA

This was tested with and without setting NODE_EXTRA_CA_CERTS, using the manifest below:

apiVersion: epr.k8s.elastic.co/v1alpha1
kind: ElasticPackageRegistry
metadata:
  name: registry
spec:
  version: 9.1.2
  count: 1
  podTemplate:
    spec:
      containers:
      - name: package-registry
        image: docker.elastic.co/package-registry/distribution:lite-9.1.2
---
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
spec:
  version: 9.1.2
  nodeSets:
  - name: default
    count: 1
---
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: kibana
spec:
  version: 9.1.2
  count: 1
  elasticsearchRef:
    name: elasticsearch
  packageRegistryRef:
    name: registry
  config:
    telemetry.optIn: false
    xpack.fleet.isAirGapped: true
    xpack.fleet.agents.elasticsearch.hosts: ["https://elasticsearch-es-http.default.svc:9200"]
    xpack.fleet.agents.fleet_server.hosts: ["https://fleet-server-agent-http.default.svc:8220"]
    xpack.fleet.packages:
      - name: system
        version: latest
      - name: elastic_agent
        version: latest
      - name: fleet_server
        version: latest
    xpack.fleet.agentPolicies:
      - name: Fleet Server on ECK policy
        id: eck-fleet-server
        namespace: default
        monitoring_enabled:
          - logs
          - metrics
        unenroll_timeout: 900
        package_policies:
        - name: fleet_server-1
          id: fleet_server-1
          package:
            name: fleet_server
  podTemplate:
    spec:
      containers:
      - name: kibana
        env:
        - name: NODE_EXTRA_CA_CERTS
          value: /custom/user/ca-bundle.crt
        volumeMounts:
        - name: custom-ca
          mountPath: /custom/user
          readOnly: true
      volumes:
      - name: custom-ca
        secret:
          secretName: user-custom-ca-secret
---
apiVersion: v1
kind: Secret
metadata:
  name: user-custom-ca-secret
  namespace: default
type: Opaque
data:
  ca-bundle.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUZtVENDQTRHZ0F3SUJBZ0lVYjVrK2d6V3A5YjljWTV4bkhUcWZNdHFHUXIwd0RRWUpLb1pJaHZjTkFRRUwKQlFBd1hERUxNQWtHQTFVRUJoTUNXRmd4RlRBVEJnTlZCQWNNREVSbFptRjFiSFFnUTJsMGVURWNNQm9HQTFVRQpDZ3dUUkdWbVlYVnNkQ0JEYjIxd1lXNTVJRXgwWkRFWU1CWUdBMVVFQXd3UGRHVnpkQzVsYkdGemRHbGpMbU52Ck1CNFhEVEkxTURneU1ERTRNakl3T0ZvWERUTTFNRGd4T0RFNE1qSXdPRm93WERFTE1Ba0dBMVVFQmhNQ1dGZ3gKRlRBVEJnTlZCQWNNREVSbFptRjFiSFFnUTJsMGVURWNNQm9HQTFVRUNnd1RSR1ZtWVhWc2RDQkRiMjF3WVc1NQpJRXgwWkRFWU1CWUdBMVVFQXd3UGRHVnpkQzVsYkdGemRHbGpMbU52TUlJQ0lqQU5CZ2txaGtpRzl3MEJBUUVGCkFBT0NBZzhBTUlJQ0NnS0NBZ0VBMHljTGVySWR3LzdpbGlKMzVBUEZ4bUx6TFRnNWRhUStWSUttS2lNbStlTTYKanJOY3lnbGphNVFEbHYvMStGUm5hamhrRTBobHoycXEzTjk0U1pYN3M2eHBnQUVzMGVQQ3VaZVBNU2VUYlYyRgp0YlIxNnFuM0JjenVxN3laOXZwdHR3MmJRdkJkY3JzZFU4T2RYUWhGNFd4QUFwODRKYWlMNmkzMlA2K2VPODBwCmh3Z1kwS0F1bzZoZC8zaFpNME14M2MwRmJmU0JHaTUyOHZKODYzUDRXZlEwMWdtUUxVbGl0UlhhTUhiaDRXSm0KOU45c0psUXpnbkNuQjZ6YkZjZ2gweWxrakd0UzBIZEo3eSs3dmE0Q1BqdkxlWGpwTnZuQzRjTmlocnp4Wmw5bQphM0ZVdVpiU0lRekE2ZFlkdkdrT2V3OTJEek1BaTdldU14UDdyYVhRejZmc1N6U1V4N1RjQWl5M2E5VU9Fdi9rCk5NV3VTbDlUMHRRSkhJSzJMc0t0MlVKWVVHWk4wOWU2SUVSTlJOL0FIUjVDbTlhcVQ1Q2ZyQW9JVVhNdUg2S1oKN1JCZFFockRxL2xEQk54bWs5dW44V2lic0NSVnkvVXRJQ3lOSytxbGpGUWZEd01hNkRkd3BjcnpnTWZnU3RTawpLek1LRUJla2N0Q0Q4dHNmTjZYem5USmNBYUJETzFlQWZyT0Z2NG1PTXJqVG90OEYvK3pxN0dXNTlqWTRvdFhMCkY3TnpadFl0eWsvbDRvb2hUZUFuM1ptd1BDMGJFQ1FkTmpTVkZ6ZXJCamE4ZjhacGpKRzNjUllyVmh6YUNsRWMKRU5wbFRHcldVaUVwRDdnTnNlNWNDSnZpQU12NHdwait2QTVVNlA3Z0MxUUtKV2hWS3BVYWcvTmtTSUFCRmtrQwpBd0VBQWFOVE1GRXdIUVlEVlIwT0JCWUVGTWdldEVJajZtRWdsZURGNkVNdUY4NXVnYzdZTUI4R0ExVWRJd1FZCk1CYUFGTWdldEVJajZtRWdsZURGNkVNdUY4NXVnYzdZTUE4R0ExVWRFd0VCL3dRRk1BTUJBZjh3RFFZSktvWkkKaHZjTkFRRUxCUUFEZ2dJQkFEOFU3dm1yWmhHTUZiV2YzRDZlNy84TUwzWEhLRk5TNy9UeWF3U2tvdGVSTVdFbgp1RWhQK2dmbkdUT2ZITFlQeHl5eEJ4U041T29sZHRJclo5dnhBc2dlYWJzSkJaenhQVHpxU09VN3h3b09LcTlRCmdKRUYxL0ZmemFlR1V5dVE2S1ZaZ0QvZ1JPSW42Ri9OUGlzM1pvbUpPOStuVWdTTnNiUm9RYmdPUGdPV3Q3Z1gKVEhuOHJpdUp2OXRPNFBRN09Sa3pubDJYbERlcE9
xNVpwSUtkcVl0Rm5MUjF3SllyREZESmt0Q3h6MzFob0FrZwpSVjlSU1BSMFFxZ1JQeFNpNGpXdkNGUk5XTUFJc0NadGJsWExRRUljWGI1YnlsWXV2a3psTTJ4dHlHK3FaRFhMCnFoZDVNeFZIUkpqTzE1VEdpZXFRcUpMVkZyVElhTHFoaXZpQ1pUbDJoVkYxVlpPVG05MU5aeE53M25RL3JyeDgKK2VQV2xTWlZKWXc3SDRkWkx5WTFjRUxLT0YrZDJybVNSZ2pWaHZycUZ3R1M3MUQzYkV4Y0dSakNrOHNQWEZyRwpsOFRzY05RMXBPSGVuNlJhOFhVdGtxU1doZllFb3owZjBEem4wYmt4c2VWaCttS1BHV3QxcHdlemVFTFVwaHE3CmwwSVRLeis1b1lqYWVHTDRia25kcWlpemwzWkc2N0lYL3VyR0dQVUxkLzU1NEtRMFFPMS92S3Y2dE1YMWc0dVMKWHdWc0pzQjlrTUIwRFFxbDhRYmg0UEJ2ZW9RRTZvL3BycXRtWjR1RWdDMCt1cm5paDlCY1FweFNKOUljR1kxTQpBQzRBcG5Pem1CYTFhUVBMcDRaRFIxQXpFK1hXWDd2WWNWYUxleUJxRzRja3dwbUtOUnhpcnJjS2NaMkYKLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
---
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: fleet-server
spec:
  version: 9.1.2
  kibanaRef:
    name: kibana
  elasticsearchRefs:
  - name: elasticsearch
  mode: fleet
  fleetServerEnabled: true
  policyID: eck-fleet-server
  deployment:
    replicas: 1
    podTemplate:
      spec:
        serviceAccountName: fleet-server
        automountServiceAccountToken: true
        resources:
          requests:
            cpu: 200m
            memory: 1Gi
          limits:
            cpu: 1
            memory: 2Gi
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fleet-server
  namespace: default
rules:
- apiGroups: [""]
  resources:
  - pods
  - namespaces
  - nodes
  verbs:
  - get
  - watch
  - list
- apiGroups: ["apps"]
  resources:
    - replicasets
  verbs:
    - get
    - watch
    - list
- apiGroups: ["batch"]
  resources:
    - jobs
  verbs:
    - get
    - watch
    - list
- apiGroups: ["coordination.k8s.io"]
  resources:
  - leases
  verbs:
  - get
  - create
  - update
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fleet-server
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fleet-server
  namespace: default
subjects:
- kind: ServiceAccount
  name: fleet-server
  namespace: default
roleRef:
  kind: ClusterRole
  name: fleet-server
  apiGroup: rbac.authorization.k8s.io

@prodsecmachine
Collaborator

prodsecmachine commented Aug 20, 2025

Snyk checks have passed. No issues have been found so far.

Status Scanner Critical High Medium Low Total (0)
Open Source Security 0 0 0 0 0 issues
Licenses 0 0 0 0 0 issues

@github-actions

github-actions bot commented Aug 20, 2025

🔍 Preview links for changed docs

Member

@jsoriano jsoriano left a comment

Took a quick look from the side of the team maintaining Package Registry.

It looks great, thanks for adding support for package registry in ECK, this will help many users.

Added some comments, please let us know if you need a more in-depth review from our side.

AgentImage Image = "elastic-agent/elastic-agent"
MapsImage Image = "elastic-maps-service/elastic-maps-server"
LogstashImage Image = "logstash/logstash"
PackageRegistryImage Image = "package-registry/distribution"
Member

Is there an image used by default, or is setting the image required?

Member Author

Yes, the default image is package-registry/distribution:<version>, where the user specifies the version. Version is a required field, so if they do not specify one it will fail. Something I thought about was adding an epr_type field (or something like that) where the user could select among the EPR image variants we publish, such as 9.1.2, lite-9.1.2, production, and lite. However, I think it's just as easy to specify the image in the pod template if you want something other than package-registry/distribution:<version>.
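
The image resolution described in this reply can be sketched as below. This is only an illustration under stated assumptions: `resolveImage` is a hypothetical helper, not the operator's actual function, and the registry host and image name are taken from the PR description.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	// Both values come from the PR description; the constant names are illustrative.
	containerRegistry    = "docker.elastic.co"
	packageRegistryImage = "package-registry/distribution"
)

// resolveImage returns the image the EPR container would run.
// version mirrors spec.version (required); podTemplateImage mirrors an
// optional image set on the package-registry container in the pod template.
func resolveImage(version, podTemplateImage string) (string, error) {
	if version == "" {
		return "", errors.New("spec.version is required")
	}
	if podTemplateImage != "" {
		// A user-provided image takes precedence over the default.
		return podTemplateImage, nil
	}
	return fmt.Sprintf("%s/%s:%s", containerRegistry, packageRegistryImage, version), nil
}

func main() {
	def, _ := resolveImage("9.1.2", "")
	lite, _ := resolveImage("9.1.2", "docker.elastic.co/package-registry/distribution:lite-9.1.2")
	fmt.Println(def)
	fmt.Println(lite)
}
```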

Contributor

@barkbay barkbay Aug 26, 2025

While doing a first test I realized that the default image is around 5 GB 😅 (I was wondering why my container was not starting 🙃). I'm also wondering whether it would make sense to have a short flag to select the image "type".

Edit: I just had a Pod that failed to start with the following error on GKE:

  Warning  Failed               90s    kubelet            Failed to pull image "docker.elastic.co/package-registry/distribution:9.1.0": failed to pull and unpack image "docker.elastic.co/package-registry/distribution:9.1.0": failed to extract layer sha256:0f0888ef6ac576c67e3a9acf8ec7216533b7f3144aeb14c9b93d0db9469830cd: write /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/889/fs/packages/package-storage/security_detection_engine-9.0.9-beta.1.zip: no space left on device: unknown

I also had disk pressure conditions. I think the image size should be highlighted in the documentation so that K8s nodes can handle it.

Edit 2: This may also mean that we need to check the disk size on the nodes used for our e2e tests.
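
A short image "type" flag like the one floated here could map to an image tag roughly as follows. Sketch only: the `variant` parameter, the `imageTag` helper, and the value name "full" are hypothetical and not part of the PR; the tag shapes (`<version>` and `lite-<version>`) follow the tags mentioned in this thread.

```go
package main

import "fmt"

// imageTag maps a hypothetical variant flag to a distribution image tag.
func imageTag(version, variant string) (string, error) {
	switch variant {
	case "", "full":
		// No variant selected: use the full distribution for this version.
		return version, nil
	case "lite":
		return "lite-" + version, nil
	default:
		return "", fmt.Errorf("unknown image variant %q", variant)
	}
}

func main() {
	for _, variant := range []string{"", "lite", "tiny"} {
		tag, err := imageTag("9.1.2", variant)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("docker.elastic.co/package-registry/distribution:" + tag)
	}
}
```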

Member Author

@jsoriano do you think we should change the default image to be something smaller? I selected this image because it is what we recommend.

Member

Yeah, the lite images were added as smaller images for this kind of use case, but even this image is starting to be too big.

There are vanilla images that don't contain any package. They fail to start if no directory with packages is configured, but they can also be configured in proxy mode, to forward requests for example to the public EPR.

The latest of these images is docker.elastic.co/package-registry/package-registry:v1.31.1.

There is an open issue about allowing the registry to start even if no package is available yet: elastic/package-registry#1351.

More about proxy mode: https://github.com/elastic/package-registry/?tab=readme-ov-file#proxy-mode.

We also have work in progress to create custom distributions and smaller images: elastic/package-registry#1335.

@pebrc pebrc requested a review from Copilot August 25, 2025 15:17

This comment was marked as outdated.

@naemono naemono requested a review from Copilot August 25, 2025 18:39
@naemono naemono added >enhancement Enhancement of existing functionality discuss We need to figure this out labels Aug 25, 2025
@botelastic botelastic bot removed the triage label Aug 25, 2025
Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds Elastic Package Registry (EPR) support to ECK, introducing a new CRD for deploying EPR instances and enabling Kibana to reference EPR instances for Fleet package management.

  • Adds ElasticPackageRegistry CRD with controller to manage EPR deployments
  • Enables Kibana to associate with EPR instances via packageRegistryRef field
  • Implements TLS certificate handling and CA mounting for secure communication between Kibana and EPR

Reviewed Changes

Copilot reviewed 60 out of 61 changed files in this pull request and generated 3 comments.

File | Description
pkg/apis/epr/v1alpha1/ | New API definitions for ElasticPackageRegistry CRD
pkg/controller/packageregistry/ | Controller implementation for managing EPR resources
pkg/controller/association/controller/kibana_epr.go | Association controller for Kibana-EPR relationships
pkg/apis/kibana/v1/kibana_types.go | Adds packageRegistryRef field and EPR association support
pkg/controller/kibana/ | Updates Kibana controller to handle EPR associations and CA certificates
test/e2e/ | E2E tests for EPR functionality and associations
Comments suppressed due to low confidence (1)

pkg/controller/kibana/pod_test.go:1

  • The comment on line 67 says 'readinessProbe is the readiness probe for the maps container' but this function is in the packageregistry controller and should refer to the package registry container.
// Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one

@barkbay barkbay added >feature Adds or discusses adding a feature to the product and removed >enhancement Enhancement of existing functionality labels Aug 26, 2025
@naemono
Contributor

naemono commented Nov 5, 2025

Main blocker to merge this imo is the lack of UBI images for the package registry.

This blocker has been addressed in elastic/package-registry#1451, which is now merged.

@jsoriano
Member

jsoriano commented Nov 5, 2025

In this PR we are using the Package Registry distribution images. To support UBI there we would also need to update https://github.com/elastic/package-storage-infra/blob/13bf4e9ba03c028b16ed37772cd0d1afaa45af4f/.buildkite/scripts/build_distributions.sh.

Contributor

@naemono naemono left a comment

My understanding is that we will take ownership of this PR and move it towards a merging state. Review notes are primarily for our own purposes.

  name: registry
spec:
  version: 9.1.2
  count: 1
Contributor

Should we link to the lite container image here instead?

EPRContainerName = "package-registry"
// Kind is inferred from the struct name using reflection in SchemeBuilder.Register()
// we duplicate it as a constant here for practical purposes.
Kind = "ElasticPackageRegistry"
Contributor

#8905 is in place to track these additions.

func (m *ElasticPackageRegistry) GetIdentityLabels() map[string]string {
	return map[string]string{
		commonv1.TypeLabelName: "epr",
		"packageregistry.k8s.elastic.co/name": m.Name,
Contributor

Why is the groupVersion epr.k8s.elastic.co, but the label is packageregistry.k8s.elastic.co?

)

const (
	HTTPPort = 8080
Contributor

Note to self: Can specifying the spec.http settings cause issues because this is const?

Remove credentials label from secret.
review comments.

Signed-off-by: Michael Montgomery <[email protected]>
@naemono
Contributor

naemono commented Nov 13, 2025

The beginnings of this needed PR are here @jsoriano

Signed-off-by: Michael Montgomery <[email protected]>
Signed-off-by: Michael Montgomery <[email protected]>
Signed-off-by: Michael Montgomery <[email protected]>
@naemono
Contributor

naemono commented Nov 20, 2025

Definitely seeing some issues when testing on OpenShift, but they don't seem to be OCP-specific:

{"log.level":"error","@timestamp":"2025-11-20T17:44:02.025Z","log.logger":"manager.eck-operator","message":"Reconciler error","service.version":"9.3.0-SNAPSHOT+","service.type":"eck","ecs.version":"1.4.0","controller":"packageregistry-controller","object":{"name":"registry","namespace":"elastic"},"namespace":"elastic","name":"registry","reconcileID":"8a30e396-91e2-4a86-a9b3-79368a4032a6","error":"services \"registry-epr-http\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>","errorCauses":[{"error":"services \"registry-epr-http\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>"}],"error.stack_trace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\t/root/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:474\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\t/root/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:421\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func1.1\n\t/root/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:296"}

@naemono
Contributor

naemono commented Nov 20, 2025

buildkite test this -f p=gke,E2E_TAGS=epr

@naemono
Contributor

naemono commented Nov 20, 2025

Following up on the reconciler error reported earlier ("cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on"): nope, it is OCP-specific. Fixed in 49b1e56 (the packageregistries/finalizers RBAC permissions were missing).
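
To illustrate why the missing permission matters: Kubernetes RBAC treats a subresource such as packageregistries/finalizers as a separate resource entry, and the admission check behind this error looks for update permission on it. The sketch below uses a simplified stand-in for the RBAC PolicyRule type rather than the real k8s.io/api types. The rule contents are illustrative and are not copied from commit 49b1e56; the API group comes from the manifest in the PR description.

```go
package main

import "fmt"

// PolicyRule is a simplified stand-in for the Kubernetes RBAC PolicyRule type.
type PolicyRule struct {
	APIGroups []string
	Resources []string
	Verbs     []string
}

// contains reports whether list holds want or the "*" wildcard.
// Simplification: other wildcard forms are not handled here.
func contains(list []string, want string) bool {
	for _, item := range list {
		if item == want || item == "*" {
			return true
		}
	}
	return false
}

// allows reports whether any rule grants verb on resource in group.
func allows(rules []PolicyRule, group, resource, verb string) bool {
	for _, r := range rules {
		if contains(r.APIGroups, group) && contains(r.Resources, resource) && contains(r.Verbs, verb) {
			return true
		}
	}
	return false
}

func main() {
	const group = "epr.k8s.elastic.co"

	// Illustrative rule set without the finalizers subresource.
	before := []PolicyRule{{
		APIGroups: []string{group},
		Resources: []string{"packageregistries", "packageregistries/status"},
		Verbs:     []string{"get", "list", "watch", "create", "update", "patch"},
	}}

	// Same rules plus an explicit entry for the finalizers subresource.
	after := append([]PolicyRule{}, before...)
	after = append(after, PolicyRule{
		APIGroups: []string{group},
		Resources: []string{"packageregistries/finalizers"},
		Verbs:     []string{"update"},
	})

	fmt.Println(allows(before, group, "packageregistries/finalizers", "update")) // prints false
	fmt.Println(allows(after, group, "packageregistries/finalizers", "update"))  // prints true
}
```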

@naemono
Contributor

naemono commented Nov 20, 2025

And more fun on ocp:

{"log.level":"info","@timestamp":"2025-11-20T20:49:24.212Z","log.logger":"manager.eck-operator","message":"would violate PodSecurity \"restricted:latest\": runAsNonRoot != true (pod or container \"package-registry\" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container \"package-registry\" must set securityContext.seccompProfile.type to \"RuntimeDefault\" or \"Localhost\")","service.version":"3.3.0-rc1-SNAPSHOT+","service.type":"eck","ecs.version":"1.4.0","controller":"packageregistry-controller","object":{"name":"registry","namespace":"elastic"},"namespace":"elastic","name":"registry","reconcileID":"2521d32a-cdc5-4c36-ba90-64011f78d67b"}
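
The two conditions named in this warning can be expressed as a small check. This is a sketch with a simplified stand-in type, not the Kubernetes PodSecurity admission code, and it covers only the two fields the log message mentions; the restricted profile enforces more than this.

```go
package main

import "fmt"

// SecurityContext is a simplified stand-in for the securityContext fields
// named in the log message above.
type SecurityContext struct {
	RunAsNonRoot       *bool
	SeccompProfileType string
}

// restrictedViolations lists which of the two logged conditions are not met.
func restrictedViolations(sc SecurityContext) []string {
	var violations []string
	if sc.RunAsNonRoot == nil || !*sc.RunAsNonRoot {
		violations = append(violations, "runAsNonRoot != true")
	}
	if sc.SeccompProfileType != "RuntimeDefault" && sc.SeccompProfileType != "Localhost" {
		violations = append(violations, "seccompProfile")
	}
	return violations
}

func main() {
	// An empty security context trips both checks, as in the log.
	fmt.Println(restrictedViolations(SecurityContext{}))

	// Setting both fields clears the two reported violations.
	nonRoot := true
	fixed := SecurityContext{RunAsNonRoot: &nonRoot, SeccompProfileType: "RuntimeDefault"}
	fmt.Println(len(restrictedViolations(fixed)))
}
```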

Labels

discuss We need to figure this out >feature Adds or discusses adding a feature to the product release-highlight Candidate for the ECK release highlight summary

7 participants