This repository was archived by the owner on Sep 7, 2023. It is now read-only.

Consul image can not be started on Kubernetes/Openshift without mounted volume #184

@fedinskiy

Description


Overview of the Issue

When the official Consul Docker image is started on Kubernetes without a mounted volume, it fails with either a su-exec: setgroups(1000): Operation not permitted error or a failed to write NodeID to disk error.

Reproduction Steps

The steps below are for OpenShift; the steps for Kubernetes should be similar:

  1. Log in to OpenShift
  2. Create a new project: oc new-project ts-consul
  3. Create a file consul.yml with the following content:
---
apiVersion: "v1"
kind: "List"
items:
- apiVersion: "v1"
  kind: "Service"
  metadata:
    labels:
      scenarioId: "OpenShiftConsulConfigSourceIT-1651150347701"
    name: "consul"
    namespace: "ts-consul"
  spec:
    ports:
    - name: "http"
      port: 8500
      targetPort: 8500
    selector:
      deploymentconfig: "consul"
    type: "ClusterIP"
- apiVersion: "apps.openshift.io/v1"
  kind: "DeploymentConfig"
  metadata:
    labels:
      scenarioId: "OpenShiftConsulConfigSourceIT-1651150347701"
    name: "consul"
    namespace: "ts-consul"
  spec:
    replicas: 1
    selector:
      deploymentconfig: "consul"
    template:
      metadata:
        labels:
          deploymentconfig: "consul"
          tsLogWatch: "consul"
          scenarioId: "OpenShiftConsulConfigSourceIT-1651150347701"
        namespace: "ts-consul"
      spec:
        containers:
        - image: "consul:1.11"
# Uncomment these lines for a different error:
#          env:
#          - name: "CONSUL_DISABLE_PERM_MGMT"
#            value: "yes"
          imagePullPolicy: "IfNotPresent"
          name: "consul"
          ports:
          - containerPort: 8500
            name: "http"
            protocol: "TCP"
    triggers:
    - type: "ConfigChange"
  4. Deploy the container: oc apply -f consul.yml -n ts-consul
  5. Start the container: oc scale dc/consul --replicas=1 -n ts-consul
  6. Wait for several seconds and check the status: oc status -n ts-consul
Errors:
  * pod/consul-1-8lmhh is crash-looping
  7. Check the pod logs: oc logs pod/consul-1-8lmhh (replace with the ID of your pod)
su-exec: setgroups(1000): Operation not permitted

Alternative solution

We can follow the solution implemented in #103 and set the CONSUL_DISABLE_PERM_MGMT environment variable. Unfortunately, this just leads to a different error:

 failed to setup node ID: failed to write NodeID to disk: open /consul/data/node-id: permission denied

Operating system and Environment details

OS: Linux 5.16.20-200.fc35.x86_64
OpenShift:

# Client
oc v3.11.420
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

# Server
kubernetes v1.23.5+9ce5071

Additional info

Similar errors have been described several times before: [1] (the suggested solution is to use a custom Docker image), [2] (added the CONSUL_DISABLE_PERM_MGMT environment variable, which does not help in this case; see the "Alternative solution" section), and [3] (the recommended solution is to check "mount parameters"). However, the current solution requires mounting a volume, which is overkill in some cases (e.g. training or integration testing). Using the bitnami/consul image can be considered a workaround, but it comes with its own challenges [4], so it would be preferable to have this issue solved for the official image.

[1] hashicorp/consul#4172
[2] #103
[3] hashicorp/consul#10403
[4] bitnami-labs/sealed-secrets#822
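
For reference, the volume-based approach mentioned above might look roughly like the fragment below. This is an untested sketch, not a confirmed fix. It assumes that an emptyDir volume mounted at /consul/data (the directory named in the node-id error) is writable by the user the cluster assigns to the container, and that CONSUL_DISABLE_PERM_MGMT is still needed to get past the su-exec error, as described in the "Alternative solution" section. The volume name consul-data is made up for illustration.

```yaml
# Fragment of the pod template (spec.template.spec) from consul.yml.
# Untested sketch: the volume name "consul-data" is hypothetical.
containers:
- image: "consul:1.11"
  name: "consul"
  imagePullPolicy: "IfNotPresent"
  env:
  # Per the report above, setting this variable replaces the su-exec error
  # with the "failed to write NodeID to disk" error.
  - name: "CONSUL_DISABLE_PERM_MGMT"
    value: "yes"
  ports:
  - containerPort: 8500
    name: "http"
    protocol: "TCP"
  volumeMounts:
  # Mount a scratch volume over the data directory Consul writes node-id to.
  - name: "consul-data"
    mountPath: "/consul/data"
volumes:
# emptyDir is ephemeral: its contents are removed when the pod is deleted.
- name: "consul-data"
  emptyDir: {}
```

Because emptyDir does not persist data across pod deletion, this would only suit the throwaway scenarios mentioned above (training or integration testing), and it still requires a volume definition, which is the overhead this issue asks to avoid.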
