The kube-flannel pod on the worker node stays in CrashLoopBackOff status. #2069

Open
lengcangche-gituhub opened this issue Sep 28, 2024 · 5 comments


lengcangche-gituhub commented Sep 28, 2024

I used kubeadm join to add a worker node. However, the flannel pod on the worker node stays in CrashLoopBackOff status.

Expected Behavior

The flannel pod becomes Running

Current Behavior

The flannel pod on the worker node stays in CrashLoopBackOff status

Steps to Reproduce (for bugs)

1. On the master node:

root@NPU-Atlas-2:/home/lincom# kubeadm init --pod-network-cidr=100.100.0.0/16 --image-repository=registry.aliyuncs.com/google_containers --apiserver-advertise-address=192.168.1.122
root@NPU-Atlas-2:/home/lincom# mkdir -p $HOME/.kube
root@NPU-Atlas-2:/home/lincom# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
root@NPU-Atlas-2:/home/lincom# sudo chown $(id -u):$(id -g) $HOME/.kube/config
root@NPU-Atlas-2:/home/lincom# kubectl apply -f kube-flannel.yml

2. On the worker node:

kubeadm join 192.168.1.122:6443 --token 2ydxw7.y64x3rl3d2g4fsxh --discovery-token-ca-cert-hash sha256:9e3a2259e1c0d2a3bf0abcd6e344c5f65c7324cb58900d251f2305d4d16e7273

3. On the master node:

root@NPU-Atlas-2:/home/lincom# kubectl get pods --all-namespaces
NAMESPACE      NAME                                   READY   STATUS              RESTARTS        AGE
default        kubernetes-bootcamp-666cf565fc-97sbb   0/1     ContainerCreating   0               4m39s
kube-flannel   kube-flannel-ds-2tszk                  0/1     CrashLoopBackOff    5 (2m44s ago)   5m46s
kube-flannel   kube-flannel-ds-mkhst                  1/1     Running             0               9m54s
...................
root@NPU-Atlas-2:/home/lincom# kubectl describe pods/kube-flannel-ds-2tszk -n kube-flannel
.............................
Events:
  Type     Reason   Age                    From     Message

  Warning  BackOff  32m (x416 over 122m)   kubelet  Back-off restarting failed container kube-flannel in pod kube-flannel-ds-2tszk_kube-flannel(b97155bd-b848-4272-88d4-0e5fa2f89706)
  Normal   Pulled   28m                    kubelet  Container image "swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/flannel/flannel-cni-plugin:v1.4.1-flannel1-linuxarm64" already present on machine
  Normal   Created  28m                    kubelet  Created container install-cni-plugin
  Normal   Started  28m                    kubelet  Started container install-cni-plugin
  Normal   Pulled   28m                    kubelet  Container image "swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/flannel/flannel:v0.25.1-linuxarm64" already present on machine
  Normal   Created  28m                    kubelet  Created container install-cni
  Normal   Started  28m                    kubelet  Started container install-cni
  Normal   Pulled   27m (x4 over 28m)      kubelet  Container image "swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/flannel/flannel:v0.25.1-linuxarm64" already present on machine
  Normal   Created  27m (x4 over 28m)      kubelet  Created container kube-flannel
  Normal   Started  27m (x4 over 28m)      kubelet  Started container kube-flannel
  Warning  BackOff  3m31s (x115 over 28m)  kubelet  Back-off restarting failed container kube-flannel in pod kube-flannel-ds-2tszk_kube-flannel(b97155bd-b848-4272-88d4-0e5fa2f89706)

root@NPU-Atlas-2:/home/lincom# kubectl -n kube-flannel logs kube-flannel-ds-2tszk
Defaulted container "kube-flannel" out of: kube-flannel, install-cni-plugin (init), install-cni (init)
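
The output above only shows the "Defaulted container" notice; because the pod is in a restart loop, the previous attempt's logs and the init containers' logs are usually still retrievable (a sketch, using the pod and container names from this cluster):

kubectl -n kube-flannel logs kube-flannel-ds-2tszk -c kube-flannel --previous   # last crashed attempt
kubectl -n kube-flannel logs kube-flannel-ds-2tszk -c install-cni-plugin        # init container
kubectl -n kube-flannel logs kube-flannel-ds-2tszk -c install-cni               # init container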

Context

I need a working pod network.

Your Environment

@rbrtbnfgl
Contributor

I see that you are using ARM64. Is it only the failing node that is on ARM, or both?

@flucas1

flucas1 commented Oct 15, 2024

I had to use this during cluster initialization:

---
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
controllerManager:
  extraArgs:
    - name: allocate-node-cidrs
      value: "true"
    - name: cluster-cidr
      value: "10.244.0.0/16"
    - name: node-cidr-mask-size
      value: "24"
kubernetesVersion: stable
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
scheduler: {}

kubeadm init --v=6 --config=<config.yaml>
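
For reference, a quick way to confirm that the controller manager is actually handing out per-node pod CIDRs (which is what allocate-node-cidrs controls, and what flannel's default kube-subnet-mgr mode relies on) is to look at the node objects; a minimal check:

kubectl get nodes -o custom-columns=NAME:.metadata.name,PODCIDR:.spec.podCIDR   # a node showing <none> has no pod CIDR assigned

If the failing worker shows no podCIDR, flannel typically exits complaining that the node has no pod CIDR assigned.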

@lengcangche-gituhub
Author

> I see that you are using ARM64. Is it only the failing node that is on ARM, or both?

Both nodes are ARM64.
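
Since the events above reference the -linuxarm64 image tags, it may also be worth confirming what architecture each node reports (a sketch):

kubectl get nodes -o custom-columns=NAME:.metadata.name,ARCH:.status.nodeInfo.architecture,OS:.status.nodeInfo.osImage   # both should report arm64 here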

@lengcangche-gituhub
Author

> I had to use this during cluster initialization:
>
> ---
> apiVersion: kubeadm.k8s.io/v1beta4
> kind: ClusterConfiguration
> controllerManager:
>   extraArgs:
>     - name: allocate-node-cidrs
>       value: "true"
>     - name: cluster-cidr
>       value: "10.244.0.0/16"
>     - name: node-cidr-mask-size
>       value: "24"
> kubernetesVersion: stable
> networking:
>   dnsDomain: cluster.local
>   podSubnet: 10.244.0.0/16
> scheduler: {}
>
> kubeadm init --v=6 --config=<config.yaml>

I think your configuration is the same as mine.
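
One value worth double-checking: the kubeadm init at the top of this issue used --pod-network-cidr=100.100.0.0/16, while the stock kube-flannel.yml (like the quoted config) defaults to 10.244.0.0/16, and flannel's Network has to match the cluster's pod CIDR. A sketch for comparing what the cluster actually ended up with (the ConfigMap names are the defaults created by kubeadm and by the flannel manifest):

kubectl -n kube-system get cm kubeadm-config -o yaml | grep -i podSubnet             # CIDR kubeadm recorded
kubectl -n kube-flannel get cm kube-flannel-cfg -o jsonpath='{.data.net-conf\.json}'  # CIDR flannel was deployed with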

@rbrtbnfgl
Contributor

Are you sure that you don't have any logs from the failing pod?
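
If kubectl keeps coming back empty, the container runtime on the worker usually still has the exited container and its output; a sketch, assuming a CRI runtime such as containerd so that crictl is available on the worker node:

crictl ps -a | grep flannel                                   # list containers, including exited ones
crictl logs <container-id>                                    # output of the exited kube-flannel container (ID from the line above)
journalctl -u kubelet --since "1 hour ago" | grep -i flannel  # kubelet's view of the restart loop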
