[EKS][auto mode] Workloads fail due to inability to locate kube-dns service #2546
Comments
@truongnht hi, thanks for the info. As a workaround you can create a dummy kube-dns service in your cluster to unblock.
Hello, I'm facing the same issue. The Loki gateway seems to be basically an nginx container with a specific config; in order to resolve DNS for proxy passes within Kubernetes, it sets the nginx resolver to the in-cluster kube-dns service name.
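A minimal sketch of what that resolver line looks like and how you could find it. The namespace and deployment name in the kubectl command are assumptions, not taken from this thread; the parsing below runs locally on a sample config line:

```shell
# In a real cluster you would inspect the rendered nginx config, e.g.
# (namespace and deployment name are assumptions):
#   kubectl -n monitoring exec deploy/loki-gateway -- grep -R resolver /etc/nginx/
# Parsing a sample resolver line to extract the DNS name it depends on:
conf='resolver kube-dns.kube-system.svc.cluster.local;'
printf '%s\n' "$conf" | grep -o 'kube-dns[^ ;]*'
```

If that name has no backing Service (as in EKS auto mode without the coredns addon), every proxy_pass lookup fails.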
Not familiar with the loki repo, but this seems very similar. I've tried your suggestion by applying a dummy kube-dns service (basically a copy-paste of what CoreDNS creates):

```yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
  labels:
    eks.amazonaws.com/component: kube-dns
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: CoreDNS
  name: kube-dns
  namespace: kube-system
spec:
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: dns
    port: 53
    protocol: UDP
    targetPort: 53
  - name: dns-tcp
    port: 53
    protocol: TCP
    targetPort: 53
  - name: metrics
    port: 9153
    protocol: TCP
    targetPort: 9153
  selector:
    k8s-app: kube-dns
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
```

It actually does get an IP, but now I get the following error:
So my assumption is that the app selector is incorrect, but maybe it's something else. @oliviassss any idea what I should adjust or try here?
@oliviassss I believe the info provided by @wimspaargaren is pretty complete. To get past that issue, I have enabled the coredns addon.
Thanks @truongnht, will do that as well then to work around the issue for now.
@wimspaargaren, @truongnht, thanks for the details. We will reproduce this error internally. @wimspaargaren, I think the issue with the dummy kube-dns service is that it is not assigned the correct cluster IP for DNS resolution. In auto mode, we have the coredns server listening on the cluster DNS IP on port 53. If the dummy service is not created with that same IP, DNS resolution will fail. @truongnht, once you re-install the coredns addon, if you then delete the deployment (so the coredns pods are gone and only the kube-dns service remains), does the loki-gateway still work?
@oliviassss We ran into the same issue as @wimspaargaren described above. When I created a dummy "kube-dns" service explicitly with the ClusterIP set to the cluster DNS IP (I grabbed the IP from another pod's /etc/resolv.conf), it worked.
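A sketch of that workaround under the assumptions above: grab the cluster DNS IP from any pod's resolv.conf, then use it as `spec.clusterIP` in the dummy kube-dns manifest. The kubectl commands are illustrative; the IP `10.100.0.10` below is a placeholder, and the parsing runs on a sample file:

```shell
# In a real cluster you would read resolv.conf from any running pod, e.g.:
#   kubectl exec <any-pod> -- cat /etc/resolv.conf
# Parsing a sample resolv.conf (the nameserver IP is a placeholder):
resolv_conf='search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.100.0.10
options ndots:5'
dns_ip=$(printf '%s\n' "$resolv_conf" | awk '/^nameserver/ {print $2; exit}')
echo "$dns_ip"
# clusterIP is immutable on an existing Service, so set spec.clusterIP to
# this value in the dummy kube-dns manifest and (re)create the Service.
```

Without the matching `clusterIP`, Kubernetes assigns an arbitrary service IP that nothing is listening on, which matches the failure described earlier in the thread.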
Hi all, we are rolling out a fix for this issue in EKS auto mode this week: we map the kube-dns FQDN to the cluster DNS IP in the hosts file, so you no longer need to create a service if you only need to resolve the FQDN.
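A sketch of how you could verify that mapping from inside a pod once the fix lands. The kubectl command and the IP are assumptions; the parsing below runs on a sample /etc/hosts entry of the kind described:

```shell
# In a real cluster you would check the pod's hosts file, e.g.:
#   kubectl exec <any-pod> -- cat /etc/hosts
# Parsing a sample entry mapping the kube-dns FQDN (IP is hypothetical):
hosts='127.0.0.1 localhost
10.100.0.10 kube-dns.kube-system.svc.cluster.local kube-dns.kube-system'
printf '%s\n' "$hosts" | awk '$2 == "kube-dns.kube-system.svc.cluster.local" {print $1}'
```

If the FQDN appears in /etc/hosts with the cluster DNS IP, resolvers such as nginx can look it up without a kube-dns Service existing.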
@oliviassss may I know what else is fixed for EKS auto mode? Interested to learn more about the releases.
|
@oliviassss thanks, do you have that information published somewhere?
The fix described above has been fully deployed; closing.
In a typical system we deploy, coredns (kube-dns) is installed in the kube-system namespace, and our observability-related services such as Loki (loki-gateway) and Tempo (tempo-gateway) depend on it for resolving their related services. With auto mode enabled, we do not install the coredns addon and therefore run into service name resolution issues.