|
| 1 | +--- |
| 2 | +title: "Use Talos OS as the Preferred Operating System for Kubernetes Operations" |
| 3 | +date: "2025-02-25" |
| 4 | +--- |
| 5 | + |
| 6 | + |
| 7 | +| status: | date: | decision-makers: | |
| 8 | +| --- | --- | --- | |
| 9 | +| proposed | 2025-02-25 | Sofus Albertsen | |
| 10 | + |
| 11 | + |
| 12 | +## Context and Problem Statement |
| 13 | + |
| 14 | +Choosing the right operating system for your Kubernetes cluster is crucial for stability, security, and operational efficiency. The OS should be optimized for container workloads, minimize overhead, and integrate well with Infrastructure as Code (IaC) practices. |
| 15 | +## Considered Options |
| 16 | + |
| 17 | +* Talos OS |
| 18 | +* Red Hat OpenShift |
| 19 | +* SUSE Rancher (RancherOS/RKE) |
| 20 | + |
| 21 | +## Decision Outcome |
| 22 | + |
| 23 | +Chosen option: **Talos OS**, because its minimal footprint, API-driven configuration, and singular focus on Kubernetes make it ideal for automated infrastructure management and reduce operational overhead. |
| 24 | + |
| 25 | +Talos OS's immutable architecture and security-focused design further enhance its suitability for Kubernetes deployments, giving you a minimal attack surface at the OS level. For example, the OS ships without a shell, so shell scripts cannot be executed on a node. |
| 26 | + |
| 27 | +OpenShift and Rancher were considered, but their comprehensive feature sets, while beneficial in some scenarios, introduce increased complexity and overhead. |
| 28 | + |
| 29 | +While their dashboards can simplify initial setup, they can also encourage "click-ops" and a drift away from IaC best practices. These platforms might be suitable if existing Red Hat or SUSE expertise is a primary driver, but because they sit on top of fully fledged operating systems, they introduce more operational overhead than Talos. |
| 30 | + |
| 31 | +### Consequences |
| 32 | + |
| 33 | +* **Good:** Talos OS's minimal package selection gives it a smaller attack surface. |
| 34 | +* **Good:** The API-driven configuration of Talos OS allows for seamless integration with IaC tools like Terraform, enabling fully automated cluster provisioning and management. |
| 35 | +* **Good:** The immutable infrastructure of Talos OS simplifies updates and adds resiliency because of its dual boot bank setup. |
| 36 | +* **Good:** The "two package" approach simplifies maintenance (day 2 operations) and reduces the likelihood of OS-related issues, as all known package combinations can be tested by the vendor. |
| 37 | + |
| 38 | +* **Bad:** The learning curve for Talos OS might be steeper initially for teams unfamiliar with its API-driven approach. |
| 39 | +* **Bad:** The lack of a graphical user interface might be a drawback for some users accustomed to traditional OS management. |
| 40 | +* **Bad:** Talos is a relatively new project compared to OpenShift or Rancher, so its community and the available resources might be smaller. |
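
The resiliency claim for the dual boot bank setup can be illustrated with a toy model. The sketch below is not Talos's implementation: the `Node` class, its `upgrade` method, the bank names, and the version strings are all hypothetical, and the `boots_ok` flag stands in for whatever health check decides whether a new image is usable. It only shows the general idea that an update is written to the inactive bank and the node switches over only if the new image boots.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy model of a node with two boot banks (hypothetical, for illustration only)."""
    banks: dict = field(default_factory=lambda: {"A": "v1.0", "B": None})
    active: str = "A"

    def inactive(self) -> str:
        # The bank that is not currently booted.
        return "B" if self.active == "A" else "A"

    def upgrade(self, new_version: str, boots_ok: bool) -> str:
        """Write the new image to the inactive bank; switch only if it boots."""
        target = self.inactive()
        self.banks[target] = new_version  # the running bank is never modified
        if boots_ok:
            self.active = target  # commit: the new image becomes the active one
        # On failure nothing changes: the node keeps booting the previous bank.
        return self.banks[self.active]


node = Node()
print(node.upgrade("v1.1", boots_ok=False))  # v1.0 (failed boot, old image kept)
print(node.upgrade("v1.1", boots_ok=True))   # v1.1 (healthy boot, banks switched)
```

Because the previous image stays untouched in the other bank, a bad update costs a reboot rather than a rebuild, which is the property the consequence list refers to.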