added content to decision matrix for getting your software ready #24

Merged
merged 3 commits into from
Mar 29, 2025
22 changes: 8 additions & 14 deletions docs/hardware_ready/ADRs/Longhorn_as_storage_solution.md
@@ -8,28 +8,27 @@ date: "2025-03-18"
| --- | --- | --- |
| approved | 2025-03-18 | Sofus Albertsen |


## Context and Problem Statement

### Do I need it?

Even though Kubernetes (and its Container Storage Interface) is production-ready for persistence and stateful workloads, keeping your cluster stateless has several advantages:

* Dead simple disaster recovery (and duplication) of your cluster: everything is defined as code, so (re)creating a cluster is as easy as running your setup scripts once more and waiting for everything to come up.
* Backup and restore have never been simple, and Kubernetes does not solve this for you; with a stateless cluster there is nothing inside it to back up.

At its core, we want stateless [cloud native applications](https://kodekloud.com/blog/cloud-native-principles-explained/).
Remember the distinction between the need for persistence and the need for ephemeral storage: your caching service needs ephemeral storage, but does not need backup/restore of that data. Such workloads are a perfect fit for Kubernetes.

Often your database sits on special hardware and is looked after by specialists.
Keep it that way, and connect to the database from your cluster.
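
As a minimal sketch of what "connect to the database from your cluster" can look like from the application side, the snippet below reads the external database location from an environment variable instead of assuming an in-cluster volume. The variable name `DATABASE_URL`, the `DatabaseTarget` type, and the default port 5432 (PostgreSQL) are assumptions made for illustration and are not part of this ADR.

```python
import os
from dataclasses import dataclass
from typing import Mapping
from urllib.parse import urlsplit


@dataclass(frozen=True)
class DatabaseTarget:
    """Where the externally hosted database lives (hypothetical helper type)."""
    host: str
    port: int
    name: str


def database_target_from_env(
    env: Mapping[str, str] = os.environ,
    var: str = "DATABASE_URL",  # assumed variable name, not defined by the ADR
) -> DatabaseTarget:
    """Parse the database location from an environment variable.

    The workload keeps no database state of its own; it only needs to know
    where the external database is reachable.
    """
    url = env.get(var)
    if not url:
        raise RuntimeError(f"{var} is not set")
    parts = urlsplit(url)
    if not parts.hostname:
        raise ValueError(f"{var} does not contain a host")
    # Fall back to 5432 (PostgreSQL's default port) purely as an example.
    return DatabaseTarget(
        host=parts.hostname,
        port=parts.port or 5432,
        name=parts.path.lstrip("/"),
    )
```

In a cluster, the variable would typically be injected through the pod specification, so the same image can point at different databases per environment.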

### What are the criteria for choosing this?

* **Performance Requirements:** What are the expected Input/Output Operations Per Second (IOPS) that your applications will demand? What about throughput?
* **Scalability Needs:** Are you able to add storage seamlessly after the initial creation?
* **Data Availability and Durability:** What are your requirements for replicas, time to recover, etc.?
* **Team Expertise and Comfort Level:** What is your team's existing knowledge and experience with the specific storage solutions you are considering?
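
The performance criterion asks about both IOPS and throughput. The two are linked by the I/O block size (throughput = IOPS × block size), so a rough sketch like the one below can help translate one requirement into the other. The function names are hypothetical, and the calculation ignores caching, queue depth, and replication overhead.

```python
def throughput_mib_per_s(iops: float, block_size_kib: float) -> float:
    """Throughput in MiB/s for a given IOPS figure and block size in KiB."""
    if iops < 0 or block_size_kib <= 0:
        raise ValueError("iops must be >= 0 and block_size_kib must be > 0")
    return iops * block_size_kib / 1024


def required_iops(throughput_mib_s: float, block_size_kib: float) -> float:
    """IOPS needed to sustain a given throughput at a given block size."""
    if throughput_mib_s < 0 or block_size_kib <= 0:
        raise ValueError("throughput must be >= 0 and block_size_kib must be > 0")
    return throughput_mib_s * 1024 / block_size_kib
```

For example, 5000 IOPS at a 4 KiB block size corresponds to roughly 19.5 MiB/s, while sustaining 100 MiB/s at 16 KiB blocks requires 6400 IOPS.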

### What are our weights for making a choice?

@@ -39,7 +38,6 @@ Therefore, the primary weight is; if you have a storage solution that already su

Secondarily, we focus on the complexity that each option introduces.
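
One way to make criteria and weights like these explicit is a small weighted decision matrix. The sketch below is illustrative only: the weight values, the 1 to 5 scoring scale, and the options `option-a` and `option-b` are placeholders invented for the example, not the values or candidates evaluated in this ADR.

```python
from typing import Dict, List

# Hypothetical weights, chosen only for illustration; they sum to 1.0.
# "low_complexity" is weighted highest to mirror the emphasis on complexity.
WEIGHTS: Dict[str, float] = {
    "performance": 0.2,
    "scalability": 0.2,
    "availability": 0.2,
    "team_expertise": 0.1,
    "low_complexity": 0.3,
}


def weighted_score(scores: Dict[str, float],
                   weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of per-criterion scores (each assumed to be on a 1-5 scale)."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


def rank(options: Dict[str, Dict[str, float]]) -> List[str]:
    """Option names ordered from highest to lowest weighted score."""
    return sorted(options, key=lambda name: weighted_score(options[name]),
                  reverse=True)


# Placeholder scores for two made-up options (not real evaluation results).
OPTIONS = {
    "option-a": {"performance": 3, "scalability": 3, "availability": 4,
                 "team_expertise": 4, "low_complexity": 5},
    "option-b": {"performance": 5, "scalability": 5, "availability": 5,
                 "team_expertise": 2, "low_complexity": 1},
}

if __name__ == "__main__":
    for name in rank(OPTIONS):
        print(f"{name}: {weighted_score(OPTIONS[name]):.2f}")
```

With these placeholder numbers, the simpler `option-a` ranks above the more capable but more complex `option-b`, which shows how a heavy weight on complexity can dominate the outcome.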


## Considered Options

* **Longhorn:** A lightweight, reliable, and easy-to-use distributed block storage system for Kubernetes. It's built by Rancher (now SUSE). Key features include built-in snapshots, backups, replication, and a user-friendly GUI. It's designed specifically *for* Kubernetes and integrates deeply.
@@ -50,22 +48,18 @@ Secondary, it is the complexity introduced that we will focus on.

* **Portworx:** A commercial (paid) storage platform designed for Kubernetes. It offers high performance, high availability, and advanced features like data encryption, storage-level snapshots, and automated scaling. It's a mature and feature-rich solution, but comes with licensing costs.


## Decision Outcome


Chosen option: **Longhorn**, because it provides a good balance of features, ease of use, and integration with Kubernetes, while minimizing the complexity overhead for our team.

It's a strong, open-source option that aligns well with our focus on simplicity.
It meets our needs for persistent storage within the cluster without introducing the operational overhead of a more complex solution like Rook/Ceph.

Based on different scenarios, the following general recommendations can be made:

* For organizations that require massive scalability, support for block, object, and file storage within a unified system, and have a team with the necessary expertise to manage a complex distributed storage platform, Ceph/Rook is a powerful option.
* For users who are looking for a straightforward and reliable distributed block storage solution that is easy to deploy and manage within Kubernetes, especially for small to medium-sized environments, Longhorn is an excellent choice.
* If you lack the required skill set altogether and need a high-performance, feature-rich, commercially supported solution, and the licensing costs are justified, then Portworx is our recommendation.

### Consequences

10 changes: 9 additions & 1 deletion docs/software_ready/_index.md
@@ -6,4 +6,12 @@ title: Getting your software ready

| Problem domain | Description | Reason for importance | Tool recommendation |
|:---:|:---:|:---:|:---:|
| Image Registry | A common place to store and fetch images | High availability, secure access control | Harbor |
| Image Registry | A common place to store and fetch images | High availability, secure access control | |
| Secret Management | Securely store and manage sensitive information like passwords and API keys | Prevent unauthorized access and data leaks | |
| Ingress Controller / Gateway API | Manage external access to services in the cluster | Enable routing, load balancing, and secure communication | |
| GitOps / Deployment Pipelines | Automate application deployments using Git as the source of truth | Ensure consistency, traceability, and faster deployments | |
| Monitoring Infrastructure | Observe and analyze the health and performance of the cluster and applications | Proactive issue detection and resolution | |
| Service Mesh | Manage service-to-service communication within the cluster | Enable observability, security, and traffic control | |
| Network Policies | Define rules for communication between pods and services | Enhance security by restricting unauthorized traffic | |
| Authorization Integration | Manage user and service access to cluster resources | Enforce role-based access control and compliance | |
| Container Scanning | Identify vulnerabilities in container images | Ensure secure and compliant deployments | |