Commit ace27ac

Update ADR for Longhorn as storage solution: revise context, criteria, and decision outcome
1 parent fcad517 commit ace27ac

1 file changed: +56 -11 lines changed

@@ -1,31 +1,76 @@
 ---
 title: "Longhorn_as_storage_solution"
-date: "2025-02-19"
+date: "2025-03-18"
 ---


 | status: | date: | decision-makers: |
 | --- | --- | --- |
-| proposed | 2025-02-19 | Alexandra Aldershaab |
+| proposed | 2025-03-18 | Sofus Albertsen |


 ## Context and Problem Statement

-{Describe the context and problem statement, e.g., in free form using two to three sentences or in the form of an illustrative story. You may want to articulate the problem in form of a question and add links to collaboration boards or issue management systems.}
+### Do I need it?
+
+Even though Kubernetes (with its Container Storage Interface) is production-ready for persistence and stateful workloads, keeping your cluster stateless has several advantages:
+
+* Dead simple disaster recovery (and duplication) of your cluster: everything is defined as code, so (re)creating a cluster is as easy as running your setup scripts once more and waiting for everything to come up.
+* Backup and restore have never been simple, and Kubernetes does not solve this for you.
+
+At its core, we want stateless [cloud native applications](https://kodekloud.com/blog/cloud-native-principles-explained/).
+Remember the distinction between persistent and ephemeral storage: your caching service needs ephemeral storage, but does not need backup/restore of that data. Such workloads are a perfect fit for Kubernetes.
+
+Often your database sits on special hardware and is looked after by specialists.
+Keep it that way, and connect to the database from your cluster.
+
+### What are the criteria for choosing this?
+
+* **Performance Requirements:** What are the expected Input/Output Operations Per Second (IOPS) that your applications will demand? What about throughput?
+* **Scalability Needs:** Are you able to add storage seamlessly after the initial creation?
+* **Data Availability and Durability:** What are your requirements for replicas, time to recover, etc.?
+* **Team Expertise and Comfort Level:** What is your team's existing knowledge and experience with the specific storage solutions you are considering?
+
+### What are our weights for making a choice?
+
+While all criteria are important, different persistence tools can place vastly different demands on the team's expertise.
+
+Therefore, the primary weight is this: if you already have a storage solution that supports Kubernetes through a CSI driver, evaluate that one before anything else.
+
+Secondly, we focus on the complexity each option introduces.
+

 ## Considered Options

-* Longhorn
-* Rook Ceph
-* OpenEBS
+* **Longhorn:** A lightweight, reliable, and easy-to-use distributed block storage system for Kubernetes. It's built by Rancher (now SUSE). Key features include built-in snapshots, backups, replication, and a user-friendly GUI. It's designed specifically *for* Kubernetes and integrates deeply.
+
+* **Rook Ceph:** Rook is a storage *operator* for Kubernetes, and Ceph is a highly scalable, distributed storage system offering object, block, and file storage. Rook automates deployment, management, and scaling of Ceph within Kubernetes. This combination is powerful but complex.
+
+* **OpenEBS:** A containerized storage solution that provides persistent block storage for Kubernetes applications. It offers several different "storage engines" (cStor, Jiva, Local PV, Mayastor), each with different performance and feature characteristics. Offers flexibility but can require careful selection of the right engine.
+
+* **Portworx:** A commercial (paid) storage platform designed for Kubernetes. It offers high performance, high availability, and advanced features like data encryption, storage-level snapshots, and automated scaling. It's a mature and feature-rich solution, but comes with licensing costs.
+

 ## Decision Outcome

-Chosen option: Longhorn, because {justification. e.g., only option, which meets k.o. criterion decision driver | which resolves force {force} | … | comes out best (see below)}.

-<!-- This is an optional element. Feel free to remove. -->
+Chosen option: **Longhorn**, because it provides a good balance of features, ease of use, and integration with Kubernetes, while minimizing the complexity overhead for our team.
+
+It's a strong, open-source option that aligns well with our focus on simplicity.
+It meets our needs for persistent storage within the cluster without introducing the operational overhead of a more complex solution like Rook/Ceph.
+
+
+Based on different scenarios, the following general recommendations can be made:
+
+- For organizations that require massive scalability, support for block, object, and file storage within a unified system, and have a team with the necessary expertise to manage a complex distributed storage platform, Ceph/Rook is a powerful option.
+- For users who are looking for a straightforward and reliable distributed block storage solution that is easy to deploy and manage within Kubernetes, especially for smaller to medium-sized environments, Longhorn is an excellent choice.
+- If you lack the required skill set altogether, need a high-performance, feature-rich, and commercially supported solution, and can justify the licensing costs, Portworx is our recommendation.
+
+
 ### Consequences

-* Good, because {positive consequence, e.g., improvement of one or more desired qualities, …}
-* Bad, because {negative consequence, e.g., compromising one or more desired qualities, …}
-*<!-- numbers of consequences can vary -->
+* Good, because it's relatively easy to deploy and manage, leading to lower operational overhead and faster time to value. It has good community support and active development.
+* Good, because Longhorn's performance is generally good for typical workloads, meeting our initial performance requirements.
+
+* Bad, because Longhorn is primarily focused on block storage. If we need robust support for shared filesystems (ReadWriteMany access mode with full POSIX compliance) *within* the cluster, we might need to wait until a newer version of the tool supports this (see their [roadmap for more information](https://github.com/longhorn/longhorn/wiki/Roadmap#longhorn-v111-january-2026)).
+* Bad, because, although Longhorn has a growing community, it's not as mature as Ceph. While this is less of a direct "consequence" and more of a relative comparison, it's worth keeping in mind for long-term planning.
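
The weights and scenario recommendations in this ADR can be read as a small decision procedure. The sketch below is one possible encoding of it, not something the ADR itself defines: the `Situation` class, its field names, and the `recommend` function are all hypothetical, and the ordering of the checks is an assumption (the ADR only states that an existing CSI-backed storage solution should be evaluated first).

```python
from dataclasses import dataclass


@dataclass
class Situation:
    """Hypothetical inputs; the field names are illustrative, not from the ADR."""
    has_existing_csi_storage: bool = False      # storage already in place that ships a CSI driver
    needs_massive_scale: bool = False
    needs_block_object_and_file: bool = False   # unified block, object, and file storage
    team_can_run_complex_storage: bool = False  # expertise to operate a complex distributed platform
    needs_commercial_support: bool = False
    licensing_costs_justified: bool = False


def recommend(s: Situation) -> str:
    """Return the option to evaluate first, following the ADR's stated weights."""
    # Primary weight: an existing storage solution with a CSI driver comes before anything else.
    if s.has_existing_csi_storage:
        return "existing CSI-backed storage"
    # Rook Ceph only when scale, unified storage types, and team expertise all apply.
    if (s.needs_massive_scale
            and s.needs_block_object_and_file
            and s.team_can_run_complex_storage):
        return "Rook Ceph"
    # Portworx when commercial support is needed and the licensing cost is justified.
    if s.needs_commercial_support and s.licensing_costs_justified:
        return "Portworx"
    # Otherwise the lowest-complexity option, which is the ADR's chosen default.
    return "Longhorn"


if __name__ == "__main__":
    print(recommend(Situation()))
    print(recommend(Situation(has_existing_csi_storage=True)))
```

Falling through to Longhorn mirrors the secondary weight, which favours the option that introduces the least complexity when no stronger requirement applies.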
