| 1 | +--- |
| 2 | +summary: Associate tags with your cache keys to easily invalidate a bunch of keys at once |
| 3 | +--- |
| 4 | + |
| 5 | +# Tags |
| 6 | + |
| 7 | +:::warning |
| 8 | +Tags have been available since v1.2.0 and are still **experimental**. |
| 9 | + |
| 10 | +We will **not** make breaking changes without a major version, but no guarantees are made about the stability of this feature yet. |
| 11 | + |
| 12 | +If you find any bugs, please report them on GitHub issues. |
| 13 | +::: |
| 14 | + |
| 15 | + |
| 16 | +Tagging allows associating a cache entry with one or more tags to simplify invalidation. Instead of managing individual keys, entries can be grouped under multiple tags and invalidated in a single operation. |
| 17 | + |
| 18 | +## Usage |
| 19 | + |
| 20 | +```ts |
| 21 | +await bento.getOrSet({ |
| 22 | + key: 'foo', |
| 23 | + factory: () => getFromDb(), |
| 24 | + tags: ['tag-1', 'tag-2'] |
| 25 | +}); |
| 26 | + |
| 27 | +await bento.set({ key: 'foo', value: 'bar', tags: ['tag-1'] }); |
| 28 | +``` |
| 29 | + |
| 30 | +To invalidate all entries linked to a tag: |
| 31 | + |
| 32 | +```ts |
| 33 | +await bento.deleteByTags({ tags: ['tag-1'] }); |
| 34 | +``` |
| 35 | + |
| 36 | +Now, imagine that the tags depend on the cached value itself. In that case, you can use [adaptive caching](./adaptive_caching.md) to update tags dynamically based on the computed value. |
| 37 | + |
| 38 | +```ts |
| 39 | +const product = await bento.getOrSet({ |
| 40 | + key: `product:${id}`, |
| 41 | + factory: async (ctx) => { |
| 42 | + const product = await fetchProduct(id); |
| 43 | + ctx.setTags(product.tags); |
| 44 | + return product; |
| 45 | + } |
| 46 | +}) |
| 47 | +``` |
| 48 | + |
| 49 | + |
| 50 | +## How it works |
| 51 | + |
| 52 | +If you are interested in how Bentocache handles tags internally, read on. |
| 53 | + |
| 54 | +Generally, there are two ways to implement tagging in a cache system: |
| 55 | + |
| 56 | +- **Server-side tagging**: The cache backend (e.g., Redis, Memcached, databases) is responsible for managing tags and their associated entries. However, most distributed caches do not natively support tagging. When it's not supported, workarounds exist, but they are either inefficient or complex to implement. |
| 57 | + |
| 58 | +- **Client-side tagging**: The caching library manages tags internally. This is the approach used by Bentocache. |
| 59 | + |
| 60 | +Bentocache implements **client-side tagging**, making it fully backend-agnostic. Instead of relying on the cache backend to track and delete entries by tags, Bentocache tracks invalidation timestamps for each tag and filters out stale data dynamically. |
| 61 | + |
| 62 | +This means all Bentocache drivers automatically support tagging without any modification. If someone implements a custom driver, tagging will work out of the box, without requiring any additional logic. |
| 63 | + |
| 64 | +### Why avoid server-side tagging |
| 65 | + |
| 66 | +Among all the cache backends Bentocache supports, none provide a native tagging system without significant overhead. Tagging could be emulated on top of each driver, but the result would likely be inefficient and, depending on the backend, complex to implement. |
| 67 | + |
| 68 | +For example: |
| 69 | + |
| 70 | +- In Redis, tagging could be hacked together using Redis sets, but this would require complex management and would not be efficient for large datasets. |
| 71 | +- In databases, a separate table mapping cache keys to tags could be used, but this would significantly increase query complexity and also reduce performance. |
| 72 | + |
| 73 | +By performance, I mean that to delete all keys associated with a tag, you'd typically need to run a query like the following (PostgreSQL syntax, assuming `tags` is an array column): |
| 74 | + |
| 75 | +```sql |
| 76 | +DELETE FROM cache WHERE 'my-tag' = ANY(tags); |
| 77 | +``` |
| 78 | + |
| 79 | +This approach does not scale in a distributed cache with millions of entries, as scanning large datasets in real-time would be extremely slow and inefficient. |
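To make the bookkeeping cost concrete, here is a minimal sketch of server-side-style tagging built on a reverse index (tag to keys), similar in spirit to the Redis sets idea above. It uses in-memory `Map` and `Set` objects as a stand-in for a real backend, and the class name `IndexedTagCache` and its methods are hypothetical, not part of Bentocache or any Redis client.

```typescript
// Hypothetical sketch: a reverse index (tag -> keys) kept next to the entries.
// In-memory Map/Set objects stand in for a real backend such as Redis sets.
class IndexedTagCache {
  private entries = new Map<string, { value: unknown; tags: string[] }>();
  private keysByTag = new Map<string, Set<string>>();

  set(key: string, value: unknown, tags: string[]): void {
    // Overwriting a key must first clean up its old index rows.
    this.delete(key);
    this.entries.set(key, { value, tags });
    for (const tag of tags) {
      if (!this.keysByTag.has(tag)) this.keysByTag.set(tag, new Set());
      this.keysByTag.get(tag)!.add(key);
    }
  }

  delete(key: string): void {
    const entry = this.entries.get(key);
    if (!entry) return;
    // Every tag set that references this key has to be updated too.
    for (const tag of entry.tags) this.keysByTag.get(tag)?.delete(key);
    this.entries.delete(key);
  }

  // The work here grows with the number of keys stored under the tag.
  deleteByTag(tag: string): number {
    const keys = Array.from(this.keysByTag.get(tag) ?? []);
    for (const key of keys) this.delete(key);
    this.keysByTag.delete(tag);
    return keys.length;
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }
}

const indexed = new IndexedTagCache();
indexed.set('a', 1, ['t1']);
indexed.set('b', 2, ['t1', 't2']);
console.log(indexed.deleteByTag('t1')); // 2
```

Even in this toy version, every write, overwrite, and delete has to keep the index in sync, and a real backend would also have to handle TTL expiry and atomicity across those steps.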
| 80 | + |
| 81 | +### How Bentocache handles tags |
| 82 | + |
| 83 | +Instead of directly deleting entries with a given tag, Bentocache uses a more efficient approach. |
| 84 | + |
| 85 | +The core idea is pretty simple: |
| 86 | +- When a tag is invalidated, Bentocache stores an **invalidation timestamp** in the cache. |
| 87 | +- When fetching an entry, Bentocache checks whether it was cached before or after its associated tags were invalidated. |
| 88 | + |
| 89 | +Let's take a concrete example. Here we just cached an entry with the tags `tag-1` and `tag-2`: |
| 90 | + |
| 91 | +```ts |
| 92 | +await bento.getOrSet({ |
| 93 | + key: 'foo', |
| 94 | + factory: () => getFromDb(), |
| 95 | + tags: ['tag-1', 'tag-2'] |
| 96 | +}); |
| 97 | +``` |
| 98 | + |
| 99 | +Internally, Bentocache stores something like: |
| 100 | + |
| 101 | +```ts |
| 102 | +foo = { value: 'bar', tags: ['tag-1', 'tag-2'], createdAt: 1700000 } |
| 103 | +``` |
| 104 | + |
| 105 | +Note that we also store the creation date of the entry as `createdAt`. |
| 106 | + |
| 107 | +Now, we invalidate the `tag-1` tag: |
| 108 | + |
| 109 | +```ts |
| 110 | +await bento.deleteByTags({ tags: ['tag-1'] }); |
| 111 | +``` |
| 112 | + |
| 113 | +Instead of scanning and deleting every entry associated with `tag-1`, Bentocache simply stores the invalidation timestamp under a special cache key: |
| 114 | + |
| 115 | +```ts |
| 116 | +__bentocache:tags:tag-1 = { invalidatedAt: 1701234 } |
| 117 | +``` |
| 118 | + |
| 119 | +So, the invalidation timestamp of the tag is stored under the key `__bentocache:tags:tag-1`. This means that any cache entry associated with `tag-1` and created before `1701234` is now considered stale. |
| 120 | + |
| 121 | +Now, when fetching an entry, Bentocache checks if it was created before the tag was invalidated. If it was, Bentocache considers the entry stale and ignores it. |
| 122 | + |
| 123 | +In fact, the implementation is a bit more complex than that, but that's the general idea. |
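The idea described above can be sketched as a small, self-contained example. This is an illustration under stated assumptions, not Bentocache's actual implementation: the `TaggedCache` class, its method signatures, and the injectable `now` clock are all hypothetical, and a plain `Map` stands in for the cache backend. Only the `__bentocache:tags:` key prefix and the `createdAt` / `invalidatedAt` fields come from the description above.

```typescript
// Hypothetical sketch of timestamp-based tag invalidation.
// Not Bentocache's real code: names and signatures are assumptions.
type Entry<T> = { value: T; tags: string[]; createdAt: number };
type TagMarker = { invalidatedAt: number };

class TaggedCache {
  // Stand-in for any key-value backend (memory, Redis, database...).
  private store = new Map<string, unknown>();
  private now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  private tagKey(tag: string): string {
    return `__bentocache:tags:${tag}`;
  }

  set<T>(key: string, value: T, tags: string[] = []): void {
    const entry: Entry<T> = { value, tags, createdAt: this.now() };
    this.store.set(key, entry);
  }

  // Invalidation writes one marker per tag; no entry is scanned or deleted.
  deleteByTags(tags: string[]): void {
    for (const tag of tags) {
      const marker: TagMarker = { invalidatedAt: this.now() };
      this.store.set(this.tagKey(tag), marker);
    }
  }

  get<T>(key: string): T | undefined {
    const entry = this.store.get(key) as Entry<T> | undefined;
    if (!entry) return undefined;
    // One extra lookup per tag: this is where many tags per entry get costly.
    for (const tag of entry.tags) {
      const marker = this.store.get(this.tagKey(tag)) as TagMarker | undefined;
      // Entries created at or before the invalidation time are treated as stale.
      if (marker && entry.createdAt <= marker.invalidatedAt) return undefined;
    }
    return entry.value;
  }
}

// Usage with a fake clock, mirroring the timestamps used above.
let fakeTime = 1700000;
const sketch = new TaggedCache(() => fakeTime);
sketch.set('foo', 'bar', ['tag-1', 'tag-2']);
fakeTime = 1701234;
sketch.deleteByTags(['tag-1']);
console.log(sketch.get('foo')); // undefined: cached before tag-1 was invalidated
```

Note that the stale entry is never removed by the invalidation itself in this sketch; it is simply ignored on read until it is overwritten.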
| 124 | + |
| 125 | +## Limitations |
| 126 | + |
| 127 | +The main limitation of this system is that you should avoid using too many tags on a single entry. The more tags you use per entry, the more invalidation timestamps Bentocache needs to store and especially check when fetching an entry. This can increase lookup times and impact performance. |
| 128 | + |
| 129 | +In fact, a similar issue exists in other systems like Loki, OpenTelemetry, and TimescaleDB, where it's known as the "high cardinality" problem. To maintain optimal performance, it's recommended to **keep the number of tags per entry reasonable**. |
| 130 | + |
| 131 | +## Acknowledgements |
| 132 | + |
| 133 | +The concept of client-side tagging in Bentocache was **heavily inspired** by the huge work done by Jody Donetti on [FusionCache](https://github.com/ZiggyCreatures/FusionCache). |
| 134 | + |
| 135 | +Reading his detailed explanations and discussions on GitHub provided invaluable insights into the challenges and solutions in implementing an efficient tagging system. |
| 136 | + |
| 137 | +A **huge thanks** for sharing his expertise and paving the way for innovative caching strategies. |