Replies: 3 comments 3 replies
My answers to the above questions:
Note that some of the label and annotation values will likely be the same for either every run or for very large numbers of runs.
To have it reflected somewhere, I wanted to add to this discussion another alternative that was contemplated and discarded for improving read performance: business-logic-specific indexes, with queries adapted to use them. Here is how the proposal would look:

Indexed Label/Annotation Paths Configuration Proposal

When using equality operators instead of containment operators with EXISTS subqueries:

```sql
-- Slow (7 minutes): GIN index with @> operator
WHERE data->'metadata'->'labels' @> '{"app.example.com/name":"app-1"}'

-- Fast (9ms): BTREE index with equality operator
WHERE EXISTS (
  SELECT 1 FROM jsonb_each_text(r.data->'metadata'->'labels') l
  WHERE l.key = 'app.example.com/name' AND l.value = 'app-1'
)
```

The subquery approach works but:
Proposed Solution

Create dedicated BTREE indexes for frequently queried, low-selectivity label and annotation keys, and provide a configuration mechanism to inform the API which paths have dedicated indexes.

Core Components
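As a concrete illustration, a dedicated index for one configured label path could be a BTREE expression index over the JSONB text extraction, which is what lets the equality predicates in the "After" queries below use an index scan. The index name and path here are illustrative, not part of the original proposal:

```sql
-- Hypothetical dedicated BTREE expression index for one configured label path
CREATE INDEX idx_labels_app_name
    ON resource ((data->'metadata'->'labels'->>'app.example.com/name'));
```

One such index would be created per configured path, which is why the configuration mechanism is needed: the API must know which paths can be rewritten to equality form.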
Query Transformation Examples

Before (Current Implementation)

```sql
SELECT data->'metadata'->>'creationTimestamp' AS created_at, id, uuid, data
FROM resource
WHERE kind = 'PipelineRun'
  AND api_version = 'tekton.dev/v1'
  AND namespace = 'tenant-a'
  AND data->'metadata'->>'creationTimestamp' > '2025-10-15T08:33:08Z'
  AND data->'metadata'->'labels' @> '{"app.example.com/name":"app-1", "pipeline.example.com/type":"build"}'
ORDER BY data->'metadata'->>'creationTimestamp' DESC, id DESC
LIMIT 30;
```

Performance: ~6 minutes (BitmapAnd with 335,501 rows rechecked)

After (With Indexed Paths)

```sql
SELECT data->'metadata'->>'creationTimestamp' AS created_at, id, uuid, data
FROM resource
WHERE kind = 'PipelineRun'
  AND api_version = 'tekton.dev/v1'
  AND namespace = 'tenant-a'
  AND data->'metadata'->>'creationTimestamp' > '2025-10-15T08:33:08Z'
  AND data->'metadata'->'labels'->>'app.example.com/name' = 'app-1'
  AND data->'metadata'->'labels'->>'pipeline.example.com/type' = 'build'
ORDER BY data->'metadata'->>'creationTimestamp' DESC, id DESC
LIMIT 30;
```

Expected Performance: <100ms (index scan on the dedicated BTREE indexes)

Why this was discarded:
Label Normalization Proposal - Full Normalization Schema
Executive Summary
This document proposes normalizing Kubernetes resource labels out of JSONB into relational tables, using full normalization, to improve the performance of label-filtered queries while minimizing the storage overhead of value duplication.
Current Problem:
Label-filtered queries can take 7+ minutes when filtering by common labels (e.g., application name, pipeline type). This happens because:

- Common labels like `app.example.com/name=app-1` often match 10K-100K+ resources (low selectivity)
- When a bitmap exceeds `work_mem` (default: 4MB), PostgreSQL switches to "lossy mode", which requires rechecking ALL rows in matched pages, causing massive overhead

📊 Detailed technical analysis and query execution internals
Example Query Performance

Query: Find PipelineRuns with `app.example.com/name=app-1` created after a certain date

Current performance:
How PostgreSQL Executes These Queries

When querying with label filters, PostgreSQL must combine two different index types:

1. GIN Index Scan on `data->'metadata'->'labels'`: `@> '{"app.example.com/name":"app-1"}'` returns 362,394 row IDs
2. BTREE Index Scan on `(kind, api_version, namespace, creationTimestamp)`:
3. BitmapAnd Operation:
The work_mem Limitation
PostgreSQL's `work_mem` parameter (default: 4MB) determines how much memory each bitmap can use.

When the bitmap fits in work_mem (non-lossy):

When the bitmap exceeds work_mem (lossy):
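Whether a particular query falls into lossy mode can be observed directly: `EXPLAIN (ANALYZE, BUFFERS)` reports `Heap Blocks: exact=… lossy=…` on the Bitmap Heap Scan node, and raising `work_mem` for the session is a quick way to test its effect. A sketch (the query is the example label filter from above):

```sql
SHOW work_mem;          -- typically 4MB by default
SET work_mem = '64MB';  -- session-only; lets larger bitmaps stay exact (non-lossy)

EXPLAIN (ANALYZE, BUFFERS)
SELECT id FROM resource
WHERE data->'metadata'->'labels' @> '{"app.example.com/name":"app-1"}';
```

Raising `work_mem` globally is rarely a real fix, since it is allocated per sort/hash node per backend, but it is useful for confirming the lossy-mode diagnosis.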
Why Non-Selective Labels Cause Problems
Selectivity = (rows matching filter) / (total rows)
Highly selective (<0.1%): e.g., specific git commit SHA
Low selectivity (>1%): e.g., application name, pipeline type
Common Kubernetes labels with low selectivity (>1% of resources):
- `app.kubernetes.io/managed-by: pipelinesascode.tekton.dev` → ALL managed resources
- `app.example.com/name: app-1` → ALL resources for one application (10K-100K+)
- `pipeline.example.com/type: build` → ALL build pipelines
- `namespace: tenant-a` → ALL resources in namespace

As the database grows, these labels match more rows, making the problem worse.
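The selectivity of a specific label can be measured directly with a one-off query like the following (a full scan, so for analysis only, not production traffic):

```sql
-- Fraction of resources matching a given label (selectivity)
SELECT count(*) FILTER (
           WHERE data->'metadata'->'labels' @> '{"app.example.com/name":"app-1"}'
       )::numeric / count(*) AS selectivity
FROM resource;
```

Values above roughly 0.01 (1%) indicate the label is a poor candidate for the GIN containment path described above.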
Real-World Example: Query Plan Analysis
Database: 1.25 million resources, 78K PipelineRuns in namespace
Query: Find PipelineRuns with `app.example.com/name=app-1` after 2025-10-15

Execution:
Why so slow:
Solutions Comparison
Recommendation: Full normalization eliminates bitmap scan issues entirely by using direct BTREE index lookups.
Proposed Solution:
Schema Design
Full Normalization: Four-Table Design
Note: Generic table names (`metadata_key`, `metadata_value`, `metadata_pair`) allow future reuse for annotations via a separate `resource_annotation` table that references the same `metadata_pair` table.

Table Definitions
Table 1: `resource` (Existing - No Changes)

Note: The `data` JSONB field is kept for full manifest storage. Labels are duplicated in relational tables for query performance.

Table 2: `metadata_key` (New - Stores Unique Metadata Keys)

Purpose:
- Stores each unique key string once (e.g., `app.kubernetes.io/managed-by`)
- `app.kubernetes.io/managed-by` is stored once, not once per value

Estimated rows: ~500-1000 unique keys for labels, ~500-1000 for annotations (relatively small)

Table 3: `metadata_value` (New - Stores Unique Metadata Values)

Purpose:
- Stores each unique value string once (e.g., `pipelinesascode.tekton.dev`, `v0.40.0`, `app-1`)
- `pipelinesascode.tekton.dev` is stored once, not millions of times

Estimated rows: ~10K-100K unique values (depends on the cardinality of fields like commit SHAs)

Table 4: `metadata_pair` (New - Stores Unique Key-Value Pairs)

Purpose:
- Stores each unique key-value combination once (e.g., `app.kubernetes.io/managed-by=pipelinesascode.tekton.dev`)
- `app.kubernetes.io/managed-by=pipelinesascode.tekton.dev` is stored once, referenced millions of times

Estimated rows: ~50K-200K unique pairs for labels, similar for annotations (far fewer than total occurrences)

Table 5: `resource_label` (New - Links Resources to Label Pairs)

Purpose:
- Links each resource to the label pairs it carries
- A future `resource_annotation` table can reuse the same `metadata_pair` table

Estimated rows: ~20-50 million (1M resources × 20-50 labels each)
📋 Example Data with Full Normalization
Scenario: 3 PipelineRuns with Common Labels
Table: `resource`

Three PipelineRun rows, each storing the full manifest in `data`: `{"kind":"PipelineRun","metadata":{"labels":{...}},...}`

Table: `metadata_key` (Unique Keys)

Note: Each key string is stored once regardless of how many values or resources use it.

Table: `metadata_value` (Unique Values)

Note: Each value string is stored once. `pipelinesascode.tekton.dev` appears once even if used by millions of resources.

Table: `metadata_pair` (Unique Key-Value Pairs)

- `app.example.com/name=app-1`
- `pipeline.example.com/type=build`
- `pipeline.example.com/type=test`
- `scm.example.com/commit-sha=abc123def456`
- `scm.example.com/commit-sha=def789abc012`
- `app.kubernetes.io/version=v0.40.0`
- `app.kubernetes.io/managed-by=pipelinesascode.tekton.dev`
- `test.example.com/scenario=integration-tests`

Note: Each unique key-value combination is stored once. When thousands of PipelineRuns share the same label, only the pair ID is duplicated in `resource_label`.

Table: `resource_label` (Resource-Label Associations)

First PipelineRun (resource 500123):
- `app.example.com/name=app-1`
- `pipeline.example.com/type=build`
- `scm.example.com/commit-sha=abc123def456`
- `app.kubernetes.io/version=v0.40.0`
- `app.kubernetes.io/managed-by=pipelinesascode.tekton.dev`

Second PipelineRun:
- `app.example.com/name=app-1` (SAME pair as 500123)
- `pipeline.example.com/type=build` (SAME pair as 500123)
- `scm.example.com/commit-sha=def789abc012`
- `app.kubernetes.io/version=v0.40.0` (SAME pair as 500123)
- `app.kubernetes.io/managed-by=pipelinesascode.tekton.dev` (SAME pair as 500123)

Third PipelineRun:
- `app.example.com/name=app-1` (SAME pair as the others)
- `pipeline.example.com/type=test`
- `test.example.com/scenario=integration-tests`

Visual Representation:
Key insight: The pair `app.example.com/name=app-1` (pair_id=1) appears in all three PipelineRuns, meaning:

- `"app.example.com/name"` and `"app-1"` are stored once in their respective tables
- One row in `metadata_pair` links them
- One row in `resource_label` references that pair for each resource
- Instead of storing `"app.example.com/name": "app-1"` (34 bytes) thousands of times, or even `(key_id, value_id)` = 16 bytes thousands of times, we store one 8-byte integer per resource

Query Examples
Query 1: Find All Resources with `app.example.com/name=app-1`

Current JSONB approach (7 minutes):
New fully normalized approach with metadata_pair (< 100ms):
Alternative simpler version:
Query Plan (estimated):
Result: Direct B-tree index lookups using index on pair_id, no bitmap scans, ~50-100ms
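The normalized lookup for Query 1 can be sketched as follows, assuming the column names used throughout this proposal (`key`, `value`, `key_id`, `value_id`, `pair_id`); the exact SQL would depend on the final schema:

```sql
-- Sketch: normalized lookup for app.example.com/name=app-1 (assumed column names)
SELECT r.*
FROM resource r
JOIN resource_label rl ON rl.resource_id = r.id
JOIN metadata_pair  mp ON mp.id = rl.pair_id
JOIN metadata_key   mk ON mk.id = mp.key_id
JOIN metadata_value mv ON mv.id = mp.value_id
WHERE mk.key = 'app.example.com/name'
  AND mv.value = 'app-1';
```

The planner resolves the key and value to a single `pair_id` via tiny BTREE lookups, then scans `resource_label` by that one integer, which is what eliminates the bitmap-combination step entirely.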
Query 2: Find Resources with Multiple Labels
Find PipelineRuns with `app.example.com/name=app-1` AND `scm.example.com/commit-sha=abc123def456`:

Query 3: Get All Labels for a Resource
Result:
Query 4: Analytics - Most Common Label Values
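Under the same assumed schema, an analytics query over the most common values for one key might look like this (a sketch, not the proposal's final SQL):

```sql
-- Sketch: most common values for a given label key
SELECT mv.value, count(*) AS occurrences
FROM resource_label rl
JOIN metadata_pair  mp ON mp.id = rl.pair_id
JOIN metadata_key   mk ON mk.id = mp.key_id
JOIN metadata_value mv ON mv.id = mp.value_id
WHERE mk.key = 'app.example.com/name'
GROUP BY mv.value
ORDER BY occurrences DESC
LIMIT 10;
```

This kind of aggregation is essentially impossible to do efficiently against JSONB without scanning every row, so it is a side benefit of normalization.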
Result:
💾 Storage Analysis: Problem, Current State, and Solution
The Storage Problem: Value Duplication
Many label keys and values are repeated across hundreds of thousands or millions of resources:
High-frequency labels (same value on most/all resources):
- `app.kubernetes.io/managed-by: pipelinesascode.tekton.dev` - on ALL managed PipelineRuns
- `app.kubernetes.io/version: v0.40.0` - on ALL runs with this version
- `pipeline.example.com/type: build` - on ALL build pipelines
- `scm.example.com/git-provider: github` - on ALL GitHub-based pipelines

Medium-frequency labels (same value on thousands of resources):

- `app.example.com/name: app-1` - on ALL resources for this application (~13K resources)
- `namespace: tenant-a` - on ALL resources in this namespace

Low-frequency labels (unique or rare values):

- `scm.example.com/commit-sha: eb1befa91218676771a798b14c2931594710538e` - unique per commit

Problem: Label values like `"pipelinesascode.tekton.dev"` (28 chars) are duplicated many times in partial normalization approaches. With full normalization, each unique string is stored once.

Current Storage (JSONB Only)
Estimated for 1M PipelineRuns with 20 labels each:
All labels are embedded in the `data` field.

Partial Normalization Storage
Scenario: 1 million PipelineRuns, each with 20 labels
Using partial normalization (`label(id, key, value)` + `resource_label`):

- `label` table
- `resource_label` table

Problem: Strings like `"pipelinesascode.tekton.dev"` are stored multiple times as part of different key-value pairs.

Full Normalization Storage (4-Table Design - Proposed Solution)
Estimated for 1M PipelineRuns with 20 labels each:
- `metadata_key`: keys like `app.kubernetes.io/managed-by` + timestamps
- `metadata_value`: values like `pipelinesascode.tekton.dev`, `app-1` + timestamps
- `metadata_pair`
- `resource_label`

Storage Comparison
Savings:
Critical benefit as scale increases:
vs. keeping labels in JSONB + duplicating in normalized tables:
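Whatever the estimates, the actual on-disk footprint of the new tables can be verified after migration with PostgreSQL's size functions:

```sql
-- Actual on-disk sizes of the normalized tables (data + indexes + TOAST)
SELECT relname,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_catalog.pg_statio_user_tables
WHERE relname IN ('metadata_key', 'metadata_value', 'metadata_pair', 'resource_label');
```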
Data Migration Strategy
📦 Phase 1: Create New Tables
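A minimal sketch of what the Phase 1 DDL might look like, assuming identity primary keys and the column names used in the examples in this proposal (`key`, `value`, `key_id`, `value_id`, `pair_id`); constraint and index names are illustrative:

```sql
CREATE TABLE metadata_key (
    id         BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    key        TEXT NOT NULL UNIQUE,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE TABLE metadata_value (
    id         BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    value      TEXT NOT NULL UNIQUE,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE TABLE metadata_pair (
    id         BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    key_id     BIGINT NOT NULL REFERENCES metadata_key(id),
    value_id   BIGINT NOT NULL REFERENCES metadata_value(id),
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    UNIQUE (key_id, value_id)
);

CREATE TABLE resource_label (
    resource_id BIGINT NOT NULL,  -- references resource(id)
    pair_id     BIGINT NOT NULL REFERENCES metadata_pair(id),
    PRIMARY KEY (resource_id, pair_id)
);

-- Supports "find all resources with pair X", the hot query path
CREATE INDEX idx_resource_label_pair ON resource_label (pair_id);
```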
Note on `updated_at` columns:

All tables include an `updated_at` timestamp column that should be automatically updated whenever a row is modified. We'll add a trigger function to handle this in Phase 1b below.

🔄 Phase 1b: Create Trigger for Automatic `updated_at` Updates
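A sketch of the `updated_at` machinery, using a single generic trigger function shared by all three catalog tables (function and trigger names are illustrative):

```sql
-- Generic trigger function: stamp the row on every UPDATE
CREATE OR REPLACE FUNCTION set_updated_at() RETURNS trigger AS $$
BEGIN
    NEW.updated_at := now();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER metadata_key_set_updated_at
    BEFORE UPDATE ON metadata_key
    FOR EACH ROW EXECUTE FUNCTION set_updated_at();

-- Repeat the CREATE TRIGGER statement for metadata_value and metadata_pair
```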
Note: These triggers automatically update the `updated_at` timestamp whenever any row in these tables is modified via an UPDATE operation. This ensures accurate tracking of when each record was last modified.

📥 Phase 2: Populate Tables from Existing Data
Note: For large databases (2M+ resources), run this in batches:
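A sketch of the backfill, assuming the schema above; each statement is idempotent via `ON CONFLICT DO NOTHING`, so a batched version can simply restrict the `resource` scan by id range and re-run safely:

```sql
-- 1. Distinct keys
INSERT INTO metadata_key (key)
SELECT DISTINCT l.key
FROM resource r
CROSS JOIN LATERAL jsonb_each_text(r.data->'metadata'->'labels') AS l(key, value)
ON CONFLICT DO NOTHING;

-- 2. Distinct values
INSERT INTO metadata_value (value)
SELECT DISTINCT l.value
FROM resource r
CROSS JOIN LATERAL jsonb_each_text(r.data->'metadata'->'labels') AS l(key, value)
ON CONFLICT DO NOTHING;

-- 3. Distinct key-value pairs
INSERT INTO metadata_pair (key_id, value_id)
SELECT DISTINCT mk.id, mv.id
FROM resource r
CROSS JOIN LATERAL jsonb_each_text(r.data->'metadata'->'labels') AS l(key, value)
JOIN metadata_key   mk ON mk.key = l.key
JOIN metadata_value mv ON mv.value = l.value
ON CONFLICT DO NOTHING;

-- 4. Resource-to-pair associations
INSERT INTO resource_label (resource_id, pair_id)
SELECT r.id, mp.id
FROM resource r
CROSS JOIN LATERAL jsonb_each_text(r.data->'metadata'->'labels') AS l(key, value)
JOIN metadata_key   mk ON mk.key = l.key
JOIN metadata_value mv ON mv.value = l.value
JOIN metadata_pair  mp ON mp.key_id = mk.id AND mp.value_id = mv.id
ON CONFLICT DO NOTHING;
```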
⚙️ Phase 3: Modify Write Path with Database Trigger
Recommended: Use a database trigger with a WHEN clause to automatically sync labels only when they change.
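A sketch of the sync trigger under the assumed schema. Simplification: it assumes the needed `metadata_key`/`metadata_value`/`metadata_pair` rows already exist; a complete version would upsert missing keys, values, and pairs first. Note that a WHEN clause may reference OLD only on UPDATE, so inserts need a separate trigger without it:

```sql
CREATE OR REPLACE FUNCTION sync_resource_labels() RETURNS trigger AS $$
BEGIN
    -- Replace the resource's associations from its current JSONB labels
    DELETE FROM resource_label WHERE resource_id = NEW.id;
    INSERT INTO resource_label (resource_id, pair_id)
    SELECT NEW.id, mp.id
    FROM jsonb_each_text(NEW.data->'metadata'->'labels') AS l(key, value)
    JOIN metadata_key   mk ON mk.key = l.key
    JOIN metadata_value mv ON mv.value = l.value
    JOIN metadata_pair  mp ON mp.key_id = mk.id AND mp.value_id = mv.id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Fires only when the labels object actually changed
CREATE TRIGGER resource_labels_sync_update
    AFTER UPDATE ON resource
    FOR EACH ROW
    WHEN (NEW.data->'metadata'->'labels' IS DISTINCT FROM OLD.data->'metadata'->'labels')
    EXECUTE FUNCTION sync_resource_labels();

-- New rows always need their labels synced
CREATE TRIGGER resource_labels_sync_insert
    AFTER INSERT ON resource
    FOR EACH ROW EXECUTE FUNCTION sync_resource_labels();
```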
Performance Impact:
Testing the Trigger:
📖 Phase 4: Application Read Code Examples
Architecture Overview
KubeArchive uses a database-agnostic architecture:
- `reader.go`: Database-agnostic business logic
- `<vendor>.go`: Vendor-specific SQL building (PostgreSQL, MySQL, etc.)

To support normalized label queries, we need to:
- Keep `reader.go` vendor-agnostic
Step 2: Implement PostgreSQL-specific Label Facade
Step 3: Create Label-Specific Filters
Step 4: Database-Agnostic Reader Code
Step 5: Optimized Single-Label Query (Using CTE)
For better performance with single labels, use a CTE-based approach:
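The CTE idea can be sketched as follows under the assumed schema: resolve the single pair id first, then scan `resource_label` by that integer, so the main query never touches the string catalogs:

```sql
-- Sketch: resolve the pair id once, then scan resource_label by pair_id
WITH pair AS (
    SELECT mp.id
    FROM metadata_pair mp
    JOIN metadata_key   mk ON mk.id = mp.key_id
    JOIN metadata_value mv ON mv.id = mp.value_id
    WHERE mk.key = 'app.example.com/name'
      AND mv.value = 'app-1'
)
SELECT r.*
FROM resource r
JOIN resource_label rl ON rl.resource_id = r.id
JOIN pair p           ON p.id = rl.pair_id;
```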
Step 6: Example Usage in API Handler
Benefits of This Architecture
- `reader.go` has no vendor-specific SQL
- The `QueryFacade` interface can be mocked for unit tests

⚖️ Write Path: Database Triggers vs. Application-Level Writes
The proposal includes database triggers for syncing labels to normalized tables. However, there's an alternative: writing to normalized tables directly from the KubeArchive sink.
Option 1: Database Triggers (Proposed in this document)
How it works:
- The application writes only to the `resource` table (JSONB); the trigger keeps the normalized tables in sync

Pros:
Cons:
Option 2: Application-Level Writes (Alternative approach)
How it works:
- The sink writes to the `resource` table AND the normalized tables in a single transaction

Example implementation:
Pros:
Cons:
Recommendation
Use Database Triggers (Option 1) for this implementation:
- Easy to roll back with `DROP TRIGGER` if issues arise
- The WHEN clause `NEW.data->'metadata'->'labels' IS DISTINCT FROM OLD.data->'metadata'->'labels'` skips unchanged labels without any application overhead

Why the trigger WHEN clause is optimal:
Consider Application-Level Writes (Option 2) only if: