Simplify PgSTAC Bootstrap Process #208

Merged 42 commits on Apr 17, 2025
Commits
02cfa59
feat: Add PgSTAC settings SQL file and update bootstrap process
emmanuelmathot Apr 9, 2025
c1024bf
feat: Add EOAPI Helm Chart README and streamline PgSTAC bootstrap pro…
emmanuelmathot Apr 9, 2025
81df848
feat: Enhance debugging output for failed service boot and test failu…
emmanuelmathot Apr 9, 2025
67b85dc
fix: reduce timeout for service boot watch to 2 minutes
emmanuelmathot Apr 9, 2025
673dba2
feat: update pgstac image name to use pgstac-pypgstac variant
emmanuelmathot Apr 9, 2025
3868757
fix: update log extraction for pgstac-migrate pod in CI workflow
emmanuelmathot Apr 9, 2025
c8536ab
fix: update wait command to reference pgstac-migrate job in deploymen…
emmanuelmathot Apr 9, 2025
820d499
fix: disable continue-on-error for test runner step in CI workflow
emmanuelmathot Apr 9, 2025
4d4387b
fix: enhance pgstac-migrate job monitoring and increase timeout for s…
emmanuelmathot Apr 9, 2025
84cc3c3
fix: add log extraction for pgstac-migrate pod in CI workflow
emmanuelmathot Apr 9, 2025
bc883d7
fix: add error handling to pgstac-migrate job script
emmanuelmathot Apr 9, 2025
817f748
fix: improve database readiness check and add error handling in pgsta…
emmanuelmathot Apr 9, 2025
11e3184
fix: add debugging output for environment variables in pgstac-migrate…
emmanuelmathot Apr 9, 2025
ecb1ff8
fix: update postgres user privileges and clarify comments in values.yaml
emmanuelmathot Apr 9, 2025
5d786b7
fix: configure Patroni for schema permissions and update PostgreSQL s…
emmanuelmathot Apr 9, 2025
637dd7f
fix: update schema permissions for eoapi user in values.yaml
emmanuelmathot Apr 9, 2025
41af2a2
fix: update PostgreSQL schema privileges for eoapi user in values.yaml
emmanuelmathot Apr 9, 2025
4050f72
fix: update postgrescluster options to SUPERUSER for enhanced privileges
emmanuelmathot Apr 9, 2025
4daa0df
fix: update database initialization SQL and modify postgrescluster op…
emmanuelmathot Apr 9, 2025
6b338d8
fix: simplify database initialization SQL configuration in values.yaml
emmanuelmathot Apr 9, 2025
3940b88
fix: correct database options formatting in values.yaml
emmanuelmathot Apr 9, 2025
b1cf8e8
fix: add CREATE privilege for eoapi user on public schema in initdb.sql
emmanuelmathot Apr 9, 2025
01298f6
fix: update privileges for eoapi user in initdb.sql
emmanuelmathot Apr 9, 2025
0fed24d
fix: update initdb.sql to set default privileges for eoapi user
emmanuelmathot Apr 9, 2025
8ce9093
fix: streamline database initialization and add verification in pgsta…
emmanuelmathot Apr 9, 2025
bfd6c14
fix: enhance database initialization verification in pgstac bootstrap…
emmanuelmathot Apr 9, 2025
df37cab
fix: enhance search_path verification for PgSTAC functionality in dat…
emmanuelmathot Apr 9, 2025
f675e1f
fix: improve database initialization verification by checking eoapi r…
emmanuelmathot Apr 9, 2025
31bf1df
fix: ensure immediate exit on command failure in PgSTAC migration script
emmanuelmathot Apr 9, 2025
255d206
fix: add database owner assignment and remove redundant initializatio…
emmanuelmathot Apr 9, 2025
777266f
fix: streamline PgSTAC migration job by removing redundant PostGIS ex…
emmanuelmathot Apr 9, 2025
6926885
Merge remote-tracking branch 'origin/main' into simplify-migrate
emmanuelmathot Apr 9, 2025
c1e0cac
fix: add debugging steps for pgstac-migrate job failures and allow co…
emmanuelmathot Apr 9, 2025
32012e7
fix: enhance logging extraction for pgstac-migrate job by implementin…
emmanuelmathot Apr 9, 2025
6a41836
fix: update initdb ConfigMap to include role creation and privileges …
emmanuelmathot Apr 9, 2025
0482448
fix: ensure exit status is captured after helm test execution
emmanuelmathot Apr 9, 2025
96e0026
fix: add initdb ConfigMap for PostgreSQL setup with necessary roles a…
emmanuelmathot Apr 9, 2025
778088b
fix: update helm render/install step to include an ID for better trac…
emmanuelmathot Apr 9, 2025
f613bf0
fix: update database connection settings in pgstac bootstrap ConfigMa…
emmanuelmathot Apr 9, 2025
c4c1e28
fix: remove redundant pod log extraction methods for pgstac-migrate
emmanuelmathot Apr 9, 2025
661bba1
fix: remove debugging environment variable prints and unused initCont…
emmanuelmathot Apr 9, 2025
e935c60
fix: remove unnecessary sleep step from helm-tests workflow
emmanuelmathot Apr 9, 2025
71 changes: 46 additions & 25 deletions .github/workflows/helm-tests.yml
@@ -74,7 +74,9 @@ jobs:
oci://registry.developers.crunchydata.com/crunchydata/pgo \
--version ${{ env.PGO_VERSION }}

- name: helm render/install eoapi templates
- id: helm-render-install-eoapi-templates
name: helm render/install eoapi templates
continue-on-error: true
run: |
export GITSHA='${{github.sha}}'

@@ -87,18 +89,53 @@ jobs:
-f ./eoapi/test-k3s-unittest-values.yaml \
./eoapi

- name: sleep for 10s seconds while services boot
shell: bash
run: sleep 10s
exit $?

# - name: Setup upterm session
# uses: lhotari/action-upterm@v1
- name: debug pgstac-migrate job failure
if: steps.helm-render-install-eoapi-templates.outcome == 'failure'
continue-on-error: true
run: |
echo "Extracting pgstac-migrate job info and logs for debugging..."

# Get job details
echo "===== pgstac-migrate Job Details ====="
kubectl get job pgstac-migrate -o yaml || echo "Could not get pgstac-migrate job details"

# Get pod details
echo "===== pgstac-migrate Pod Details ====="
kubectl get pods -l app=pgstac-migrate --all-namespaces || echo "Could not find pgstac-migrate pods"

# Extract logs from pgstac-migrate pod(s)
echo "===== pgstac-migrate Pod Logs ====="
echo "Looking for completed pods from pgstac-migrate job..."
COMPLETED_PODS=$(kubectl get pods --selector=job-name=pgstac-migrate --field-selector=status.phase=Succeeded,status.phase=Failed -o jsonpath='{.items[*].metadata.name}' 2>/dev/null)
if [ -n "$COMPLETED_PODS" ]; then
echo "Found completed pods from pgstac-migrate job. Extracting logs from each:"
for POD in $COMPLETED_PODS; do
echo "--- Logs from completed pod $POD ---"
kubectl logs pod/$POD || echo "Could not get logs from pod $POD"
done
else
echo "No completed pods found for pgstac-migrate job"
fi

# Get details about the database pods/services
echo "===== Database Pod/Service Details ====="
# Find database service
kubectl get svc | grep "db" || echo "Could not find database services"
# Find database pods
kubectl get pods | grep "db-" || echo "Could not find database pods"

# Check for any events related to the job or pods
echo "===== Related Kubernetes Events ====="
kubectl get events | grep -E "pgstac|db" || echo "No relevant events found"

- id: watchservices
name: watch services boot
timeout-minutes: 10
timeout-minutes: 3
continue-on-error: true
run: |
# Now wait for services to boot up
while [[ -z "$(kubectl get pod | grep "^raster-$RELEASE_NAME-.*$" | cut -d' ' -f1 | xargs -I{} kubectl logs pod/{} | grep "GET /.*/healthz" | head -n 1)" ]]; do
echo "still waiting for raster service to start..."
sleep 1
@@ -124,14 +161,6 @@ jobs:
echo "===== Pod Status ====="
kubectl get pods

# Extract logs from database pod
echo "===== Database Pod Logs ====="
kubectl get pod | grep "^db-$RELEASE_NAME" | cut -d' ' -f1 | xargs -I{} kubectl logs pod/{} --tail=100 || echo "Could not get database logs"

# Extract logs from pgstacbootstrap pod
echo "===== PGSTACBootstrap Pod Logs ====="
kubectl get pod | grep "^pgstacbootstrap-$RELEASE_NAME" | cut -d' ' -f1 | xargs -I{} kubectl logs pod/{} --tail=100 || echo "Could not get pgstacbootstrap logs"

# Extract logs from raster pod init container (wait-for-pgstacbootstrap)
echo "===== Raster Pod Init Container Logs (wait-for-pgstacbootstrap) ====="
kubectl get pod | grep "^raster-$RELEASE_NAME" | cut -d' ' -f1 | xargs -I{} kubectl logs pod/{} -c wait-for-pgstacbootstrap --tail=100 || echo "Could not get raster init container logs"
@@ -161,7 +190,7 @@ jobs:

- name: run the tests
id: testrunner
continue-on-error: true
# continue-on-error: true
run: |
kubectl get svc --all-namespaces
kubectl get ingress --all-namespaces -o jsonpath='{range .items[0]}kubectl describe ingress {.metadata.name} -n {.metadata.namespace}{end}' | sh
@@ -192,14 +221,6 @@ jobs:
echo "===== Pod Status ====="
kubectl get pods

# Extract logs from database pod
echo "===== Database Pod Logs ====="
kubectl get pod | grep "^db-$RELEASE_NAME" | cut -d' ' -f1 | xargs -I{} kubectl logs pod/{} --tail=100 || echo "Could not get database logs"

# Extract logs from pgstacbootstrap pod
echo "===== PGSTACBootstrap Pod Logs ====="
kubectl get pod | grep "^pgstacbootstrap-$RELEASE_NAME" | cut -d' ' -f1 | xargs -I{} kubectl logs pod/{} --tail=100 || echo "Could not get pgstacbootstrap logs"

# Extract logs from raster pod init container (wait-for-pgstacbootstrap)
echo "===== Raster Pod Init Container Logs (wait-for-pgstacbootstrap) ====="
kubectl get pod | grep "^raster-$RELEASE_NAME" | cut -d' ' -f1 | xargs -I{} kubectl logs pod/{} -c wait-for-pgstacbootstrap --tail=100 || echo "Could not get raster init container logs"
@@ -225,4 +246,4 @@ jobs:

- name: helm uninstall eoapi templates
run: |
helm uninstall $RELEASE_NAME
helm uninstall $RELEASE_NAME
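
A note on the pod lookup in the debug step above: Kubernetes combines comma-separated field selectors with AND, so `status.phase=Succeeded,status.phase=Failed` cannot match any pod, and the step would likely always report "No completed pods found". A minimal sketch of one alternative is to exclude the unfinished phases instead. The function name `finished_job_pods` is hypothetical and not part of this PR.

```shell
# Hypothetical helper: print the names of pods created by a job that have
# finished, whether they succeeded or failed.
finished_job_pods() {
  job="$1"
  # Comma-separated field selectors are ANDed, so list the phases to exclude
  # rather than the two finished phases to include.
  kubectl get pods \
    --selector="job-name=${job}" \
    --field-selector='status.phase!=Pending,status.phase!=Running,status.phase!=Unknown' \
    -o jsonpath='{.items[*].metadata.name}' 2>/dev/null
}
```

In the workflow this could be used as `COMPLETED_PODS=$(finished_job_pods pgstac-migrate)`, leaving the rest of the loop unchanged.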
141 changes: 141 additions & 0 deletions helm-chart/eoapi/README.md
@@ -0,0 +1,141 @@
# EOAPI Helm Chart

This Helm chart deploys the EOAPI (Earth Observation API) stack, which includes STAC API, raster tile services, vector tile services, and a multidimensional data service.

## Overview

The chart sets up:

- A PostgreSQL database with PostGIS and PgSTAC extensions
- STAC API service for metadata discovery and search
- TiTiler for raster tile services
- TIPG for vector tile services
- Optional multidimensional data service

## Prerequisites

- Kubernetes 1.16+
- Helm 3.0+
- PV provisioner support in the underlying infrastructure
- CrunchyData Postgres Operator (for the PostgreSQL database)

## Installation

```bash
# Install Postgres Operator first
helm install --set disable_check_for_upgrades=true pgo oci://registry.developers.crunchydata.com/crunchydata/pgo

# Then install eoapi
helm install eoapi ./eoapi
```

## Configuration

The chart can be configured via `values.yaml`. See the chart's `values.yaml` file for all available options and detailed descriptions.

Key configuration sections:

```yaml
# Services to enable
apiServices:
  - raster
  - stac
  - vector
  # - multidim (disabled by default)

# Ingress configuration
ingress:
  enabled: true
  className: "nginx"
  # ...

# Database configuration
postgrescluster:
  enabled: true
  # ...
```

## PgSTAC Bootstrap Process

The chart includes a streamlined process for initializing and setting up the PgSTAC database.

### PgSTAC Bootstrap Overview

The setup process consists of two main jobs:

1. **pgstac-migrate job**: Runs `pypgstac migrate` to initialize the database schema, then applies the PgSTAC settings and sets the necessary permissions.
2. **pgstac-load-samples job**: (Optional) Loads sample STAC data; it runs only when sample loading is enabled.

### Improvements in PgSTAC Bootstrap

- Replaces the custom Python script with the `pypgstac migrate` command
- Moves SQL settings to a dedicated SQL file for better maintainability
- Separates sample data loading into an optional job
- Uses standard PostgreSQL environment variables
- Keeps the process idempotent so it is safe to re-run
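
As a rough illustration of the points above, the migrate step could be sketched as a small shell function. This is not the chart's actual job script. It assumes `pypgstac` and `psql` are on the PATH and pick up their connection details from the standard `PG*` environment variables, and the settings file path is hypothetical.

```shell
# Sketch of a migrate-job entrypoint. Connection details are expected to come
# from the standard PG* environment variables (PGHOST, PGUSER, PGDATABASE, ...).
run_pgstac_bootstrap() {
  # Hypothetical mount path for the settings file; adjust to the real mount.
  settings_file="${1:-/opt/settings/pgstac-settings.sql}"

  # Each step runs only if the previous one succeeded.
  pypgstac pgready \
    && pypgstac migrate \
    && psql -v ON_ERROR_STOP=1 -f "$settings_file"
}
```

Because the settings file deletes each setting before inserting it, running the function a second time should leave the database in the same state.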

### PgSTAC Directory Structure

The codebase has been reorganized to separate different types of files:

- `initdb-data/settings/`: Contains configuration settings like the PgSTAC settings SQL file
- `initdb-data/samples/`: Contains sample data files that are loaded only when sample loading is enabled

### PgSTAC Configuration

- Enable/disable the setup process through `pgstacBootstrap.enabled`
- Control sample data loading:
  - New approach: `pgstacBootstrap.settings.loadSamples` (recommended)
  - Legacy approach: `pgstacBootstrap.settings.envVars.LOAD_FIXTURES` (deprecated)

Example configuration:

```yaml
pgstacBootstrap:
  enabled: true
  settings:
    # General configuration options
    loadSamples: true # Set to false to disable sample data loading

    resources:
      requests:
        cpu: "512m"
        memory: "1024Mi"
      limits:
        cpu: "512m"
        memory: "1024Mi"
```
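
To turn sample loading off without editing the chart's `values.yaml`, one option is a small override file. The sketch below only writes that file; the name `eoapi-overrides.yaml` is an arbitrary choice for illustration.

```shell
# Write a small override file (hypothetical name) that keeps the bootstrap
# enabled but disables sample data loading.
cat > eoapi-overrides.yaml <<'EOF'
pgstacBootstrap:
  enabled: true
  settings:
    loadSamples: false
EOF
```

The file could then be passed at install or upgrade time, for example with `helm upgrade --install eoapi ./eoapi -f eoapi-overrides.yaml`.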

## Services

### STAC API

The STAC API service provides a standardized way to search and discover geospatial data.

### Raster Services (TiTiler)

Provides dynamic tiling for raster data through the TiTiler implementation.

### Vector Services (TIPG)

Provides vector tile services for PostGIS data through the TIPG implementation.

### Multidimensional Services (Optional)

Provides services for multidimensional data (time series, etc.).

## Persistence

The chart uses PostgreSQL for data persistence. Make sure to configure appropriate storage for production use.

## Upgrading

When upgrading the chart, review any changes to `values.yaml` and any migrations that might need to be applied.

## Uninstallation

```bash
helm delete eoapi
```

Note that PVs may need to be manually deleted if you want to remove all data.
60 changes: 0 additions & 60 deletions helm-chart/eoapi/initdb-data/pgstac-setup.py

This file was deleted.

12 changes: 12 additions & 0 deletions helm-chart/eoapi/initdb-data/settings/pgstac-settings.sql
@@ -0,0 +1,12 @@
-- Apply pgstac settings
Review comment (Member):

@emmanuelmathot is there a reference for what this is doing exactly / what settings these are, etc? If these are settings, do we need to make them configurable in values? Am guessing this is something just a bit silly to get things to work, but might be nice to have a comment with a link to what these settings are or so?

Reply (emmanuelmathot, Contributor, Author, Apr 17, 2025):

Yes, the idea is to have them in the values. I actually want to have a proper mechanism via pypgstac cli with something like pypgstac load settings that may be used not only in this chart. PR to come in pgstac.

DELETE FROM pgstac.pgstac_settings WHERE name = 'context';
INSERT INTO pgstac.pgstac_settings (name, value) VALUES ('context', 'auto');

DELETE FROM pgstac.pgstac_settings WHERE name = 'context_estimated_count';
INSERT INTO pgstac.pgstac_settings (name, value) VALUES ('context_estimated_count', '100000');

DELETE FROM pgstac.pgstac_settings WHERE name = 'context_estimated_cost';
INSERT INTO pgstac.pgstac_settings (name, value) VALUES ('context_estimated_cost', '100000');

DELETE FROM pgstac.pgstac_settings WHERE name = 'context_stats_ttl';
INSERT INTO pgstac.pgstac_settings (name, value) VALUES ('context_stats_ttl', '1 day');
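
Until the mechanism discussed in the review thread exists, one way to make these settings configurable would be to generate the same DELETE/INSERT pairs from `name=value` arguments. The sketch below is only an illustration of that idea: the helper `pgstac_settings_sql` is hypothetical and is not part of this PR or of pypgstac.

```shell
# Hypothetical helper: print the DELETE/INSERT pair used in pgstac-settings.sql
# for each name=value argument.
pgstac_settings_sql() {
  for pair in "$@"; do
    name="${pair%%=*}"   # text before the first '='
    value="${pair#*=}"   # text after the first '='
    # Double any single quotes so the values are safe inside SQL string literals.
    name="$(printf '%s' "$name" | sed "s/'/''/g")"
    value="$(printf '%s' "$value" | sed "s/'/''/g")"
    printf "DELETE FROM pgstac.pgstac_settings WHERE name = '%s';\n" "$name"
    printf "INSERT INTO pgstac.pgstac_settings (name, value) VALUES ('%s', '%s');\n" "$name" "$value"
  done
}
```

For example, `pgstac_settings_sql context=auto 'context_stats_ttl=1 day' | psql -v ON_ERROR_STOP=1` would apply two of the settings above, assuming `psql` can reach the database through the standard PostgreSQL environment variables.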
1 change: 0 additions & 1 deletion helm-chart/eoapi/templates/db/configmap.yaml
@@ -1,5 +1,4 @@
{{- if .Values.db.enabled }}
---
apiVersion: v1
kind: ConfigMap
metadata: