Commit 9604353

pantierra and j08lue authored
Added ingest.sh script. (#164)
Co-authored-by: Jonas <[email protected]>
1 parent 2258f0b commit 9604353

4 files changed: +121 -1 lines changed

Makefile (+7 -1)

@@ -5,7 +5,7 @@ HELM_REPO_URL=https://devseed.com/eoapi-k8s/
 HELM_CHART_NAME=eoapi/eoapi
 PGO_CHART_VERSION=5.7.0
 
-.PHONY: all deploy minikube help
+.PHONY: all deploy minikube ingest help
 
 # Default target
 all: deploy
@@ -31,8 +31,14 @@ minikube:
 	@echo "eoAPI is now available at:"
 	@minikube service ingress-nginx-controller -n ingress-nginx --url | head -n 1
 
+ingest:
+	@echo "Ingesting STAC collections and items into the database."
+	@command -v bash >/dev/null 2>&1 || { echo "bash is required but not installed"; exit 1; }
+	@./ingest.sh || { echo "Ingestion failed."; exit 1; }
+
 help:
 	@echo "Makefile commands:"
 	@echo " make deploy - Install eoAPI on a cluster kubectl is connected to."
 	@echo " make minikube - Install eoAPI on minikube."
+	@echo " make ingest - Ingest STAC collections and items into the database."
 	@echo " make help - Show this help message."

README.md (+1)

@@ -56,4 +56,5 @@ Instead of using the `make` commands above you can also [manually `helm install`
 
 * Read about [Default Configuration](./docs/configuration.md#default-configuration) and
 other [Configuration Options](./docs/configuration.md#additional-options)
+* [Manage your data](./docs/manage-data.md) in eoAPI
 * Learn about [Autoscaling / Monitoring / Observability](./docs/autoscaling.md)

docs/manage-data.md (+34)

@@ -0,0 +1,34 @@
+# Data management
+
+eoAPI-k8s provides a basic data ingestion process that consists of manual operations on the components of the stack.
+
+# Load data
+
+You need STAC records for the collections and items you wish to load (e.g., `collections.json` and `items.json`).
+[This repo](https://github.com/vincentsarago/MAXAR_opendata_to_pgstac) contains a few scripts that may help you generate sample input data.
+
+## Preshipped bash script
+
+Execute `make ingest` to load data into the eoAPI service. It expects `collections.json` and `items.json` in the current directory.
+
+## Manual steps
+
+To add raster data to eoAPI, load STAC collections and items into the PostgreSQL database using pgSTAC and the `pypgstac` tool.
+
+First, ensure your Kubernetes cluster is running and `kubectl` is configured to access and modify it.
+
+Second, upload the data into the pod running the raster eoAPI service. You can use the following commands to copy the data:
+
+```bash
+kubectl cp collections.json "$NAMESPACE/$EOAPI_POD_RASTER":/tmp/collections.json
+kubectl cp items.json "$NAMESPACE/$EOAPI_POD_RASTER":/tmp/items.json
+```
+Then open a shell in the pod or server running the raster eoAPI service and use the following commands to load the data:
+
+```bash
+#!/bin/bash
+apt update -y && apt install python3 python3-pip -y && pip install "pypgstac[psycopg]"
+pypgstac pgready --dsn "$PGADMIN_URI"
+pypgstac load collections /tmp/collections.json --dsn "$PGADMIN_URI" --method insert_ignore
+pypgstac load items /tmp/items.json --dsn "$PGADMIN_URI" --method insert_ignore
+```
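
The manual steps in `docs/manage-data.md` use `$NAMESPACE` and `$EOAPI_POD_RASTER` without showing how to set them. The following is a minimal sketch of one way to do so, assuming the raster pod carries the `app=raster-eoapi` label that `ingest.sh` in this commit selects on. The function name `find_raster_pod` and the `eoapi` namespace in the usage comment are illustrative assumptions, not part of the commit.

```shell
#!/bin/bash

# find_raster_pod NAMESPACE  (hypothetical helper, not part of the commit)
# Prints the name of the first pod labelled app=raster-eoapi in NAMESPACE.
# Returns non-zero if kubectl fails or no pod name comes back.
find_raster_pod() {
    local ns="$1" pod
    pod=$(kubectl get pods -n "$ns" -l app=raster-eoapi \
        -o jsonpath='{.items[0].metadata.name}' 2>/dev/null) || return 1
    [ -n "$pod" ] || return 1
    printf '%s\n' "$pod"
}

# Usage (the namespace "eoapi" is an assumption; adjust to your deployment):
#   NAMESPACE="eoapi"
#   EOAPI_POD_RASTER=$(find_raster_pod "$NAMESPACE") || echo "no raster pod found" >&2
```

Keeping the lookup in a function makes the failure case explicit, so the `kubectl cp` commands are never run against an empty pod name.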

ingest.sh (+79)

@@ -0,0 +1,79 @@
+#!/bin/bash
+
+# Default files
+DEFAULT_COLLECTIONS_FILE="./collections.json"
+DEFAULT_ITEMS_FILE="./items.json"
+
+# Check for provided parameters or use defaults
+if [ "$#" -eq 2 ]; then
+    EOAPI_COLLECTIONS_FILE="$1"
+    EOAPI_ITEMS_FILE="$2"
+else
+    EOAPI_COLLECTIONS_FILE="$DEFAULT_COLLECTIONS_FILE"
+    EOAPI_ITEMS_FILE="$DEFAULT_ITEMS_FILE"
+    echo "No specific files provided. Using defaults:"
+    echo " Collections file: $EOAPI_COLLECTIONS_FILE"
+    echo " Items file: $EOAPI_ITEMS_FILE"
+fi
+
+# Define namespaces
+NAMESPACES=("default" "eoapi" "data-access")
+EOAPI_POD_RASTER=""
+FOUND_NAMESPACE=""
+
+# Discover the pod name by checking each candidate namespace
+for NS in "${NAMESPACES[@]}"; do
+    EOAPI_POD_RASTER=$(kubectl get pods -n "$NS" -l app=raster-eoapi -o jsonpath="{.items[0].metadata.name}" 2>/dev/null)
+    if [ -n "$EOAPI_POD_RASTER" ]; then
+        FOUND_NAMESPACE="$NS"
+        echo "Found raster-eoapi pod: $EOAPI_POD_RASTER in namespace: $FOUND_NAMESPACE"
+        break
+    fi
+done
+
+# Check if the pod was found
+if [ -z "$EOAPI_POD_RASTER" ]; then
+    echo "Could not determine raster-eoapi pod."
+    exit 1
+fi
+
+# Check if input files exist
+for FILE in "$EOAPI_COLLECTIONS_FILE" "$EOAPI_ITEMS_FILE"; do
+    if [ ! -f "$FILE" ]; then
+        echo "File not found: $FILE. You may pass the collections and items files as the first and second arguments to this script."
+        exit 1
+    fi
+done
+
+# Install required packages
+echo "Installing required packages in pod $EOAPI_POD_RASTER in namespace $FOUND_NAMESPACE..."
+if ! kubectl exec -n "$FOUND_NAMESPACE" "$EOAPI_POD_RASTER" -- bash -c 'apt update -y && apt install python3 python3-pip -y && pip install "pypgstac[psycopg]"'; then
+    echo "Failed to install packages."
+    exit 1
+fi
+
+# Copy files to pod
+echo "Copying files to pod..."
+echo "Using collections file: $EOAPI_COLLECTIONS_FILE"
+echo "Using items file: $EOAPI_ITEMS_FILE"
+kubectl cp "$EOAPI_COLLECTIONS_FILE" "$FOUND_NAMESPACE/$EOAPI_POD_RASTER":/tmp/collections.json
+kubectl cp "$EOAPI_ITEMS_FILE" "$FOUND_NAMESPACE/$EOAPI_POD_RASTER":/tmp/items.json
+
+# Load collections and items
+echo "Loading collections..."
+if ! kubectl exec -n "$FOUND_NAMESPACE" "$EOAPI_POD_RASTER" -- bash -c 'pypgstac load collections /tmp/collections.json --dsn "$PGADMIN_URI" --method insert_ignore'; then
+    echo "Failed to load collections."
+    exit 1
+fi
+
+echo "Loading items..."
+if ! kubectl exec -n "$FOUND_NAMESPACE" "$EOAPI_POD_RASTER" -- bash -c 'pypgstac load items /tmp/items.json --dsn "$PGADMIN_URI" --method insert_ignore'; then
+    echo "Failed to load items."
+    exit 1
+fi
+
+# Clean temporary files
+echo "Cleaning temporary files..."
+kubectl exec -n "$FOUND_NAMESPACE" "$EOAPI_POD_RASTER" -- bash -c 'rm -f /tmp/collections.json /tmp/items.json'
+
+echo "Ingestion complete."
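
One behavior of `ingest.sh` worth noting: any argument count other than two falls back to the default files, so a single mistyped argument is silently ignored. The sketch below shows one possible stricter variant of the argument handling and file check. The function name `resolve_input_files`, the usage message, and the exit codes are hypothetical and not part of the commit.

```shell
#!/bin/bash

# resolve_input_files [COLLECTIONS_FILE ITEMS_FILE]  (hypothetical helper)
# Prints the collections path and the items path, one per line.
# Zero arguments selects the defaults used by ingest.sh; any count other
# than zero or two is rejected instead of silently falling back.
resolve_input_files() {
    case "$#" in
        0) set -- "./collections.json" "./items.json" ;;
        2) ;;
        *) echo "usage: ingest.sh [COLLECTIONS_FILE ITEMS_FILE]" >&2; return 2 ;;
    esac
    local file
    for file in "$@"; do
        if [ ! -f "$file" ]; then
            echo "File not found: $file" >&2
            return 1
        fi
    done
    printf '%s\n' "$1" "$2"
}

# Usage inside a script, before any kubectl call:
#   files=$(resolve_input_files "$@") || exit 1
```

Validating the inputs before touching the cluster means a missing file is reported without first installing packages in the pod.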
