Description
@cloudnativegeo/cng-editorial-board will review this submission.
Blog Post Title
The Technical Debt of Earth Embedding Products
Author(s)
Isaac Corley
Summary
This post examines why Earth embedding products — geospatial foundation models trained on massive satellite imagery — fail to integrate seamlessly despite individual technical excellence. Drawing from hands-on experience integrating seven embedding products (Clay, Major TOM, Earth Index, Copernicus-Embed, Presto, Tessera, AlphaEarth) into TorchGeo, it catalogs the real integration costs, storage math, and format fragmentation that downstream users face.
Why this post is relevant to Cloud Native Geo
The post directly advocates for cloud-native formats (GeoParquet, COG, GeoZarr) as the solution to the current fragmentation in Earth embedding distribution. It provides concrete storage and egress cost analysis showing why format and hosting choices matter at scale, and includes actionable checklists for both producers and consumers of embeddings. The themes — standardization, interoperability, and reducing friction for downstream users — are core to the CNG mission.
Timeline
Draft PR submitted here #115
- Draft submission date: 2026-02-28
- Final publication date: 2026-03-14 (I put 2 weeks out but soonest available is fine)
Anything else to share?
Cross-posted from my personal site, along with the companion paper: Earth Embeddings as Products: Taxonomy, Ecosystem, and Standardized Access.