[GH-2657] Upgrade proj4sedona to 0.0.4 and adopt UrlCRSProvider #2658

Merged: jiayuasu merged 6 commits into master from fix/upgrade-proj4sedona-url-crs-provider-2657 on Feb 18, 2026


@jiayuasu commented Feb 18, 2026

Did you read the Contributor Guide?

Is this PR related to a ticket?

What changes were proposed in this PR?

Upgrade proj4sedona from 0.0.3 to 0.0.4 and adopt the new UrlCRSProvider API, allowing users to resolve CRS definitions from a remote HTTP server (e.g., a GitHub repo or S3 bucket) before falling back to built-in definitions.

Changes

Dependency upgrade

  • pom.xml: bump proj4sedona.version from 0.0.3 to 0.0.4

New Spark configuration keys (in SedonaConf.java)

  • spark.sedona.crs.url.base (default: empty string, disabled) — Base URL of the CRS definition server
  • spark.sedona.crs.url.pathTemplate (default: /{authority}/{code}.json) — URL path template with {authority} and {code} placeholders
  • spark.sedona.crs.url.format (default: projjson) — Response format: projjson, proj, wkt1, or wkt2
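To make the three keys concrete, here is a minimal sketch of how a base URL, a path template, and a format string might be interpreted. It is not the proj4sedona implementation: `CrsUrlConfigSketch`, its nested `Format` enum, and the host `crs.example.org` are hypothetical names introduced for illustration. The fallback to PROJJSON for null, empty, or unknown values follows the test descriptions in this PR; case-insensitive matching and the handling of a trailing slash on the base URL are assumptions.

```java
import java.util.Locale;

class CrsUrlConfigSketch {
    /** Hypothetical stand-in for the format enum that the config string maps to. */
    enum Format { PROJJSON, PROJ, WKT1, WKT2 }

    /** Maps a config string to a Format; null, empty, or unknown values fall back to PROJJSON. */
    static Format parseFormat(String value) {
        if (value == null) {
            return Format.PROJJSON;
        }
        switch (value.trim().toLowerCase(Locale.ROOT)) {
            case "proj":
                return Format.PROJ;
            case "wkt1":
                return Format.WKT1;
            case "wkt2":
                return Format.WKT2;
            default:
                return Format.PROJJSON; // covers "projjson", "", and anything unrecognized
        }
    }

    /** Substitutes {authority} and {code} into the template and appends the result to the base URL. */
    static String expand(String baseUrl, String pathTemplate, String authority, String code) {
        String path = pathTemplate.replace("{authority}", authority).replace("{code}", code);
        // Assumption: tolerate a trailing slash on the base and a missing leading slash on the path.
        String base = baseUrl.endsWith("/") ? baseUrl.substring(0, baseUrl.length() - 1) : baseUrl;
        return base + (path.startsWith("/") ? path : "/" + path);
    }
}
```

With the default template, `expand("https://crs.example.org", "/{authority}/{code}.json", "EPSG", "4326")` yields `https://crs.example.org/EPSG/4326.json`. Whether the real provider normalizes the case of the authority is not stated in this PR, so the sketch passes it through unchanged.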

Registration logic (in FunctionsProj4.java)

  • registerUrlCrsProvider(baseUrl, pathTemplate, format): registers a UrlCRSProvider with proj4sedona Defs registry at priority 50 (before built-in at 100)
  • Thread-safe via double-checked locking: fast path is lock-free (volatile read + String.equals), synchronized slow path executes at most once per JVM for a given configuration
  • parseCrsFormat(String): maps config string to CRSResult.Format enum
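The registration pattern described above can be sketched roughly as follows. This is an illustration of double-checked locking under stated assumptions, not the code in FunctionsProj4.java: `UrlProviderRegistrarSketch` is a hypothetical class, and a plain list of strings stands in for the proj4sedona Defs registry.

```java
import java.util.ArrayList;
import java.util.List;

class UrlProviderRegistrarSketch {
    // Hypothetical stand-in for the provider registry; it only ever holds the URL provider's key.
    private static final List<String> registry = new ArrayList<>();

    // Key of the configuration that is currently registered.
    // volatile so that the lock-free fast path observes the most recent write.
    private static volatile String registeredKey = null;

    private static final Object LOCK = new Object();

    static void register(String baseUrl, String pathTemplate, String format) {
        if (baseUrl == null || baseUrl.isEmpty()) {
            return; // feature disabled: no-op
        }
        String key = baseUrl + "|" + pathTemplate + "|" + format;
        if (key.equals(registeredKey)) {
            return; // fast path: one volatile read plus String.equals, no lock
        }
        synchronized (LOCK) {
            if (key.equals(registeredKey)) {
                return; // another thread registered the same config while we waited
            }
            registry.clear();    // remove the provider left over from an older config, if any
            registry.add(key);   // register the provider for the new config
            registeredKey = key; // publish last, after the registry is consistent
        }
    }

    /** Returns a copy of the registry contents, taken under the same lock as writers. */
    static List<String> snapshot() {
        synchronized (LOCK) {
            return new ArrayList<>(registry);
        }
    }
}
```

Writing `registeredKey` only after the registry has been updated means a thread that takes the fast path never sees a key whose provider is not yet registered.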

ST_Transform integration (in Functions.scala)

  • ST_Transform captures the 3 new config values on the driver via SedonaConf and serializes them to executors
  • Registration happens inside lazy val f on executors during row evaluation
  • Companion object readConfig() consolidates all config reading
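The driver/executor split above is implemented in Scala with a `lazy val`; the following is a rough Java analogue, included only to illustrate the idea. `TransformExprSketch` and its counter are hypothetical, and Java serialization stands in for however Spark ships the expression to executors. The config strings travel with the object, while the registration side effect is deferred until the first row is evaluated on the receiving side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.atomic.AtomicInteger;

class TransformExprSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // Counts registrations in this JVM; a stand-in for the real registration side effect.
    static final AtomicInteger registrations = new AtomicInteger();

    // Plain strings captured on the "driver"; they are serialized with the object.
    private final String baseUrl;
    private final String pathTemplate;
    private final String format;

    // transient: not serialized, so it is false again after deserialization.
    private transient volatile boolean initialized;

    TransformExprSketch(String baseUrl, String pathTemplate, String format) {
        this.baseUrl = baseUrl;
        this.pathTemplate = pathTemplate;
        this.format = format;
    }

    /** Called once per row; the first call on a given instance triggers registration. */
    void eval() {
        if (!initialized) {
            synchronized (this) {
                if (!initialized) {
                    if (!baseUrl.isEmpty()) {
                        registrations.incrementAndGet(); // the real code would register the provider here
                    }
                    initialized = true;
                }
            }
        }
        // The actual coordinate transformation would run here.
    }

    /** Serializes and deserializes the expression, imitating shipment to an executor. */
    static TransformExprSketch roundTrip(TransformExprSketch expr) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(expr);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TransformExprSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Constructing the object and round-tripping it performs no registration; only the first `eval()` on the deserialized copy does, which mirrors the goal of never registering on the driver during query planning.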

Documentation (in CRS-Transformation.md and Parameter.md)

  • New "URL CRS Provider" section with hosting guidance (GitHub repo, S3 bucket)
  • Examples: GitHub raw URL, self-hosted server, custom authority codes (MYORG:1001), geometry SRID usage
  • Config parameter reference in Parameter.md

How was this patch tested?

Unit tests (FunctionsProj4Test.java — 42 tests)

  • testRegisterUrlCrsProviderNoOpOnNullOrEmpty: null/empty baseUrl is a no-op
  • testRegisterUrlCrsProviderRegistersAndIsIdempotent: single registration, no duplicates on repeat call
  • testRegisterUrlCrsProviderReRegistersOnConfigChange: config change triggers re-registration
  • testParseCrsFormatAllMappings: all format strings map correctly
  • testParseCrsFormatDefaultsAndCaseInsensitive: null, empty, and unknown values default to PROJJSON; matching is case-insensitive
  • testTransformWithLocalUrlCrsProvider: local HTTP server serves fake EPSG:990001, verifies URL provider resolves custom code
  • testRegisterUrlCrsProviderConcurrentThreadSafety: 16 threads race into registration via CyclicBarrier, asserts exactly 1 provider registered
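The shape of the concurrency test can be sketched as below. This is not the test from FunctionsProj4Test.java: `ConcurrentRegistrationSketch` is a hypothetical, self-contained class with its own tiny double-checked registrar, and a counter stands in for inspecting the real registry. A `CyclicBarrier` holds all threads until every one has arrived, so they enter `register` as close to simultaneously as possible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ConcurrentRegistrationSketch {
    // Number of times the slow path actually registered something.
    static final AtomicInteger registrations = new AtomicInteger();

    private static volatile String registeredKey;
    private static final Object LOCK = new Object();

    static void register(String key) {
        if (key.equals(registeredKey)) {
            return; // lock-free fast path
        }
        synchronized (LOCK) {
            if (key.equals(registeredKey)) {
                return; // lost the race; someone else already registered
            }
            registrations.incrementAndGet();
            registeredKey = key;
        }
    }

    /** Releases all workers at once through a barrier and returns the registration count afterwards. */
    static int race(int threads, String key) {
        // The pool must be at least as large as the barrier, or the barrier never trips.
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CyclicBarrier barrier = new CyclicBarrier(threads);
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < threads; i++) {
                futures.add(pool.submit(() -> {
                    barrier.await(); // block until every worker is ready
                    register(key);
                    return null;
                }));
            }
            for (Future<?> future : futures) {
                future.get(10, TimeUnit.SECONDS); // surfaces any worker failure
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
        return registrations.get();
    }
}
```

Calling `race(16, someKey)` on a fresh JVM is expected to return 1: all sixteen threads may miss the fast path, but only the first to acquire the lock passes the second check.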

Integration tests (CRSTransformProj4Test.scala — 36 tests, 4 new)

  • should still transform correctly when URL provider is not configured
  • should fall back to built-in when URL provider returns nothing
  • should register URL CRS provider when config is set
  • should transform using local HTTP URL CRS provider with custom CRS
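The fallback behaviour these tests exercise amounts to a priority-ordered chain of providers. The sketch below is a hypothetical model of that idea, not the proj4sedona Defs registry: `ProviderChainSketch`, the provider names, and the lookup lambdas are all invented for illustration. Only the priorities (50 for the URL provider, 100 for built-in definitions) come from this PR.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

class ProviderChainSketch {
    /** A named provider with a priority; lower numbers are consulted first. */
    static final class Entry {
        final int priority;
        final String name;
        final Function<String, Optional<String>> lookup;

        Entry(int priority, String name, Function<String, Optional<String>> lookup) {
            this.priority = priority;
            this.name = name;
            this.lookup = lookup;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void register(int priority, String name, Function<String, Optional<String>> lookup) {
        entries.add(new Entry(priority, name, lookup));
        entries.sort(Comparator.comparingInt((Entry e) -> e.priority));
    }

    /** Returns "providerName:definition" from the first provider that knows the code. */
    Optional<String> resolve(String code) {
        for (Entry entry : entries) {
            Optional<String> definition = entry.lookup.apply(code);
            if (definition.isPresent()) {
                return Optional.of(entry.name + ":" + definition.get());
            }
        }
        return Optional.empty();
    }

    /** Builds a chain with a fake built-in provider (100) and a fake URL provider (50). */
    static ProviderChainSketch demo() {
        ProviderChainSketch chain = new ProviderChainSketch();
        chain.register(100, "builtin", code -> code.equals("EPSG:4326")
                ? Optional.of("+proj=longlat +datum=WGS84 +no_defs")
                : Optional.empty());
        chain.register(50, "url", code -> code.equals("MYORG:1001")
                ? Optional.of("custom-definition") // placeholder, not a real CRS definition
                : Optional.empty());
        return chain;
    }
}
```

In this model, `demo().resolve("MYORG:1001")` is answered by the URL provider, while `demo().resolve("EPSG:4326")` falls through to the built-in provider because the URL provider returns nothing for that code.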

Config tests (SedonaConfTest.java — 9 tests, 6 new)

  • Default values, custom overrides, empty string handling for all 3 config keys

Run commands:

mvn test -pl common -Dtest=FunctionsProj4Test
mvn test -pl spark/common -Dlog4j.version=2.19.0 -Dtest=SedonaConfTest
mvn scalatest:test -pl spark/common -Dlog4j.version=2.19.0 -Dsuites=org.apache.sedona.sql.CRSTransformProj4Test

Did this PR include necessary documentation updates?

  • Yes, I have updated the documentation.

- Bump proj4sedona.version from 0.0.3 to 0.0.4
- Add 3 new Spark configs: spark.sedona.crs.url.base,
  spark.sedona.crs.url.pathTemplate, spark.sedona.crs.url.format
- Add registerUrlCrsProvider() in FunctionsProj4 with thread-safe
  idempotent registration (AtomicReference, priority 50)
- Wire ST_Transform to capture URL CRS config on driver and register
  provider on executors via companion object readConfig()
- Add tests: 6 unit tests (FunctionsProj4Test), 6 config tests
  (SedonaConfTest), 4 integration tests (CRSTransformProj4Test)
  using local HTTP server with fake EPSG:990001

1. Move URL CRS provider registration from ST_Transform class body into
   lazy val f, so it only executes on executors during row evaluation,
   never on the driver during query planning.

2. Wrap registerUrlCrsProvider's remove-register-set sequence in a
   synchronized block with double-checked locking. The fast path
   (already registered) is lock-free (volatile read + String.equals).

3. Add 16-thread concurrency test verifying no duplicate providers
   are registered under contention.

- Parameter.md: document spark.sedona.crs.url.base, pathTemplate, format
- CRS-Transformation.md: add URL CRS Provider section with hosting
  guidance (GitHub repo, S3), supported formats table, GitHub raw URL
  example, self-hosted server example, custom authority codes example
  (MYORG:1001), and instructions for disabling
- All examples use Python SedonaContext.builder().config() style
Copilot AI left a comment


Pull request overview

This pull request upgrades proj4sedona from version 0.0.3 to 0.0.4 and introduces a new URL-based CRS Provider feature that enables users to resolve custom CRS definitions from remote HTTP servers (such as GitHub repositories or S3 buckets) before falling back to built-in definitions. The feature is designed to address use cases where users need custom or internal coordinate reference system definitions not included in standard CRS databases, particularly relevant for specialized transformations as described in issue #1397.

Changes:

  • Upgraded proj4sedona dependency from 0.0.3 to 0.0.4 in pom.xml
  • Added three new Spark configuration parameters for URL-based CRS resolution: spark.sedona.crs.url.base, spark.sedona.crs.url.pathTemplate, and spark.sedona.crs.url.format
  • Implemented thread-safe URL CRS provider registration logic in FunctionsProj4.java using double-checked locking pattern
  • Integrated URL provider registration into ST_Transform expression with lazy evaluation on executors
  • Added comprehensive test coverage including unit tests, integration tests, and concurrency tests
  • Documented the new feature with multiple usage examples in CRS-Transformation.md and Parameter.md

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| pom.xml | Bumped proj4sedona version from 0.0.3 to 0.0.4 |
| spark/common/src/main/java/org/apache/sedona/core/utils/SedonaConf.java | Added three new configuration fields and getter methods for URL CRS provider settings |
| spark/common/src/test/java/org/apache/sedona/core/utils/SedonaConfTest.java | Added 6 new unit tests validating default values and custom configurations for URL CRS provider settings |
| common/src/main/java/org/apache/sedona/common/FunctionsProj4.java | Implemented thread-safe registerUrlCrsProvider() method with double-checked locking and parseCrsFormat() helper |
| common/src/test/java/org/apache/sedona/common/FunctionsProj4Test.java | Added 7 comprehensive unit tests including thread safety, idempotency, config changes, format parsing, and local HTTP server integration |
| spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/Functions.scala | Modified ST_Transform to capture and serialize URL provider config from driver to executors, with lazy registration in executor evaluation |
| spark/common/src/test/scala/org/apache/sedona/sql/CRSTransformProj4Test.scala | Added 4 integration tests covering default behavior, fallback scenarios, provider registration verification, and end-to-end custom CRS transformation |
| docs/api/sql/Parameter.md | Documented the three new configuration parameters with descriptions, defaults, examples, and supported values |
| docs/api/sql/CRS-Transformation.md | Added comprehensive "URL CRS Provider" section with hosting guidance, configuration instructions, format table, and 5 practical examples |

Copilot AI left a comment


Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.


Copilot AI left a comment


Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated no new comments.



@jiayuasu jiayuasu merged commit e2db567 into master Feb 18, 2026
48 checks passed