@martinemde martinemde commented Nov 10, 2025

Add a new webhook server binary that:

  • Accepts POST requests to /webhook endpoint
  • Downloads RubyGems versions index from rubygems.org
  • Fetches allowlist from S3
  • Filters index using existing gem-index-filter library
  • Computes SHA-256 checksums
  • Uploads filtered results to S3 with timestamps
  • Maintains latest pointers for easy access
  • Handles graceful shutdown with background task tracking
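
The filtering step can be pictured roughly as follows. This is a minimal std-only sketch, not the gem-index-filter library's actual API: `parse_allowlist` and `filter_index` are hypothetical names, and it assumes the versions index has a header terminated by a `---` line followed by one `name versions checksum` entry per line, with the allowlist holding one gem name per line.

```rust
use std::collections::HashSet;

/// Hypothetical helper: one gem name per line, blank lines ignored.
fn parse_allowlist(text: &str) -> HashSet<&str> {
    text.lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .collect()
}

/// Hypothetical helper: copy the header (everything up to and including
/// the `---` separator) and keep only entries whose gem name is allowed.
fn filter_index(index: &str, allow: &HashSet<&str>) -> String {
    let mut out = String::new();
    let mut in_header = true;
    for line in index.lines() {
        if in_header {
            out.push_str(line);
            out.push('\n');
            if line.trim() == "---" {
                in_header = false;
            }
            continue;
        }
        // The gem name is assumed to be the first space-separated field.
        let name = line.split(' ').next().unwrap_or("");
        if allow.contains(name) {
            out.push_str(line);
            out.push('\n');
        }
    }
    out
}

fn main() {
    let index = "created_at: 2025-11-10T00:00:00Z\n---\nrails 7.0.0,7.0.1 abc\nleftpad 1.0 def\n";
    let allow = parse_allowlist("rails\n");
    print!("{}", filter_index(index, &allow));
}
```

The sketch buffers the whole index in memory for brevity; the existing library is described as streaming, so the real code presumably processes the download incrementally.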

Environment variables:

  • BUCKET_NAME: S3 bucket (default: rubygems-filtered)
  • ALLOWLIST_KEY: S3 key for allowlist (default: allowlist.txt)
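
One way the two variables and their defaults could be read is sketched below. The `Config` struct and its `from_lookup` constructor are hypothetical, introduced only so the default handling can be exercised without touching the process environment; the variable names and default values are the ones listed above.

```rust
use std::env;

/// Hypothetical configuration holder for the two documented variables.
#[derive(Debug, PartialEq)]
struct Config {
    bucket_name: String,
    allowlist_key: String,
}

impl Config {
    /// Build a config from any lookup function, falling back to the
    /// documented defaults when a variable is unset.
    fn from_lookup<F: Fn(&str) -> Option<String>>(lookup: F) -> Self {
        Config {
            bucket_name: lookup("BUCKET_NAME")
                .unwrap_or_else(|| "rubygems-filtered".to_string()),
            allowlist_key: lookup("ALLOWLIST_KEY")
                .unwrap_or_else(|| "allowlist.txt".to_string()),
        }
    }

    /// Read from the real process environment.
    fn from_env() -> Self {
        Self::from_lookup(|key| env::var(key).ok())
    }
}

fn main() {
    println!("{:?}", Config::from_env());
}
```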

Build with: cargo build --bin webhook-server --features server

Technical details:
- Built with Axum 0.7 for async HTTP server
- Uses AWS SDK for S3 operations
- Integrates with existing streaming filter library
- Returns 202 Accepted for async processing
- Strips versions to minimize output size
- All dependencies are optional (server feature flag)
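
The "202 Accepted plus tracked background work" pattern can be illustrated without any server dependencies. The sketch below deliberately swaps in plain `std::thread` in place of Axum handlers and async tasks, so it shows the shape of the idea rather than the server's actual code; `TaskTracker` and `handle_webhook` are hypothetical names.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical tracker: remembers every background job so shutdown
/// can wait for in-flight work instead of dropping it.
struct TaskTracker {
    handles: Mutex<Vec<JoinHandle<()>>>,
}

impl TaskTracker {
    fn new() -> Self {
        TaskTracker { handles: Mutex::new(Vec::new()) }
    }

    /// Start a job on its own thread and record its handle.
    fn spawn<F: FnOnce() + Send + 'static>(&self, job: F) {
        let handle = thread::spawn(job);
        self.handles.lock().unwrap().push(handle);
    }

    /// Wait for all recorded jobs; returns how many were joined.
    fn shutdown(&self) -> usize {
        let handles: Vec<JoinHandle<()>> =
            std::mem::take(&mut *self.handles.lock().unwrap());
        let count = handles.len();
        for handle in handles {
            let _ = handle.join();
        }
        count
    }
}

/// Hypothetical handler: queue the work and answer immediately with
/// 202 Accepted instead of blocking until the job finishes.
fn handle_webhook(tracker: &TaskTracker, completed: Arc<AtomicUsize>) -> u16 {
    tracker.spawn(move || {
        // Download, filter, and upload would happen here.
        completed.fetch_add(1, Ordering::SeqCst);
    });
    202
}

fn main() {
    let tracker = TaskTracker::new();
    let completed = Arc::new(AtomicUsize::new(0));
    let status = handle_webhook(&tracker, Arc::clone(&completed));
    let joined = tracker.shutdown();
    println!("status={status} joined={joined}");
}
```

Returning before the job completes keeps the webhook caller from timing out on a long download, while the tracker gives shutdown something to wait on.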

Replace separate .sha256 checksum files with S3's built-in checksum
support. This simplifies the bucket layout and lets S3 verify object
integrity natively instead of relying on sidecar files.

Changes:
- Store SHA-256 checksums using S3's checksum_sha256 field
- Remove separate .sha256 file uploads
- Remove latest checksum pointer (no longer needed)
- Add base64 dependency for checksum encoding
- Update documentation with examples of checksum retrieval

Benefits:
- Cleaner S3 bucket structure (fewer files)
- Native S3 checksum verification on download
- Checksums preserved during S3 copy operations
- Reduced API calls (1 PUT instead of 2)

The checksum is still computed during filtering and logged, but it is
now stored on the S3 object itself (in its SHA-256 checksum field)
rather than in a separate file.
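
S3 expects the SHA-256 checksum value as the base64 encoding of the raw 32-byte digest, not the hex string that is typically logged, which is why an encoding step is needed. The sketch below hand-rolls that conversion with only the standard library to stay self-contained; the PR itself uses the base64 crate, and `hex_to_bytes`, `base64_encode`, and `s3_checksum_sha256` are hypothetical helper names.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Decode a hex string into bytes; None on odd length or non-hex input.
fn hex_to_bytes(hex: &str) -> Option<Vec<u8>> {
    if hex.len() % 2 != 0 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..hex.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&hex[i..i + 2], 16).ok())
        .collect()
}

/// Standard base64 (RFC 4648 alphabet) with `=` padding.
fn base64_encode(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        let b0 = chunk[0] as u32;
        let b1 = chunk.get(1).copied().unwrap_or(0) as u32;
        let b2 = chunk.get(2).copied().unwrap_or(0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[((n >> 18) & 63) as usize] as char);
        out.push(ALPHABET[((n >> 12) & 63) as usize] as char);
        out.push(if chunk.len() > 1 { ALPHABET[((n >> 6) & 63) as usize] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[(n & 63) as usize] as char } else { '=' });
    }
    out
}

/// Convert a hex SHA-256 digest into the base64 form used for the
/// S3 checksum field; None unless the input is exactly 32 bytes.
fn s3_checksum_sha256(hex_digest: &str) -> Option<String> {
    let bytes = hex_to_bytes(hex_digest)?;
    if bytes.len() != 32 {
        return None;
    }
    Some(base64_encode(&bytes))
}

fn main() {
    // SHA-256 of the empty input, as a hex string.
    let hex = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855";
    println!("{:?}", s3_checksum_sha256(hex));
}
```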