feat: add feed availability task and scheduler #1705

Open
davidgamez wants to merge 16 commits into main from feed_availability

Conversation

@davidgamez
Member

@davidgamez davidgamez commented May 19, 2026

Summary:

This PR adds the cloud function and scheduler that check and persist GTFS feed availability. It expands the scope of the issue to include a quick ZIP content check. The original plan was to perform only HTTP HEAD requests, but while testing in DEV I found that some servers (~160) don't support HEAD. As a workaround, the check issues a HEAD request and, if that fails, falls back to a GET request.

From our AI friend

This pull request introduces a new system for checking the availability of GTFS feeds via HTTP HEAD (with GET fallback), refactors and modularizes HTTP and SSL utilities, and documents the new feed availability check task. The changes improve reliability, security, and observability of feed health checks, and make the codebase more maintainable.

New GTFS Feed Availability Check System:

  • Added a new task, check_gtfs_feed_availability, to the task executor, which checks the availability of published GTFS feeds using HTTP HEAD requests, with an optional GET fallback that reads only the ZIP magic bytes. Results are stored in the gtfs_feed_availability_check table, and the task is fully documented in README.md with parameters and sample responses. [1] [2] [3]
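For readers who want a feel for the HEAD-then-GET flow described above, here is a minimal sketch using only the standard library. It is not the code in this PR: `looks_like_zip`, `send`, `check_availability`, and the keys of the returned dict are hypothetical names chosen for illustration, and the 10-second timeout is an assumed value.

```python
import urllib.error
import urllib.request
from typing import Callable, Tuple

# Signatures a ZIP file can start with: local file header,
# empty archive (end of central directory), and spanned archive.
ZIP_SIGNATURES = (b"PK\x03\x04", b"PK\x05\x06", b"PK\x07\x08")


def looks_like_zip(first_bytes: bytes) -> bool:
    """Return True when the payload starts with a known ZIP signature."""
    return first_bytes[:4] in ZIP_SIGNATURES


def send(method: str, url: str, read_bytes: int = 0) -> Tuple[int, bytes]:
    """Issue one request; return (status, first `read_bytes` bytes of the body)."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            body = response.read(read_bytes) if read_bytes else b""
            return response.status, body
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses are reported as a status, not as a failure to connect.
        return error.code, b""


def check_availability(
    url: str, sender: Callable[..., Tuple[int, bytes]] = send
) -> dict:
    """Try HEAD first; on any failure fall back to GET and sniff the magic bytes."""
    try:
        status, _ = sender("HEAD", url)
        if 200 <= status < 400:
            return {"url": url, "available": True, "method": "HEAD", "status": status}
    except OSError:
        pass  # network-level error on HEAD (URLError, timeout); still try GET

    try:
        status, first_bytes = sender("GET", url, read_bytes=4)
    except OSError as error:
        return {"url": url, "available": False, "method": "GET",
                "status": None, "error": str(error)}

    available = 200 <= status < 400 and looks_like_zip(first_bytes)
    return {"url": url, "available": available, "method": "GET", "status": status}
```

The `sender` parameter is there only so the fallback logic can be exercised without network access; a caller would normally use the default.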

HTTP/SSL Utility Refactoring and Enhancements:

  • Refactored functions-python/helpers/utils.py to modularize HTTP and SSL logic:
    • Extracted SSL context creation to create_feed_ssl_context, supporting legacy server compatibility and optional certificate validation disabling.
    • Added build_feed_request_params for constructing headers and URLs with flexible authentication.
    • Introduced internal helpers for content type parsing, ZIP detection, redirect logging, and robust HTTP request execution with error handling.
    • Implemented perform_request for performing availability checks with fallback and result normalization.
    • Updated download_and_get_hash to use the new modular utilities for improved clarity and maintainability. [1] [2] [3]
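As a rough illustration of the two extracted helpers, the sketch below shows one way an SSL context with legacy-server tolerance and optional certificate validation could be built, and how request parameters could carry an API key either in the URL or in a header. The function names, parameters, and the `auth_type` numbering (0 = none, 1 = query parameter, 2 = header) are assumptions made for this example, not the signatures of `create_feed_ssl_context` or `build_feed_request_params`. Interpreting "legacy server compatibility" as OpenSSL's legacy renegotiation option is also an assumption.

```python
import ssl
from typing import Dict, Optional, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def make_ssl_context(verify: bool = True, legacy: bool = True) -> ssl.SSLContext:
    """Build a client SSL context (hypothetical helper, not the PR's function)."""
    context = ssl.create_default_context()
    if legacy:
        # Assumption: tolerate servers without secure renegotiation support.
        # ssl.OP_LEGACY_SERVER_CONNECT exists from Python 3.12; 0x4 is the
        # value of the underlying OpenSSL flag on older versions.
        context.options |= getattr(ssl, "OP_LEGACY_SERVER_CONNECT", 0x4)
    if not verify:
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


def build_request_params(
    url: str,
    auth_type: int = 0,
    auth_param: Optional[str] = None,
    credential: Optional[str] = None,
) -> Tuple[str, Dict[str, str]]:
    """Return (url, headers) with the credential placed per auth_type (assumed scheme)."""
    headers = {"User-Agent": "feed-availability-sketch"}
    if auth_param and credential:
        if auth_type == 1:
            # API key as a query parameter, keeping any existing parameters.
            parts = urlsplit(url)
            query = parse_qsl(parts.query, keep_blank_values=True)
            query.append((auth_param, credential))
            url = urlunsplit(parts._replace(query=urlencode(query)))
        elif auth_type == 2:
            # API key as a request header.
            headers[auth_param] = credential
    return url, headers
```

Keeping the two concerns in separate functions mirrors the modularization described in the list above: the same context and parameters can then be reused by both the availability check and the download path.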

Documentation Improvements:

  • Expanded README.md to document the new feed availability check task, including configuration parameters, usage examples, and sample output for both normal and verbose modes.

Expected behavior:

The availability result for each GTFS feed is persisted in the DB, in the gtfs_feed_availability_check table.

Testing tips:

Internal team: this can be tested via Retool in the DEV environment.

Please make sure these boxes are checked before submitting your pull request - thanks!

  • Run the unit tests with ./scripts/api-tests.sh to make sure you didn't break anything
  • Add or update any needed documentation to the repo
  • Format the title like "feat: [new feature short description]". Title must follow the Conventional Commit Specification (https://www.conventionalcommits.org/en/v1.0.0/).
  • Linked all relevant issues
  • Include screenshot(s) showing how this pull request works and fixes the issue(s)

@davidgamez davidgamez linked an issue May 21, 2026 that may be closed by this pull request
@davidgamez
Member Author

Execution screenshot in DEV
[Screenshot: 2026-05-21 at 3:11:54 PM]

@davidgamez davidgamez marked this pull request as ready for review May 21, 2026 19:12

Development

Successfully merging this pull request may close these issues.

Create scheduled Cloud Function to check GTFS feed availability

1 participant