
feat(source-hubspot): remove api_budget and increase concurrency to 40/40 for rate limit stress testing #74900

Open
sophiecuiy wants to merge 3 commits into master from devin/1773686109-hubspot-concurrency-stress-test

Conversation

@sophiecuiy

@sophiecuiy sophiecuiy commented Mar 16, 2026

What

Temporary prerelease to stress-test HubSpot's rate limiting behavior by removing the client-side api_budget throttle and maximizing concurrency. The goal is to observe how syncs behave when hitting HubSpot's actual server-side rate limits (110 req/10s for OAuth apps).

This PR is not intended to be merged to master. It will be published as a prerelease via /publish-connectors-prerelease, producing a version tagged 6.3.1-preview.{sha}. Specific connections will be manually pinned to this prerelease version for testing.
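
As a small illustration of the `{version}-preview.{sha}` tagging scheme, the sketch below builds and parses such a tag. Both helper names are hypothetical and exist only for this example; they are not part of the publish workflow. The sample values `6.3.1` and `8e5fdeb` are the ones reported by the publish bot later in this thread.

```python
import re

# Assumed shape of a prerelease tag: <major.minor.patch>-preview.<short git sha>
PREVIEW_TAG = re.compile(r"^(?P<version>\d+\.\d+\.\d+)-preview\.(?P<sha>[0-9a-f]+)$")


def prerelease_tag(version, sha):
    """Build a prerelease tag from a base version and a short commit sha."""
    return f"{version}-preview.{sha}"


def parse_prerelease_tag(tag):
    """Return (version, sha) for a prerelease tag, or None for any other tag."""
    match = PREVIEW_TAG.match(tag)
    if match is None:
        return None
    return match["version"], match["sha"]


if __name__ == "__main__":
    tag = prerelease_tag("6.3.1", "8e5fdeb")
    print(f"airbyte/source-hubspot:{tag}")
    print(parse_prerelease_tag(tag))
```

A check like `parse_prerelease_tag` could be used to confirm that a pinned connection is on a preview build rather than a regular release.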

How

  1. Commented out the entire api_budget section in manifest.yaml (lines 2329–2350) — removes the client-side MovingWindowCallRatePolicy that previously capped requests at 10 req/s / 100 req/10s (general) and 5 req/s (CRM Search). The original config is preserved as comments for easy restoration.
  2. Hardcoded default_concurrency: 40 (was "{{ config.get('num_workers', 10) }}") — this removes the user-configurable num_workers override and locks concurrency at the max.
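
For readers unfamiliar with moving-window throttling, the effect of the removed general policy can be approximated with the sketch below. It is a simplified stand-in, not the CDK's `MovingWindowCallRatePolicy` implementation: the `MovingWindowLimiter` class, the `simulate` helper, and the 10 ms polling tick are assumptions made for illustration. Only the two rates (10 per second and 100 per 10 seconds) come from the manifest.

```python
from collections import deque


class MovingWindowLimiter:
    """Admit at most `limit` calls in any sliding window of `interval_ms` (hypothetical helper)."""

    def __init__(self, limit, interval_ms):
        self.limit = limit
        self.interval_ms = interval_ms
        self._stamps = deque()  # timestamps (ms) of admitted calls, oldest first

    def has_room(self, now_ms):
        # Drop timestamps that have aged out of the window, then check capacity.
        while self._stamps and now_ms - self._stamps[0] >= self.interval_ms:
            self._stamps.popleft()
        return len(self._stamps) < self.limit

    def record(self, now_ms):
        self._stamps.append(now_ms)


def simulate(duration_ms, tick_ms=10):
    """Count requests admitted when a caller attempts one request on every tick."""
    # Rates taken from the general policy in the manifest.
    limiters = [MovingWindowLimiter(10, 1_000), MovingWindowLimiter(100, 10_000)]
    admitted = 0
    for now_ms in range(0, duration_ms, tick_ms):
        # A request passes only if every rate in the policy has room.
        if all(limiter.has_room(now_ms) for limiter in limiters):
            for limiter in limiters:
                limiter.record(now_ms)
            admitted += 1
    return admitted


if __name__ == "__main__":
    print(simulate(10_000))
```

Under these assumptions, a caller that attempts a request every 10 ms for 10 seconds gets 100 requests through. With the budget commented out, nothing on the client side enforces that ceiling.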

No changes to metadata.yaml, changelog, or progressive rollout settings — prereleases are published from the PR branch without modifying the version on master.

Review guide

  1. airbyte-integrations/connectors/source-hubspot/manifest.yaml — the only file changed (api_budget commenting + concurrency change)

Key things to verify:

  • The default_concurrency change removes user configurability via the num_workers config option — this is intentional for the stress test but means all connections pinned to this prerelease will run at 40 concurrent workers regardless of user config
  • With no api_budget, the connector has zero client-side rate limiting — it relies entirely on HubSpot returning 429s and the CDK's retry/backoff handling those
  • This PR should not be merged — it exists only as a source for the prerelease publish
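
To make the "429 plus retry/backoff" dependency concrete, here is a minimal sketch of the general pattern. It is not the CDK's error handler: the function names, the base delay of 1 second, the 60 second cap, and the retry count are all assumptions chosen for illustration, and `send` is a hypothetical callable returning a `(status_code, retry_after_seconds)` pair.

```python
import time


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before the next attempt (assumed policy, not the CDK's)."""
    if retry_after is not None:
        # Prefer the server's hint when one is provided.
        return min(retry_after, cap)
    # Otherwise fall back to capped exponential backoff: 1s, 2s, 4s, ...
    return min(base * (2 ** attempt), cap)


def call_with_retries(send, max_retries=5, sleep=time.sleep):
    """Call `send` until it stops returning 429 or the retry budget runs out."""
    for attempt in range(max_retries + 1):
        status, retry_after = send()
        if status != 429:
            return status
        if attempt == max_retries:
            break  # no retries left; do not sleep again
        sleep(backoff_delay(attempt, retry_after))
    raise RuntimeError("retry budget exhausted after repeated 429 responses")


if __name__ == "__main__":
    responses = iter([(429, None), (429, 3.0), (200, None)])
    waits = []
    print(call_with_retries(lambda: next(responses), sleep=waits.append), waits)
```

The `RuntimeError` branch corresponds to the failure mode worth watching for in this test: a sync that fails because the rate limit never clears within the retry budget.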

User Impact

Connections manually pinned to the prerelease version will:

  • Run at significantly higher concurrency (40 vs default 10)
  • Have no client-side rate throttling — requests go as fast as workers can send them
  • Likely trigger HubSpot 429 responses, testing the CDK's error handling and retry behavior
  • May experience longer sync times (due to retries) or sync failures if retry budgets are exhausted

No impact to unpinned connections — this prerelease will only affect connections that are explicitly pinned to it.
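
A rough back-of-the-envelope check shows why 429s are likely at this concurrency. The sketch assumes each worker keeps exactly one request in flight and that requests are spread evenly over time, which simplifies how the CDK actually schedules work; the function names are hypothetical. The 110 requests per 10 seconds figure is the server-side limit quoted in the description above.

```python
def offered_rate_per_s(workers, latency_s):
    """Requests per second produced by `workers` serial callers at a given average latency."""
    return workers / latency_s


def breakeven_latency_s(workers, limit, window_s):
    """Average latency below which `workers` serial callers exceed `limit` requests per `window_s`."""
    return workers * window_s / limit


if __name__ == "__main__":
    # HubSpot limit from the PR description: 110 requests per 10 seconds.
    for workers in (10, 40):
        print(workers, round(breakeven_latency_s(workers, 110, 10), 2))
```

Under these assumptions, 40 workers exceed the limit whenever average request latency drops below about 3.64 seconds, compared with about 0.91 seconds for 10 workers.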

Can this PR be safely reverted and rolled back?

  • YES 💚

The original api_budget config is preserved as comments. Unpinning connections from the prerelease version restores the previous behavior. This PR is not intended to be merged.


Link to Devin session: https://app.devin.ai/sessions/fb11bb6801b6457b84645a812a6e3ab4
Requested by: sophiecuiy



…0/40 for rate limit stress testing

Co-Authored-By: sophie.cui@airbyte.io <sophie.cui@airbyte.io>
@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@octavia-bot

octavia-bot bot commented Mar 16, 2026

Note

📝 PR Converted to Draft


Thank you for creating this PR. As a policy to protect our engineers' time, Airbyte requires all PRs to be created first in draft status. Your PR has been automatically converted to draft status in respect for this policy.

As soon as your PR is ready for formal review, you can proceed to convert the PR to "ready for review" status by clicking the "Ready for review" button at the bottom of the PR page.

To skip draft status in future PRs, please include [ready] in your PR title or add the skip-draft-status label when creating your PR.

@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.


PR Slash Commands

Airbyte Maintainers (that's you!) can execute the following slash commands on your PR:

  • 🛠️ Quick Fixes
    • /format-fix - Fixes most formatting issues.
    • /bump-version - Bumps connector versions, scraping changelog description from the PR title.
  • ❇️ AI Testing and Review (internal link: AI-SDLC Docs):
    • /ai-prove-fix - Runs prerelease readiness checks, including testing against customer connections.
    • /ai-canary-prerelease - Rolls out prerelease to 5-10 connections for canary testing.
    • /ai-review - AI-powered PR review for connector safety and quality gates.
  • 🚀 Connector Releases:
    • /publish-connectors-prerelease - Publishes pre-release connector builds (tagged as {version}-preview.{git-sha}) for all modified connectors in the PR.
    • /bump-progressive-rollout-version - Bumps connector version with an RC suffix (2.16.10-rc.1) for progressive rollouts (enableProgressiveRollout: true).
      • Example: /bump-progressive-rollout-version changelog="Add new feature for progressive rollout"
  • ☕️ JVM connectors:
    • /update-connector-cdk-version connector=<CONNECTOR_NAME> - Updates the specified connector to the latest CDK version.
      Example: /update-connector-cdk-version connector=destination-bigquery
  • 🐍 Python connectors:
    • /poe connector source-example lock - Run the Poe lock task on the source-example connector, committing the results back to the branch.
    • /poe source example lock - Alias for /poe connector source-example lock.
    • /poe source example use-cdk-branch my/branch - Pin the source-example CDK reference to the branch name specified.
    • /poe source example use-cdk-latest - Update the source-example CDK dependency to the latest available version.
  • ⚙️ Admin commands:
    • /force-merge reason="<REASON>" - Force merges the PR using admin privileges, bypassing CI checks. Requires a reason.
      Example: /force-merge reason="CI is flaky, tests pass locally"

@sophiecuiy sophiecuiy marked this pull request as ready for review March 16, 2026 18:42
@github-actions

github-actions bot commented Mar 16, 2026

Deploy preview for airbyte-docs ready!

✅ Preview
https://airbyte-docs-em4fy2vw8-airbyte-growth.vercel.app

Built with commit 9d6f087.
This pull request is being automatically deployed with vercel-action

devin-ai-integration bot and others added 2 commits March 16, 2026 18:50
…th manual pinning instead

Co-Authored-By: sophie.cui@airbyte.io <sophie.cui@airbyte.io>
…ease publish instead of RC

Co-Authored-By: sophie.cui@airbyte.io <sophie.cui@airbyte.io>
@github-actions

github-actions bot commented Mar 16, 2026

Pre-release Connector Publish Started

Publishing pre-release build for connector source-hubspot.
PR: #74900

Pre-release versions will be tagged as {version}-preview.8e5fdeb
and are available for version pinning via the scoped_configuration API.

View workflow run
Pre-release Publish: SUCCESS

Docker image (pre-release):
airbyte/source-hubspot:6.3.1-preview.8e5fdeb

Docker Hub: https://hub.docker.com/layers/airbyte/source-hubspot/6.3.1-preview.8e5fdeb

Registry JSON:


@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 2 potential issues.


Comment on lines +2329 to +2351
# api_budget temporarily removed to stress-test rate limiting behavior with high concurrency
# api_budget:
#   type: HTTPAPIBudget
#   policies:
#     # 1) CRM Search: special cap, separate from the global burst
#     - type: MovingWindowCallRatePolicy
#       rates:
#         - limit: 5 # 5 req/second
#           interval: PT1S
#         - limit: 300 # 300 req/min (same ceiling)
#           interval: PT1M
#       matchers:
#         - method: POST
#           url_path_pattern: "^/crm/v3/objects/[^/]+/search$"
#     # 2) General: public app burst = 110 per 10s per installed account
#     - type: MovingWindowCallRatePolicy
#       rates:
#         - limit: 10
#           interval: PT1S
#         - limit: 100
#           interval: PT10S
#       matchers: [ ]
#   status_codes_for_ratelimit_hit: [429]

🔴 Client-side API rate limiting completely removed, will cause excessive 429 errors

The api_budget block has been commented out, removing all client-side rate limiting for HubSpot API calls. As documented in BEHAVIOR.md section 9 (airbyte-integrations/connectors/source-hubspot/BEHAVIOR.md:166-179), HubSpot enforces strict rate limits: CRM Search at 5 req/s and 300 req/min, general endpoints at 110 req/10s. Without client-side throttling, and combined with the concurrency increase to 40, the connector will aggressively exceed these limits, resulting in heavy 429 responses, exponential backoff cascades, and degraded sync performance. The comment says this is "temporarily removed to stress-test rate limiting behavior" — this is a testing change that should not be shipped to production.

Suggested change
- # api_budget temporarily removed to stress-test rate limiting behavior with high concurrency
- # api_budget:
- #   type: HTTPAPIBudget
- #   policies:
- #     # 1) CRM Search: special cap, separate from the global burst
- #     - type: MovingWindowCallRatePolicy
- #       rates:
- #         - limit: 5 # 5 req/second
- #           interval: PT1S
- #         - limit: 300 # 300 req/min (same ceiling)
- #           interval: PT1M
- #       matchers:
- #         - method: POST
- #           url_path_pattern: "^/crm/v3/objects/[^/]+/search$"
- #     # 2) General: public app burst = 110 per 10s per installed account
- #     - type: MovingWindowCallRatePolicy
- #       rates:
- #         - limit: 10
- #           interval: PT1S
- #         - limit: 100
- #           interval: PT10S
- #       matchers: [ ]
- #   status_codes_for_ratelimit_hit: [429]
+ api_budget:
+   type: HTTPAPIBudget
+   policies:
+     # 1) CRM Search: special cap, separate from the global burst
+     - type: MovingWindowCallRatePolicy
+       rates:
+         - limit: 5 # 5 req/second
+           interval: PT1S
+         - limit: 300 # 300 req/min (same ceiling)
+           interval: PT1M
+       matchers:
+         - method: POST
+           url_path_pattern: "^/crm/v3/objects/[^/]+/search$"
+     # 2) General: public app burst = 110 per 10s per installed account
+     - type: MovingWindowCallRatePolicy
+       rates:
+         - limit: 10
+           interval: PT1S
+         - limit: 100
+           interval: PT10S
+       matchers: [ ]
+   status_codes_for_ratelimit_hit: [429]
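
To show which requests the CRM Search matcher in this config would capture, the sketch below applies the same `url_path_pattern` regex with Python's `re` module. The `policy_for` function and its return labels are hypothetical, and the CDK's own matcher may apply the pattern differently; because the pattern is anchored at both ends, the result here does not depend on whether matching or searching is used.

```python
import re

# Pattern copied from the CRM Search policy's url_path_pattern.
CRM_SEARCH_PATH = re.compile(r"^/crm/v3/objects/[^/]+/search$")


def policy_for(method, path):
    """Return which policy a request would fall under (illustrative labels only)."""
    if method == "POST" and CRM_SEARCH_PATH.match(path):
        return "crm_search"
    # The general policy has an empty matcher list, so it is treated as the catch-all here.
    return "general"


if __name__ == "__main__":
    for method, path in [
        ("POST", "/crm/v3/objects/contacts/search"),
        ("GET", "/crm/v3/objects/contacts/search"),
        ("POST", "/crm/v3/objects/contacts"),
    ]:
        print(method, path, policy_for(method, path))
```

Note that `[^/]+` matches exactly one path segment, so a nested path such as `/crm/v3/objects/contacts/batch/search` would not be treated as a search request by this pattern.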

Both findings are intentional. This PR is a temporary prerelease for stress-testing rate limit behavior — it will not be merged to master. The api_budget removal and hardcoded concurrency of 40 are the whole point of the test. Connections will be manually pinned to the prerelease version (6.3.1-preview.8e5fdeb) for controlled testing only.

  concurrency_level:
    type: ConcurrencyLevel
-   default_concurrency: "{{ config.get('num_workers', 10) }}"
+   default_concurrency: 40

🔴 Hardcoded concurrency of 40 ignores user-configured num_worker setting

The default_concurrency was changed from "{{ config.get('num_workers', 10) }}" to a hardcoded 40. The connector spec still exposes the num_worker config parameter (manifest.yaml:2281-2288) which lets users set concurrency between 1 and 40 (default 10), but this setting is now completely ignored — all users get maximum concurrency regardless of their choice. This breaks the user-facing contract of the num_worker configuration option.

Note: the old code also had a pre-existing key name mismatch

The old Jinja expression used config.get('num_workers', 10) (plural) but the config field is named num_worker (singular at manifest.yaml:2281), so the config lookup would always miss and fall back to 10. The fix should use the correct key: {{ config.get('num_worker', 10) }}.

Suggested change
- default_concurrency: 40
+ default_concurrency: "{{ config.get('num_worker', 10) }}"
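
The key mismatch described above can be reproduced with a plain Python dict standing in for the connector config. This is only an analogy for the Jinja lookup, and `resolve_concurrency` is a hypothetical helper; the value 25 is an arbitrary example of a user setting.

```python
def resolve_concurrency(config, key, default=10):
    """Mimic a config.get(key, default) lookup (stand-in for the Jinja expression)."""
    return config.get(key, default)


if __name__ == "__main__":
    # The spec field is named num_worker (singular), so this is what a user config contains.
    user_config = {"num_worker": 25}

    # Old expression: the plural key is never present, so the default always wins.
    print(resolve_concurrency(user_config, "num_workers"))

    # Corrected key: the user's setting is honored.
    print(resolve_concurrency(user_config, "num_worker"))
```

With the plural key the lookup silently falls back to 10 no matter what the user configured, which is why the mismatch went unnoticed.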

Intentional for this stress test prerelease — we want all pinned connections to run at max concurrency (40) to observe rate limiting behavior. This PR won't be merged to master, so the num_worker config contract is not affected for production users.

Good catch on the pre-existing num_workers vs num_worker key mismatch though — worth fixing separately in a future PR.

@github-actions

source-hubspot Connector Test Results

178 tests: 166 passed ✅, 11 skipped 💤, 1 failed ❌
2 suites, 2 files, 19m 30s ⏱️

For more details on these failures, see this check.

Results for commit 8e5fdeb.

@sophiecuiy

proceed with pinning
