feat(source-hubspot): remove api_budget and increase concurrency to 40/40 for rate limit stress testing#74900
Conversation
…0/40 for rate limit stress testing Co-Authored-By: sophie.cui@airbyte.io <sophie.cui@airbyte.io>
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
|
Note 📝 PR Converted to Draft More info...Thank you for creating this PR. As a policy to protect our engineers' time, Airbyte requires all PRs to be created first in draft status. Your PR has been automatically converted to draft status in respect for this policy. As soon as your PR is ready for formal review, you can proceed to convert the PR to "ready for review" status by clicking the "Ready for review" button at the bottom of the PR page. To skip draft status in future PRs, please include |
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. 💡 Show Tips and TricksPR Slash CommandsAirbyte Maintainers (that's you!) can execute the following slash commands on your PR:
📚 Show Repo GuidanceHelpful Resources
|
|
Deploy preview for airbyte-docs ready! ✅ Preview Built with commit 9d6f087. |
…th manual pinning instead Co-Authored-By: sophie.cui@airbyte.io <sophie.cui@airbyte.io>
…ease publish instead of RC Co-Authored-By: sophie.cui@airbyte.io <sophie.cui@airbyte.io>
|
| # api_budget temporarily removed to stress-test rate limiting behavior with high concurrency | ||
| # api_budget: | ||
| # type: HTTPAPIBudget | ||
| # policies: | ||
| # # 1) CRM Search: special cap, separate from the global burst | ||
| # - type: MovingWindowCallRatePolicy | ||
| # rates: | ||
| # - limit: 5 # 5 req/second | ||
| # interval: PT1S | ||
| # - limit: 300 # 300 req/min (same ceiling) | ||
| # interval: PT1M | ||
| # matchers: | ||
| # - method: POST | ||
| # url_path_pattern: "^/crm/v3/objects/[^/]+/search$" | ||
| # # 2) General: public app burst = 110 per 10s per installed account | ||
| # - type: MovingWindowCallRatePolicy | ||
| # rates: | ||
| # - limit: 10 | ||
| # interval: PT1S | ||
| # - limit: 100 | ||
| # interval: PT10S | ||
| # matchers: [ ] | ||
| # status_codes_for_ratelimit_hit: [429] |
There was a problem hiding this comment.
🔴 Client-side API rate limiting completely removed, will cause excessive 429 errors
The api_budget block has been commented out, removing all client-side rate limiting for HubSpot API calls. As documented in BEHAVIOR.md section 9 (airbyte-integrations/connectors/source-hubspot/BEHAVIOR.md:166-179), HubSpot enforces strict rate limits: CRM Search at 5 req/s and 300 req/min, general endpoints at 110 req/10s. Without client-side throttling, and combined with the concurrency increase to 40, the connector will aggressively exceed these limits, resulting in heavy 429 responses, exponential backoff cascades, and degraded sync performance. The comment says this is "temporarily removed to stress-test rate limiting behavior" — this is a testing change that should not be shipped to production.
| # api_budget temporarily removed to stress-test rate limiting behavior with high concurrency | |
| # api_budget: | |
| # type: HTTPAPIBudget | |
| # policies: | |
| # # 1) CRM Search: special cap, separate from the global burst | |
| # - type: MovingWindowCallRatePolicy | |
| # rates: | |
| # - limit: 5 # 5 req/second | |
| # interval: PT1S | |
| # - limit: 300 # 300 req/min (same ceiling) | |
| # interval: PT1M | |
| # matchers: | |
| # - method: POST | |
| # url_path_pattern: "^/crm/v3/objects/[^/]+/search$" | |
| # # 2) General: public app burst = 110 per 10s per installed account | |
| # - type: MovingWindowCallRatePolicy | |
| # rates: | |
| # - limit: 10 | |
| # interval: PT1S | |
| # - limit: 100 | |
| # interval: PT10S | |
| # matchers: [ ] | |
| # status_codes_for_ratelimit_hit: [429] | |
| api_budget: | |
| type: HTTPAPIBudget | |
| policies: | |
| # 1) CRM Search: special cap, separate from the global burst | |
| - type: MovingWindowCallRatePolicy | |
| rates: | |
| - limit: 5 # 5 req/second | |
| interval: PT1S | |
| - limit: 300 # 300 req/min (same ceiling) | |
| interval: PT1M | |
| matchers: | |
| - method: POST | |
| url_path_pattern: "^/crm/v3/objects/[^/]+/search$" | |
| # 2) General: public app burst = 110 per 10s per installed account | |
| - type: MovingWindowCallRatePolicy | |
| rates: | |
| - limit: 10 | |
| interval: PT1S | |
| - limit: 100 | |
| interval: PT10S | |
| matchers: [ ] | |
| status_codes_for_ratelimit_hit: [429] | |
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
Both findings are intentional. This PR is a temporary prerelease for stress-testing rate limit behavior — it will not be merged to master. The api_budget removal and hardcoded concurrency of 40 are the whole point of the test. Connections will be manually pinned to the prerelease version (6.3.1-preview.8e5fdeb) for controlled testing only.
| concurrency_level: | ||
| type: ConcurrencyLevel | ||
| default_concurrency: "{{ config.get('num_workers', 10) }}" | ||
| default_concurrency: 40 |
There was a problem hiding this comment.
🔴 Hardcoded concurrency of 40 ignores user-configured num_worker setting
The default_concurrency was changed from "{{ config.get('num_workers', 10) }}" to a hardcoded 40. The connector spec still exposes the num_worker config parameter (manifest.yaml:2281-2288) which lets users set concurrency between 1 and 40 (default 10), but this setting is now completely ignored — all users get maximum concurrency regardless of their choice. This breaks the user-facing contract of the num_worker configuration option.
Note: the old code also had a pre-existing key name mismatch
The old Jinja expression used config.get('num_workers', 10) (plural) but the config field is named num_worker (singular at manifest.yaml:2281), so the config lookup would always miss and fall back to 10. The fix should use the correct key: {{ config.get('num_worker', 10) }}.
| default_concurrency: 40 | |
| default_concurrency: "{{ config.get('num_worker', 10) }}" |
Was this helpful? React with 👍 or 👎 to provide feedback.
There was a problem hiding this comment.
Intentional for this stress test prerelease — we want all pinned connections to run at max concurrency (40) to observe rate limiting behavior. This PR won't be merged to master, so the num_worker config contract is not affected for production users.
Good catch on the pre-existing num_workers vs num_worker key mismatch though — worth fixing separately in a future PR.
|
|
proceed with pinning |
What
Temporary prerelease to stress-test HubSpot's rate limiting behavior by removing the client-side
api_budgetthrottle and maximizing concurrency. The goal is to observe how syncs behave when hitting HubSpot's actual server-side rate limits (110 req/10s for OAuth apps).This PR is not intended to be merged to master. It will be published as a prerelease via
/publish-connectors-prerelease, producing a version tagged6.3.1-preview.{sha}. Specific connections will be manually pinned to this prerelease version for testing.How
api_budgetsection inmanifest.yaml(lines 2329–2350) — removes the client-side MovingWindowCallRatePolicy that previously capped requests at 10 req/s / 100 req/10s (general) and 5 req/s (CRM Search). The original config is preserved as comments for easy restoration.default_concurrency: 40(was"{{ config.get('num_workers', 10) }}") — this removes the user-configurablenum_workersoverride and locks concurrency at the max.No changes to
metadata.yaml, changelog, or progressive rollout settings — prereleases are published from the PR branch without modifying the version on master.Review guide
airbyte-integrations/connectors/source-hubspot/manifest.yaml— the only file changed (api_budget commenting + concurrency change)Key things to verify:
default_concurrencychange removes user configurability via thenum_workersconfig option — this is intentional for the stress test but means all connections pinned to this prerelease will run at 40 concurrent workers regardless of user configapi_budget, the connector has zero client-side rate limiting — it relies entirely on HubSpot returning 429s and the CDK's retry/backoff handling thoseUser Impact
Connections manually pinned to the prerelease version will:
No impact to unpinned connections — this prerelease will only affect connections that are explicitly pinned to it.
Can this PR be safely reverted and rolled back?
The original
api_budgetconfig is preserved as comments. Unpinning connections from the prerelease version restores the previous behavior. This PR is not intended to be merged.Link to Devin session: https://app.devin.ai/sessions/fb11bb6801b6457b84645a812a6e3ab4
Requested by: sophiecuiy