@thiyaguk09
Contributor

@thiyaguk09 thiyaguk09 commented Oct 7, 2025

Description

This change introduces new error handling logic in util.makeRequest to intercept and transform common low-level network and transport-layer failures into a consistent, actionable standard Error object with custom properties.

The following raw errors are now intercepted:

  • ECONNRESET
  • ETIMEDOUT
  • Generic messages containing "timed out"
  • Generic messages containing "TLS handshake"

These errors are transformed into a standard Error object which is augmented with diagnostic information:

  • Message: "Request or TLS handshake timed out. This may be due to CPU starvation or a temporary network issue."

This prevents the propagation of ambiguous, underlying network library errors and provides developers with a clear, unified diagnostic message, especially when operating in environments with high CPU contention.
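A minimal sketch of the kind of transformation described above, not the actual diff in this PR. The function name `transformNetworkError` and the `NodeLikeError` interface are hypothetical, and copying the raw `code` onto the new error is an assumption; only the four match conditions and the message text come from the description.

```typescript
// Hypothetical shape: Node.js network errors usually carry a string `code`.
interface NodeLikeError extends Error {
  code?: string;
}

const TLS_TIMEOUT_MESSAGE =
  'Request or TLS handshake timed out. This may be due to CPU starvation or a temporary network issue.';

// Returns a new standard Error for the four intercepted conditions,
// or the original error unchanged for everything else.
function transformNetworkError(err: NodeLikeError): Error {
  const message = err.message.toLowerCase();
  const isIntercepted =
    err.code === 'ECONNRESET' ||
    err.code === 'ETIMEDOUT' ||
    message.includes('econnreset') ||
    message.includes('etimedout') ||
    message.includes('timed out') ||
    message.includes('tls handshake');
  if (!isIntercepted) {
    return err;
  }
  const transformed: NodeLikeError = new Error(TLS_TIMEOUT_MESSAGE);
  // Assumption: keep the raw code so callers can still branch on it.
  transformed.code = err.code;
  return transformed;
}
```

Errors that match none of the four conditions pass through untouched, so the successful path and unrelated failures are not affected.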

Impact

The impact of this change is primarily positive, improving the developer experience:

  • Improved Error Diagnostics: Developers no longer need to parse various cryptic network codes (ECONNRESET, etc.). They will receive a clear, actionable message about the likely cause (CPU starvation/TLS failure).
  • Consistent Error Handling: Facilitates easier integration with custom error retry and logging mechanisms by providing a predictable error structure rather than a raw, non-standard network error.
  • No Breaking Changes: This is a purely additive fix that catches and transforms errors that would have been thrown anyway. It does not alter the successful path for requests.

Testing

Yes, unit tests were added.

  • A dedicated unit test suite, Network Connectivity Errors, was created under makeAuthenticatedRequestFactory to validate the new transformation logic.
  • A dedicated, data-driven test suite, TLS handshake errors, was created to validate the new transformation logic across all four interception conditions.
  • A single test loop replaced four separate tests, covering all conditions:
    • should transform raw ECONNRESET into specific TLS/CPU starvation Error
    • should transform raw "TLS handshake" into specific TLS/CPU starvation Error
    • should transform raw generic "timed out" into specific TLS/CPU starvation Error
    • should transform raw ETIMEDOUT into specific TLS/CPU starvation Error
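The data-driven structure listed above could look roughly like the following sketch. It is framework-free so that it runs on its own; `transformStub`, `cases`, and `runCases` are hypothetical names, and the stub only stands in for the real logic in util.makeRequest.

```typescript
const EXPECTED_MESSAGE =
  'Request or TLS handshake timed out. This may be due to CPU starvation or a temporary network issue.';

// Hypothetical stand-in for the logic under test.
function transformStub(err: Error): Error {
  return /econnreset|etimedout|timed out|tls handshake/i.test(err.message)
    ? new Error(EXPECTED_MESSAGE)
    : err;
}

// One table row per interception condition.
const cases: Array<{name: string; raw: Error}> = [
  {name: 'ECONNRESET', raw: new Error('read ECONNRESET')},
  {name: 'TLS handshake', raw: new Error('TLS handshake failed')},
  {name: 'generic "timed out"', raw: new Error('request timed out')},
  {name: 'ETIMEDOUT', raw: new Error('connect ETIMEDOUT')},
];

// Runs every row through `transform` and returns the names of failing rows.
function runCases(transform: (err: Error) => Error): string[] {
  const failures: string[] = [];
  cases.forEach(({name, raw}) => {
    if (transform(raw).message !== EXPECTED_MESSAGE) {
      failures.push(name);
    }
  });
  return failures;
}
```

In a Mocha-style suite, the body of the `forEach` would typically be one `it(...)` call per row, so each condition still reports as its own test case.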

Tests Changed? No existing tests were modified.

Breaking Changes? No breaking changes are necessary.

Additional Information

  • Error Object Change: The transformation logic was simplified to augment a standard JavaScript Error object.
  • Test Structure: The tests were consolidated into a single forEach loop for improved clarity and maintainability.
  • Stubbing: The authClient was stubbed to guarantee successful authorization, forcing execution into the network path where the error injection and transformation occur, preventing test timeouts.

Checklist

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease
  • Appropriate docs were updated
  • Appropriate comments were added, particularly in complex areas or places that require background
  • No new warnings or issues will be generated from this change

Fixes #

Transforms raw network errors (ECONNRESET, ETIMEDOUT, timed out, and TLS
handshake) into a specific ApiError (code 408) with a descriptive
message regarding potential CPU starvation.

This prevents misleading error propagation from the underlying request
library.
@product-auto-label product-auto-label bot added size: m Pull request size is medium. api: storage Issues related to the googleapis/nodejs-storage API. labels Oct 7, 2025
@thiyaguk09 thiyaguk09 changed the title Feat/improve tls error handling fix: Transform network failures into specific TLS timeout ApiError Oct 7, 2025
@thiyaguk09 thiyaguk09 marked this pull request as ready for review October 9, 2025 04:40
@thiyaguk09 thiyaguk09 requested review from a team as code owners October 9, 2025 04:40
@ddelgrosso1 ddelgrosso1 added the owlbot:run Add this label to trigger the Owlbot post processor. label Oct 14, 2025
@gcf-owl-bot gcf-owl-bot bot removed the owlbot:run Add this label to trigger the Owlbot post processor. label Oct 14, 2025
@ddelgrosso1
Contributor

General comment: this logic will need to be ported to Gaxios in the future.

Splits network error handling: uses 408 for timeouts (timed out,
ETIMEDOUT, TLS handshake) and 503 for connection resets (ECONNRESET) to
improve retry logic accuracy.

let message: string;
if (err.message.toLowerCase().includes('econnreset')) {
  // ECONNRESET (Connection reset by peer) implies temporary service unavailability
  code = 503;
Contributor

We shouldn't be forcing these to have an HTTP error code. Things like ECONNRESET are usually indicative of an underlying TCP / socket level issue. They are not necessarily an error returned from the server.

Contributor Author

That's an insightful point! You're right, TCP/socket errors like ECONNRESET are fundamental network issues and shouldn't be incorrectly treated as HTTP status codes. They reflect a connection failure, not necessarily an application-level server error.

I will check on it and come back.

Contributor Author

Because I was throwing an ApiError, an error code had to be supplied. I understand your point, so I have replaced ApiError with a standard Error.

Converts raw ECONNRESET, ETIMEDOUT, and TLS handshake failures into a
standard Error object with an informative message. This helps diagnose
CPU starvation or misleading 401 errors.

Replaces repetitive test cases in `makeAuthenticatedRequest` and
`makeRequest` with a single, data-driven test loop. This verifies all
conditions (ECONNRESET, ETIMEDOUT, "timed out", "TLS handshake") with
reduced code duplication and improved maintenance.
@thiyaguk09 thiyaguk09 changed the title fix: Transform network failures into specific TLS timeout ApiError fix: Transform network failures into specific TLS timeout Oct 28, 2025
@thiyaguk09
Contributor Author

This is a gentle reminder to please take a look when you have a moment.

@ddelgrosso1
Contributor

I'm not really sure why we are forcing things such as ECONNRESET, ETIMEDOUT into a TLS error. They may or may not be related to TLS. I think this gives a false impression to the end user. I think we need to rethink what it is we are trying to accomplish here.

Labels

api: storage Issues related to the googleapis/nodejs-storage API. size: m Pull request size is medium.

4 participants