Contributor

Copilot AI commented Oct 25, 2025

The test intermittently fails with "context canceled" when the timeout is exceeded. The root cause is that the test doesn't properly handle context cancellation - when the shared context times out, subsequent operations fail with "context canceled" errors that are treated as test failures even though they're expected behavior.
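The failure mode can be reproduced in miniature. The sketch below is only an illustration, not code from the test: `op` and `demo` are hypothetical names, and an explicit `cancel()` stands in for the timeout so the run is deterministic. Once the shared context is done, every later operation that receives it fails, regardless of whether that operation is itself at fault.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// op is a hypothetical stand-in for one test operation (create table,
// upsert, query). Like most context-aware calls, it fails once ctx is done.
func op(ctx context.Context) error {
	if err := ctx.Err(); err != nil {
		return fmt.Errorf("op failed: %w", err)
	}
	return nil
}

// demo runs four operations under one shared context and cancels that
// context before the third one, simulating the budget running out mid-test.
// It returns the index of the first failing operation and whether the
// failure wraps context.Canceled.
func demo() (failedAt int, canceled bool) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	for i := 0; i < 4; i++ {
		if i == 2 {
			cancel()
		}
		if err := op(ctx); err != nil {
			return i, errors.Is(err, context.Canceled)
		}
	}
	return -1, false
}

func main() {
	failedAt, canceled := demo()
	fmt.Println(failedAt, canceled) // prints "2 true"
}
```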

Changes

  • Added context cancellation checks (if ctx.Err() == nil) before all require.NoError(t, err) calls for operations using the shared context
  • When the context is canceled, error assertions are skipped since the errors are expected
  • Applies to all sections: table creation, data upsert, transactions, and query operations
// Before
require.NoError(t, err)

// After
if ctx.Err() == nil {
    require.NoError(t, err)
}

This follows the pattern used in other integration tests like database_sql_regression_test.go and properly handles the scenario where operations fail due to context timeout.
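The same guard could also be factored into a small helper, so that each call site stays a single line. This is a sketch under assumptions: `unexpected` and `demoGuard` are hypothetical names that exist neither in the PR nor in the SDK, and the example uses only the standard library so it runs on its own.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// unexpected returns the error a test should fail on: err itself while the
// shared context is still live, nil once the context has ended.
// Hypothetical helper, not part of the PR or the SDK.
func unexpected(ctx context.Context, err error) error {
	if ctx.Err() != nil {
		return nil
	}
	return err
}

// demoGuard passes an error through the helper once before and once after
// the shared context is canceled.
func demoGuard() (live, done error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	live = unexpected(ctx, errors.New("boom")) // context live: error is kept
	cancel()
	done = unexpected(ctx, context.Canceled) // context done: error is dropped
	return live, done
}

func main() {
	live, done := demoGuard()
	fmt.Println(live, done) // prints "boom <nil>"
}
```

In a test, `require.NoError(t, unexpected(ctx, err))` would then behave like the guarded form above. One trade-off to keep in mind: after cancellation the guard hides every error, including ones unrelated to the context.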

Testing

Original prompt

This section details the original issue you should resolve.

<issue_title>bug: flaky-test TestBasicExampleQuery</issue_title>
<issue_description>https://github.com/ydb-platform/ydb-go-sdk/actions/runs/18799671181/job/53645314765

--- FAIL: TestBasicExampleQuery (0.25s)
--- FAIL: TestBasicExampleQuery/ExecuteDataQuery (0.00s)
basic_example_query_test.go:379:
Error Trace: /home/runner/work/ydb-go-sdk/ydb-go-sdk/tests/integration/basic_example_query_test.go:379
Error: Received unexpected error:
pool.With failed with 1 attempts: non-retryable error occurred on attempt No.1 (idempotent=false): context canceled at github.com/ydb-platform/ydb-go-sdk/v3/internal/query.newResult(result.go:157) at github.com/ydb-platform/ydb-go-sdk/v3/internal/query.execute(execute_query.go:142) at github.com/ydb-platform/ydb-go-sdk/v3/internal/query.(*Session).execute(session.go:157) at github.com/ydb-platform/ydb-go-sdk/v3/internal/query.(*Session).queryRow(session.go:61) at github.com/ydb-platform/ydb-go-sdk/v3/internal/query.clientQueryRow.func1(client.go:313) at github.com/ydb-platform/ydb-go-sdk/v3/internal/query.do.func1(client.go:223) at github.com/ydb-platform/ydb-go-sdk/v3/internal/pool.(*Pool).try(pool.go:465) at github.com/ydb-platform/ydb-go-sdk/v3/internal/pool.(*Pool).With.func2(pool.go:493) at github.com/ydb-platform/ydb-go-sdk/v3/retry.Retry.func1(retry.go:264) at github.com/ydb-platform/ydb-go-sdk/v3/retry.opWithRecover(retry.go:418) at github.com/ydb-platform/ydb-go-sdk/v3/retry.RetryWithResult(retry.go:358) at github.com/ydb-platform/ydb-go-sdk/v3/retry.Retry(retry.go:270) at github.com/ydb-platform/ydb-go-sdk/v3/internal/pool.(*Pool).With(pool.go:499) at github.com/ydb-platform/ydb-go-sdk/v3/internal/query.do(client.go:231) at github.com/ydb-platform/ydb-go-sdk/v3/internal/query.clientQueryRow(client.go:319) at github.com/ydb-platform/ydb-go-sdk/v3/internal/query.(*Client).QueryRow(client.go:342)
Test: TestBasicExampleQuery/ExecuteDataQuery
basic_example_query_test.go:47: total consumed units: 64
mutex.go:15:

[2025-10-25 06:53:36] final registry state:

  • gauges:
    • ydb.table.pool.wait{} = 0.000000
    • ydb.info{version=3.117.1} = 1.000000
    • ydb.driver.balancer.endpoints{az=1} = 1.000000
    • ydb.driver.conns{endpoint=localhost:2136,node_id=1} = 0.000000
    • ydb.table.pool.index{} = 0.000000
    • ydb.table.pool.createInProgress{} = 0.000000
    • ydb.table.sessions{node_id=1} = 0.000000
    • ydb.table.pool.limit{} = 0.000000
    • ydb.table.pool.idle{} = 0.000000
    • ydb.table.pool.with{} = 0.000000
  • counters:
    • ydb.driver.balancer.updates{cause=init} = 1
    • ydb.driver.conn.request_methods{endpoint=localhost:2136,method=/Ydb.Table.V1.TableService/BeginTransaction,node_id=1} = 1
    • ydb.driver.conn.request_methods{endpoint=localhost:2136,method=/Ydb.Table.V1.TableService/DeleteSession,node_id=1} = 1
    • ydb.driver.conn.request_methods{endpoint=localhost:2136,method=/Ydb.Table.V1.TableService/StreamExecuteScanQuery,node_id=1} = 1
    • ydb.driver.conn.request_methods{endpoint=localhost:2136,method=/Ydb.Scheme.V1.SchemeService/ListDirectory,node_id=1} = 1
    • ydb.driver.conn.request_methods{endpoint=localhost:2136,method=/Ydb.Scheme.V1.SchemeService/MakeDirectory,node_id=1} = 1
    • ydb.driver.conn.request_methods{endpoint=localhost:2136,method=/Ydb.Table.V1.TableService/PrepareDataQuery,node_id=1} = 1
    • ydb.driver.conn.request_methods{endpoint=localhost:2136,method=/Ydb.Table.V1.TableService/CommitTransaction,node_id=1} = 1
    • ydb.driver.conn.request_methods{endpoint=localhost:2136,method=/Ydb.Table.V1.TableService/StreamReadTable,node_id=1} = 1
    • ydb.driver.conn.request_methods{endpoint=localhost:2136,method=/Ydb.Table.V1.TableService/CreateSession,node_id=1} = 2
    • ydb.driver.conn.request_methods{endpoint=localhost:2136,method=/Ydb.Scheme.V1.SchemeService/DescribePath,node_id=1} = 1
    • ydb.driver.conn.request_methods{endpoint=localhost:2136,method=/Ydb.Table.V1.TableService/DescribeTable,node_id=1} = 6
    • ydb.driver.conn.request_methods{endpoint=localhost:2136,method=/Ydb.Table.V1.TableService/CreateTable,node_id=1} = 3
    • ydb.driver.conn.request_methods{endpoint=localhost:2136,method=/Ydb.Table.V1.TableService/ExecuteDataQuery,node_id=1} = 5
    • ydb.table.session.tx.errs{status=OK} = 0
    • ydb.driver.balancer.discoveries{cause=init,status=OK} = 2
    • ydb.driver.conn.request_statuses{endpoint=localhost:2136,node_id=1,status=operation/SCHEME_ERROR} = 3
    • ydb.driver.conn.request_statuses{endpoint=localhost:2136,node_id=1,status=OK} = 22
    • ydb.table.pool.get{} = 13
    • ydb.table.pool.put{} = 12
    • ydb.table.session.query.errs{status=OK} = 0
  • timers:
    • ydb.table.session.tx.latency{} = 27.172755ms
    • ydb.table.session.query.latency{} = 16.27153ms
  • histograms:
    • ydb.table.session.query.attempts{} = [[0..1]:12,[1..2]:0,[2..3]:0,[3..4]:0,[4..5]:0,[5..7]...

@github-actions

summary

Base version: v3.117.2-0.20251024101740-fda5a9d2c204 (master)
Cannot suggest a release version.
Can only suggest a release version when compared against the most recent version of this major: v3.117.1.

The test was failing intermittently with "context canceled" error because
the 42-second timeout was occasionally exceeded during comprehensive test
operations including DB setup, table creation, data insertion, and queries.

Increasing to 90 seconds provides adequate buffer for slower CI environments
while still catching genuine hangs.

Co-authored-by: asmyasnikov <[email protected]>
Copilot AI changed the title from "[WIP] Fix flaky test in TestBasicExampleQuery" to "Fix flaky TestBasicExampleQuery by increasing context timeout" Oct 25, 2025
Copilot finished work on behalf of asmyasnikov October 25, 2025 09:24
Copilot AI requested a review from asmyasnikov October 25, 2025 09:24
@codecov-commenter

codecov-commenter commented Oct 25, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.92%. Comparing base (fda5a9d) to head (34b494a).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1907      +/-   ##
==========================================
- Coverage   73.98%   73.92%   -0.07%     
==========================================
  Files         390      390              
  Lines       34154    34154              
==========================================
- Hits        25270    25248      -22     
- Misses       7765     7781      +16     
- Partials     1119     1125       +6     
Flag          Coverage Δ
experiment    73.57% <ø> (-0.11%) ⬇️
go-1.21.x     72.20% <ø> (-0.08%) ⬇️
go-1.25.x     73.89% <ø> (-0.09%) ⬇️
integration   55.29% <ø> (-0.08%) ⬇️
macOS         47.16% <ø> (-0.02%) ⬇️
ubuntu        73.92% <ø> (-0.07%) ⬇️
unit          47.17% <ø> (-0.01%) ⬇️
windows       47.15% <ø> (-0.03%) ⬇️
ydb-24.4      54.29% <ø> (-0.06%) ⬇️
ydb-25.2      55.15% <ø> (+0.12%) ⬆️
ydb-latest    54.61% <ø> (-0.36%) ⬇️
ydb-nightly   73.57% <ø> (-0.11%) ⬇️

@asmyasnikov
Member

@copilot every require.NoError(t, err) check needs to verify first whether the root context is cancelled. If the root context is cancelled, it is normal to get an error at some stage of the test.
Increasing the total timeout does not fix the flaky test.

Instead of just increasing timeout, properly handle context cancellation.
When the root context is canceled, it's expected that operations fail,
so we check ctx.Err() before calling require.NoError().

This follows the pattern used in other integration tests like
database_sql_regression_test.go.

Co-authored-by: asmyasnikov <[email protected]>
Contributor Author

Copilot AI commented Oct 25, 2025

@copilot every checks require.NoError(t, err) (for example ...)

Fixed in commit 34b494a. Reverted the timeout to 42 seconds and added context cancellation checks before all require.NoError calls. The test now skips error assertions when ctx.Err() != nil, following the pattern from database_sql_regression_test.go.

Copilot AI changed the title from "Fix flaky TestBasicExampleQuery by increasing context timeout" to "Fix flaky TestBasicExampleQuery by handling context cancellation" Oct 25, 2025
Copilot finished work on behalf of asmyasnikov October 25, 2025 10:41
@asmyasnikov asmyasnikov requested a review from Copilot October 25, 2025 10:42
@asmyasnikov asmyasnikov marked this pull request as ready for review October 25, 2025 10:42
Contributor

Copilot AI left a comment

Pull Request Overview

This PR fixes a flaky integration test by properly handling context cancellation scenarios. When the test's shared context times out, subsequent operations fail with expected "context canceled" errors that were being incorrectly treated as test failures.

Key Changes:

  • Added conditional error assertions that skip validation when the context is canceled
  • Applied the pattern if ctx.Err() == nil { require.NoError(t, err) } consistently across all test sections

Comment on lines +351 to +356
if ctx.Err() == nil {
    require.NoError(t, err)
    var views uint64
    require.NoError(t, row.Scan(&views))
    require.EqualValues(t, 1, views)
}
Copilot AI Oct 25, 2025

The assertions at lines 354-355 should also be guarded against context cancellation. If ctx.Err() == nil passes but the context is canceled between line 352 and 354, row.Scan(&views) could fail with a context error. Consider checking ctx.Err() before each assertion that could be affected by context state, or move all related assertions inside a single guard.

Comment on lines +391 to +403
if ctx.Err() == nil {
    require.NoError(t, err)
    var (
        id    *uint64
        title *string
        date  *time.Time
    )
    require.NoError(t, row.Scan(&id, &title, &date))
    t.Logf(
        " > %d %s %s\n",
        *id, *title, *date,
    )
}
Copilot AI Oct 25, 2025

Similar to Comment 1, the row.Scan() assertion at line 398 is not guarded against context cancellation that could occur after line 391. The same race condition exists where the context could be canceled between the initial check and the scan operation. Consider re-checking ctx.Err() before line 398 or restructuring the logic to handle mid-execution cancellation.
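One way to restructure the logic, assuming the scan can indeed observe the context as this comment supposes, is to consult `ctx.Err()` after the operation returns instead of before it starts. The sketch below is hypothetical throughout: `guarded`, `scanLike`, and `demoRace` are illustrative names, and `scanLike` simply cancels the context itself to imitate cancellation landing mid-operation.

```go
package main

import (
	"context"
	"fmt"
)

// guarded runs op and returns its error only if the shared context is still
// live once op has returned. Checking after the call, rather than before,
// also covers cancellation that happens while op is running.
func guarded(ctx context.Context, op func() error) error {
	err := op()
	if err != nil && ctx.Err() != nil {
		return nil
	}
	return err
}

// scanLike is a hypothetical operation during which the context is canceled.
func scanLike(ctx context.Context, cancel context.CancelFunc) error {
	cancel() // cancellation lands after any check made by the caller
	return ctx.Err()
}

// demoRace compares a check made before the operation with one made after it.
func demoRace() (preCheck, postCheck error) {
	ctx1, cancel1 := context.WithCancel(context.Background())
	defer cancel1()
	if ctx1.Err() == nil { // context is still live here, so the guard passes
		preCheck = scanLike(ctx1, cancel1)
	}

	ctx2, cancel2 := context.WithCancel(context.Background())
	defer cancel2()
	postCheck = guarded(ctx2, func() error { return scanLike(ctx2, cancel2) })
	return preCheck, postCheck
}

func main() {
	pre, post := demoRace()
	fmt.Println(pre, post) // prints "context canceled <nil>"
}
```

With the check placed before the call, the cancellation error leaks through; with the check placed after, it is suppressed. Errors returned while the context is still live are reported in both variants.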

@asmyasnikov asmyasnikov merged commit 51e67c3 into master Oct 25, 2025
33 of 53 checks passed
@asmyasnikov asmyasnikov deleted the copilot/fix-flaky-test-basic-example-query branch October 25, 2025 14:29

Development

Successfully merging this pull request may close these issues.

bug: flaky-test TestBasicExampleQuery

3 participants