Skip to content

Conversation

@lilyLuLiu
Copy link
Contributor

@lilyLuLiu lilyLuLiu commented Oct 22, 2025

Description

Fixes: #N

Relates to: #N, PR #N, ...

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Feature (non-breaking change which adds functionality)
  • [] Breaking change (fix or feature that would cause existing functionality to change
  • Chore (non-breaking change which doesn't affect codebase;
    test, version modification, documentation, etc.)

Proposed changes

Testing

Contribution Checklist

  • I Keep It Small and Simple: The smaller the PR is, the easier it is to review and have it merged
  • I have performed a self-review of my code
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Which platform have you tested the code changes on?
    • Linux
    • Windows
    • MacOS

Summary by CodeRabbit

  • Tests

    • Added performance tags to end-to-end scenarios.
    • Removed CPU data checks while keeping memory checks in flows.
    • Record precise timestamps at deployment, stop, and restart points for improved timing analysis.
  • Chores

    • Introduced a new background monitoring and timestamping mechanism to support performance runs and result logging.

@openshift-ci openshift-ci bot requested review from adrianriobo and albfan October 22, 2025 08:04
@coderabbitai
Copy link

coderabbitai bot commented Oct 22, 2025

Walkthrough

Replaces per-step CPU checks with a background Monitor that samples CPU periodically, adds timestamp recording steps in feature scenarios, tags scenarios with @performance, and updates test hooks to start/stop the Monitor while preserving memory data checks.

Changes

Cohort / File(s) Summary
Feature file updates
test/e2e/features/story_microshift.feature, test/e2e/features/story_openshift.feature
Removed per-step CPU data checks, retained memory checks, added @performance tag, and inserted record timestamp steps around deployment, stop, and restart points.
Test bindings & hooks
test/e2e/testsuite/testsuite.go
Removed CPU step binding and getCPUdata(); added getTimestamp() step to record timestamps to test-results/time-stamp.txt; initialize Monitor in scenario setup and start/stop it in @performance hooks with a 1s sampling interval.
Performance monitor implementation
test/e2e/testsuite/performance.go
Added exported Monitor type with NewMonitor(interval), Start(), Stop() and internal collectLoop() that periodically collects CPU usage via gopsutil and appends timestamped samples to test-results/cpu-consume.txt.

Sequence Diagram(s)

sequenceDiagram
    participant Feature
    participant Hooks
    participant Monitor
    participant FS as "File System"

    Note over Hooks,Monitor `#DDEBF7`: `@performance` scenario lifecycle

    Feature->>Hooks: Scenario with `@performance` begins
    Hooks->>Monitor: Start()
    Monitor->>Monitor: spawn collectLoop (periodic CPU sample)
    Monitor->>FS: append cpu sample -> cpu-consume.txt

    Feature->>FS: record timestamp "deployment" -> time-stamp.txt
    Feature->>FS: collect memory snapshot -> memory-*.txt

    Feature->>FS: record timestamp "stop" -> time-stamp.txt
    Feature->>Hooks: Scenario ends
    Hooks->>Monitor: Stop()
    Monitor->>Monitor: cancel context, wait for loop exit
    Monitor->>FS: final samples flushed
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Pay attention to goroutine lifecycle, context cancellation, and WaitGroup usage in performance.go.
  • Verify mutex/state handling to prevent double Start/Stop races.
  • Confirm file paths and formatting for cpu-consume.txt and time-stamp.txt in testsuite.go.
  • Validate step regex replacement and hook ordering in testsuite.go relative to feature timestamps.

Possibly related PRs

Suggested labels

lgtm

Suggested reviewers

  • albfan
  • adrianriobo
  • anjannath

Poem

🐰 I hopped through tests with ears aflutter,

I logged each tick and skipped the clutter.
A Monitor hums while I sip carrot tea,
Timestamps dancing, CPU kept free —
Happy hops for stable telemetry! 🥕

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description check ⚠️ Warning The description uses the repository template but lacks critical details: no concrete explanation of changes, no linked issues, empty Proposed changes section, and unchecked test verification boxes. Fill in Description section with plain English explanation of what was solved; add linked issue numbers; list specific proposed changes in Proposed changes section; verify and mark testing boxes accordingly.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: adding CPU usage monitoring to e2e test cases through a new Monitor mechanism and timestamp collection.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Nitpick comments (4)
test/e2e/testsuite/performace.go (3)

15-15: Minor: Fix typos in comments.

Multiple comments contain typos:

  • Line 15: "cancle" should be "cancel"
  • Line 41: "stary" should be "start"
  • Line 53: "cancle" should be "cancel"
  • Line 70: "cancled" should be "cancelled"

Apply these corrections:

-	// cancle flag of monitoring
+	// cancel flag of monitoring
 	cancelFunc context.CancelFunc

-	// stary goroutine
+	// start goroutine
 	go m.collectLoop(ctx)

-	// call cancle function
+	// call cancel function
 	if m.cancelFunc != nil {

-		// 1. check Context whether be cancled
+		// 1. check Context whether be cancelled
 		select {

Also applies to: 41-41, 53-53, 70-70


34-34: Minor: Missing newline characters in fmt.Printf calls.

Lines 34 and 84 use fmt.Printf without a trailing newline, which can cause output formatting issues.

Add newline characters:

-	fmt.Printf("Attempt to start CPU collector, interval: %s", m.interval)
+	fmt.Printf("Attempt to start CPU collector, interval: %s\n", m.interval)

-		fmt.Printf("Error: fail to collect CPU data: %v", err)
+		fmt.Printf("Error: fail to collect CPU data: %v\n", err)

Also applies to: 84-84


6-12: Minor: Fix import formatting and comment spacing.

Static analysis indicates the file is not properly formatted. Run goimports and gofmt to fix import ordering and the missing space in the comment on line 11.

Based on static analysis hints, run:

#!/bin/bash
goimports -w test/e2e/testsuite/performace.go
gofmt -w test/e2e/testsuite/performace.go
test/e2e/testsuite/testsuite.go (1)

1344-1345: Minor: Remove unnecessary blank lines.

Lines 1344-1345 contain unnecessary blank lines that violate Go formatting conventions.

Remove the extra blank lines:

 	return util.WriteToFile(data, file)
 }
-
-
-
 func getTimestamp(content string) error {
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8d2c93c and 6b5199c.

📒 Files selected for processing (4)
  • test/e2e/features/story_microshift.feature (3 hunks)
  • test/e2e/features/story_openshift.feature (2 hunks)
  • test/e2e/testsuite/performace.go (1 hunks)
  • test/e2e/testsuite/testsuite.go (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
test/e2e/testsuite/testsuite.go (2)
test/e2e/testsuite/performace.go (1)
  • NewMonitor (23-27)
test/extended/util/fileops.go (1)
  • WriteToFile (93-108)
test/e2e/testsuite/performace.go (1)
test/extended/util/fileops.go (1)
  • WriteToFile (93-108)
🪛 GitHub Check: build (ubuntu-22.04, 1.24)
test/e2e/testsuite/testsuite.go

[failure] 269-269:
printf: fmt.Println call has possible Printf formatting directive %v (govet)


[failure] 400-400:
printf: fmt.Println call has possible Printf formatting directive %v (govet)


[failure] 1352-1352:
printf: fmt.Println call has possible Printf formatting directive %v (govet)


[failure] 1345-1345:
File is not properly formatted (gofmt)


[failure] 1355-1355:
Error return value of util.WriteToFile is not checked (errcheck)

test/e2e/testsuite/performace.go

[failure] 8-8:
File is not properly formatted (goimports)


[failure] 6-6:
File is not properly formatted (gofmt)


[failure] 11-11:
commentFormatting: put a space between // and comment text (gocritic)


[failure] 94-94:
Error return value of util.WriteToFile is not checked (errcheck)

🪛 GitHub Check: build (ubuntu-latest, 1.24)
test/e2e/testsuite/testsuite.go

[failure] 269-269:
printf: fmt.Println call has possible Printf formatting directive %v (govet)


[failure] 400-400:
printf: fmt.Println call has possible Printf formatting directive %v (govet)


[failure] 1352-1352:
printf: fmt.Println call has possible Printf formatting directive %v (govet)


[failure] 1345-1345:
File is not properly formatted (gofmt)


[failure] 1355-1355:
Error return value of util.WriteToFile is not checked (errcheck)

test/e2e/testsuite/performace.go

[failure] 8-8:
File is not properly formatted (goimports)


[failure] 6-6:
File is not properly formatted (gofmt)


[failure] 11-11:
commentFormatting: put a space between // and comment text (gocritic)


[failure] 94-94:
Error return value of util.WriteToFile is not checked (errcheck)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: build (macOS-14, 1.24)
  • GitHub Check: build (macOS-13, 1.24)
  • GitHub Check: build-qe (windows, amd64)
  • GitHub Check: build-qe (linux, arm64)
  • GitHub Check: build-installer (windows-2022, 1.24)
  • GitHub Check: build (windows-2022, 1.24)
  • GitHub Check: Run OKD bundle with crc (1.24)
  • GitHub Check: build-qe (darwin, arm64)
  • GitHub Check: build-qe (linux, amd64)
  • GitHub Check: build-qe (darwin, amd64)
  • GitHub Check: build (ubuntu-latest, 1.24)
  • GitHub Check: verify-devcontainer
  • GitHub Check: build (macOS-13, 1.24)
  • GitHub Check: Konflux kflux-prd-rh02 / crc-binary-on-pull-request
🔇 Additional comments (1)
test/e2e/testsuite/testsuite.go (1)

171-171: The monitor lifecycle concerns are addressed—code is correct.

Verification confirms:

  1. All scenarios with @performance tag properly manage Start/Stop (only 1 scenario uses it, both in conditional hooks)
  2. Monitor cleanup happens even on scenario failure—Godog's After hook executes regardless of outcome
  3. The 5-second wait is intentional, documented in the code itself: "Wait for 5 seconds to confirm that the collection task will no longer output data"

Note: There is a minor bug at lines 268 and 399 using fmt.Println with format strings instead of fmt.Printf, which means error messages won't display correctly. This is separate from the lifecycle concerns.

Likely an incorrect or invalid review comment.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
test/e2e/features/story_microshift.feature (1)

22-41: Remove orphan step: “get cpu data” no longer has a step definition.

CPU collection moved to background Monitor; this step will fail at runtime.

Apply:

-		When with up to "20" retries with wait period of "5s" http response from "http://httpd-example-testproj.apps.crc.testing" has status code "200"
-		And get cpu data "After deployment"
+		When with up to "20" retries with wait period of "5s" http response from "http://httpd-example-testproj.apps.crc.testing" has status code "200"
+		# CPU data is collected by the background monitor; explicit CPU step removed
♻️ Duplicate comments (2)
test/e2e/testsuite/performance.go (1)

98-104: Handle filesystem errors when writing samples (Getwd/WriteToFile).

Errors are ignored; causes silent data loss and CI failures. Same concern was raised earlier.

Apply:

-			wd, _ := os.Getwd()
-			file := filepath.Join(wd, "../test-results/cpu-consume.txt")
-			util.WriteToFile(data, file)
+			wd, err := os.Getwd()
+			if err != nil {
+				fmt.Printf("Error: failed to get working directory: %v\n", err)
+				continue
+			}
+			file := filepath.Join(wd, "../test-results/cpu-consume.txt")
+			if err := os.MkdirAll(filepath.Dir(file), 0o755); err != nil {
+				fmt.Printf("Error: failed to create results dir: %v\n", err)
+				continue
+			}
+			if err := util.WriteToFile(data, file); err != nil {
+				fmt.Printf("Error: failed to write CPU data: %v\n", err)
+			}
test/e2e/testsuite/testsuite.go (1)

1346-1355: Return error from getTimestamp; fix printf and check WriteToFile (matches prior review).

Currently errors are dropped and fmt.Println uses a formatting verb.

Apply:

-func getTimestamp(content string) {
+func getTimestamp(content string) error {
 	data := fmt.Sprintf("[%s], %s\n",
 		time.Now().Format("15:04:05"), content)
 	wd, err := os.Getwd()
 	if err != nil {
-		fmt.Println("failed to get working directory: %v", err)
+		return fmt.Errorf("failed to get working directory: %v", err)
 	}
 	file := filepath.Join(wd, "../test-results/time-stamp.txt")
-	util.WriteToFile(data, file)
+	if err := os.MkdirAll(filepath.Dir(file), 0o755); err != nil {
+		return fmt.Errorf("failed to create results dir: %v", err)
+	}
+	return util.WriteToFile(data, file)
 }

Also update callers to handle the returned error (see Before hook change above).

🧹 Nitpick comments (2)
test/e2e/testsuite/performance.go (1)

38-51: Minor logging/comment polish.

Add newline to Printf; fix typos in comments (“start goroutine”, “cancel”, “canceled”). No behavior change.

-	fmt.Printf("Attempt to start CPU collector, interval: %s", m.interval)
+	fmt.Printf("Attempt to start CPU collector, interval: %s\n", m.interval)
-	//  create a context.WithCancel
+	// create a context.WithCancel
-	// stary goroutine
+	// start goroutine
test/e2e/testsuite/testsuite.go (1)

399-405: Nice fix on Printf; consider removing the extra 5s sleep.

Stop waits for the collector goroutine (wg.Wait), so an additional fixed 5s delay slows tests without benefit. If you want a guard, sleep for at most one collection interval.

-				fmt.Printf("Collection has stopped. Wait for 5 seconds to confirm that the collection task will no longer output data\n")
-				time.Sleep(5 * time.Second)
+				// Optional: allow up to one interval for any buffered I/O to flush
+				// time.Sleep(1 * time.Second)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6b5199c and 5a315a8.

📒 Files selected for processing (4)
  • test/e2e/features/story_microshift.feature (3 hunks)
  • test/e2e/features/story_openshift.feature (2 hunks)
  • test/e2e/testsuite/performance.go (1 hunks)
  • test/e2e/testsuite/testsuite.go (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/e2e/features/story_openshift.feature
🧰 Additional context used
🧬 Code graph analysis (2)
test/e2e/testsuite/performance.go (1)
test/extended/util/fileops.go (1)
  • WriteToFile (93-108)
test/e2e/testsuite/testsuite.go (2)
test/e2e/testsuite/performance.go (1)
  • NewMonitor (24-28)
test/extended/util/fileops.go (1)
  • WriteToFile (93-108)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
  • GitHub Check: build (windows-2022, 1.24)
  • GitHub Check: Run OKD bundle with crc (1.24)
  • GitHub Check: build (ubuntu-latest, 1.24)
  • GitHub Check: Konflux kflux-prd-rh02 / crc-binary-on-pull-request
  • GitHub Check: build-qe (windows, amd64)
  • GitHub Check: build-qe (darwin, amd64)
  • GitHub Check: build (ubuntu-22.04, 1.24)
  • GitHub Check: build-qe (darwin, arm64)
  • GitHub Check: build-qe (linux, arm64)
  • GitHub Check: build-qe (linux, amd64)
  • GitHub Check: build (ubuntu-latest, 1.24)
  • GitHub Check: build (macOS-14, 1.24)
  • GitHub Check: build (macOS-13, 1.24)
  • GitHub Check: build (macOS-14, 1.24)
  • GitHub Check: build-installer (windows-2022, 1.24)
  • GitHub Check: verify-devcontainer
🔇 Additional comments (3)
test/e2e/features/story_microshift.feature (1)

11-11: Timestamp step addition looks good.

test/e2e/testsuite/testsuite.go (2)

171-171: Monitor instance scope: confirm no parallel scenarios.

Single monitor captured by closures; parallel execution would interleave Start/Stop. If parallelism is possible, create one per scenario (in Before) or guard with a package-level singleton and a refcount.


587-589: Step should propagate failures from timestamp writer.

After updating getTimestamp to return error (see below), keeping this binding will surface I/O issues to Godog.

@lilyLuLiu lilyLuLiu force-pushed the main branch 2 times, most recently from 776af4f to 7bcaf7c Compare October 24, 2025 03:30
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
test/e2e/testsuite/testsuite.go (1)

1350-1359: Return error immediately when os.Getwd() fails.

When os.Getwd() fails (line 1354), the code prints an error but continues to use the invalid working directory to construct the file path. This will cause util.WriteToFile to fail. Instead, return the error immediately, consistent with how getMemoryData handles the same error (lines 1342-1345).

Apply this diff:

 func getTimestamp(content string) error {
 	data := fmt.Sprintf("[%s], %s\n",
 		time.Now().Format("15:04:05"), content)
 	wd, err := os.Getwd()
 	if err != nil {
-		fmt.Printf("failed to get working directory: %v\n", err)
+		return fmt.Errorf("failed to get working directory: %v", err)
 	}
 	file := filepath.Join(wd, "../test-results/time-stamp.txt")
 	return util.WriteToFile(data, file)
 }
test/e2e/testsuite/performance.go (1)

95-104: Handle os.Getwd() error properly.

When os.Getwd() fails (line 98), the code continues with an invalid path, which will cause util.WriteToFile to fail. Instead of ignoring this error, handle it and continue to the next iteration.

Apply this diff:

 		if len(totalPercent) > 0 {
 			data := fmt.Sprintf("[%s], cpu percent: %.2f%%\n",
 				time.Now().Format("15:04:05"), totalPercent[0])
-			wd, _ := os.Getwd()
+			wd, err := os.Getwd()
+			if err != nil {
+				fmt.Printf("Error: failed to get working directory: %v\n", err)
+				continue
+			}
 			file := filepath.Join(wd, "../test-results/cpu-consume.txt")
 			err := util.WriteToFile(data, file)
 			if err != nil {
 				fmt.Printf("Error: fail to write to %s: %v", file, err)
 			}
 		}
🧹 Nitpick comments (7)
test/e2e/testsuite/performance.go (4)

37-37: Add trailing newline to log message.

The log message on line 37 is missing a trailing newline, which will cause the next log output to appear on the same line.

Apply this diff:

-	fmt.Printf("Attempt to start CPU collector, interval: %s", m.interval)
+	fmt.Printf("Attempt to start CPU collector, interval: %s\n", m.interval)

52-67: Release mutex before calling cancelFunc and wg.Wait().

While the current implementation is safe, it's better practice to release the mutex before calling cancelFunc() and wg.Wait(). Holding the mutex during these blocking operations is unnecessary and could cause issues if the cancellation triggers any callbacks that need the mutex.

Apply this diff:

 func (m *Monitor) Stop() error {
 	m.mu.Lock()
-	defer m.mu.Unlock()
 	if !m.isRunning {
+		m.mu.Unlock()
 		return fmt.Errorf("The collector is not running")
 	}
+	cancel := m.cancelFunc
+	m.isRunning = false
+	m.mu.Unlock()
+	
 	if m.cancelFunc != nil {
-		m.cancelFunc()
+		cancel()
 	}
-	m.isRunning = false
 	m.wg.Wait()
 	fmt.Println("CPU collector has sent a stop signal")
-	// may need wait a while to stop
 	return nil
 }

76-76: Fix typo in comment.

"cancled" should be "cancelled".

Apply this diff:

-		// 1. check Context whether be cancled
+		// 1. check Context whether be cancelled

105-117: Remove or document commented code.

The large block of commented-out memory collection code should either be removed if it's no longer needed, or documented with a TODO/NOTE comment explaining why it's kept for future use.

test/e2e/testsuite/testsuite.go (3)

267-272: Consider recording a start timestamp.

Based on past review comments, it was suggested to record a "start" timestamp when entering the @performance scenario. Currently, only a "finish" timestamp is recorded in the After hook (line 400). Recording both start and finish timestamps would provide complete timing data for performance analysis.

Apply this diff:

 		if tag.Name == "@performance" {
-			
+			if err := getTimestamp("start"); err != nil {
+				fmt.Printf("Failed to record start timestamp: %v\n", err)
+			}
 			if err := monitor.Start(); err != nil {
 				fmt.Printf("Failed to start monitor: %v\n", err)
 			}
 		}

400-403: Improve error message wording.

The error message "Failed to get finish getTimestamp" is awkwardly phrased. Consider rephrasing to "Failed to record finish timestamp" for clarity.

Apply this diff:

 			err := getTimestamp("finish")
 			if err != nil {
-				fmt.Printf("Failed to get finish getTimestamp: %v\n", err)
+				fmt.Printf("Failed to record finish timestamp: %v\n", err)
 			}

407-407: Use fmt.Println instead of fmt.Printf for plain strings.

When printing a plain string without format specifiers, use fmt.Println instead of fmt.Printf for consistency and clarity.

Apply this diff:

-			fmt.Printf("Collection has stopped. Wait for 5 seconds to confirm that the collection task will no longer output data\n")
+			fmt.Println("Collection has stopped. Wait for 5 seconds to confirm that the collection task will no longer output data")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5a315a8 and 7bcaf7c.

📒 Files selected for processing (4)
  • test/e2e/features/story_microshift.feature (3 hunks)
  • test/e2e/features/story_openshift.feature (2 hunks)
  • test/e2e/testsuite/performance.go (1 hunks)
  • test/e2e/testsuite/testsuite.go (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/e2e/features/story_microshift.feature
  • test/e2e/features/story_openshift.feature
🧰 Additional context used
🧬 Code graph analysis (2)
test/e2e/testsuite/performance.go (1)
test/extended/util/fileops.go (1)
  • WriteToFile (93-108)
test/e2e/testsuite/testsuite.go (2)
test/e2e/testsuite/performance.go (1)
  • NewMonitor (23-27)
test/extended/util/fileops.go (1)
  • WriteToFile (93-108)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
  • GitHub Check: verify-devcontainer
  • GitHub Check: build (ubuntu-latest, 1.24)
  • GitHub Check: build-qe (linux, amd64)
  • GitHub Check: build-qe (linux, arm64)
  • GitHub Check: build-qe (windows, amd64)
  • GitHub Check: build-qe (darwin, arm64)
  • GitHub Check: build-qe (darwin, amd64)
  • GitHub Check: build-installer (windows-2022, 1.24)
  • GitHub Check: Run OKD bundle with crc (1.24)
  • GitHub Check: build (macOS-14, 1.24)
  • GitHub Check: build (macOS-13, 1.24)
  • GitHub Check: build (windows-2022, 1.24)
  • GitHub Check: build (windows-2022, 1.24)
  • GitHub Check: build (macOS-14, 1.24)
  • GitHub Check: build (ubuntu-22.04, 1.24)
  • GitHub Check: build (ubuntu-latest, 1.24)
  • GitHub Check: build (macOS-13, 1.24)
  • GitHub Check: Konflux kflux-prd-rh02 / crc-binary-on-pull-request

@lilyLuLiu lilyLuLiu force-pushed the main branch 3 times, most recently from e725ad3 to f9dd334 Compare October 24, 2025 07:09
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
test/e2e/testsuite/testsuite.go (1)

1354-1363: Critical: Return immediately when os.Getwd() fails.

If os.Getwd() fails at line 1358, the error is logged but the function continues to construct a file path with a potentially empty wd variable. This leads to an invalid path being passed to util.WriteToFile, and the root cause error is lost, making debugging difficult.

Apply this diff to return the error immediately:

 func getTimestamp(content string) error {
 	data := fmt.Sprintf("[%s], %s\n",
 		time.Now().Format("15:04:05"), content)
 	wd, err := os.Getwd()
 	if err != nil {
-		fmt.Printf("failed to get working directory: %v\n", err)
+		return fmt.Errorf("failed to get working directory: %v", err)
 	}
 	file := filepath.Join(wd, "../test-results/time-stamp.txt")
 	return util.WriteToFile(data, file)
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e725ad3 and f9dd334.

📒 Files selected for processing (4)
  • test/e2e/features/story_microshift.feature (2 hunks)
  • test/e2e/features/story_openshift.feature (2 hunks)
  • test/e2e/testsuite/performance.go (1 hunks)
  • test/e2e/testsuite/testsuite.go (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/e2e/testsuite/performance.go
  • test/e2e/features/story_openshift.feature
🧰 Additional context used
🧬 Code graph analysis (1)
test/e2e/testsuite/testsuite.go (2)
test/e2e/testsuite/performance.go (1)
  • NewMonitor (23-27)
test/extended/util/fileops.go (1)
  • WriteToFile (93-108)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: verify-devcontainer
  • GitHub Check: build (ubuntu-latest, 1.24)
  • GitHub Check: build (windows-2022, 1.24)
  • GitHub Check: build (macOS-14, 1.24)
  • GitHub Check: build (macOS-13, 1.24)
  • GitHub Check: build (macOS-13, 1.24)
  • GitHub Check: build-qe (linux, arm64)
  • GitHub Check: build (ubuntu-22.04, 1.24)
  • GitHub Check: build (macOS-14, 1.24)
  • GitHub Check: build (ubuntu-latest, 1.24)
  • GitHub Check: Run OKD bundle with crc (1.24)
  • GitHub Check: build-qe (linux, amd64)
  • GitHub Check: build-qe (windows, amd64)
  • GitHub Check: build-qe (darwin, arm64)
  • GitHub Check: build-qe (darwin, amd64)
  • GitHub Check: build-installer (windows-2022, 1.24)
  • GitHub Check: Konflux kflux-prd-rh02 / crc-binary-on-pull-request
🔇 Additional comments (4)
test/e2e/features/story_microshift.feature (1)

20-43: LGTM! Timestamp recording steps are well-placed.

The addition of the @performance tag and timestamp recording steps aligns with the PR's objective to implement timestamp-based performance monitoring. The placement of timestamps around deployment, stop, and restart operations is logical and will enable performance analysis of these critical operations.

test/e2e/testsuite/testsuite.go (3)

171-171: LGTM! Monitor initialization is appropriate.

The monitor is correctly initialized with a 1-second sampling interval and scoped to the scenario lifecycle.


403-413: LGTM! Performance monitoring teardown is correctly implemented.

The cleanup sequence properly records the finish timestamp, stops the monitor with appropriate error handling, and includes a verification delay.


595-596: LGTM! Step binding is correctly defined.

The regex pattern and function binding are appropriate for capturing timestamp labels from the feature files.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
test/e2e/features/story_microshift.feature (1)

44-48: Consider adding memory data collection after restart for consistency.

Memory data is collected at key checkpoints: "After start" (line 11), "After deployment" (line 37), and "After stop" (line 42). However, there's no memory data collection after the restart on line 44. For comprehensive performance monitoring and consistency, consider adding a memory checkpoint after the CRC restart.

Apply this diff to add a memory checkpoint after restart:

 		And record timestamp "start again"
 		And starting CRC with default bundle succeeds
+		And get memory data "After restart"
 		And checking that CRC is running
 		And with up to "4" retries with wait period of "1m" http response from "http://httpd-example-testproj.apps.crc.testing" has status code "200"
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f9dd334 and 83e8fa6.

📒 Files selected for processing (4)
  • test/e2e/features/story_microshift.feature (2 hunks)
  • test/e2e/features/story_openshift.feature (2 hunks)
  • test/e2e/testsuite/performance.go (1 hunks)
  • test/e2e/testsuite/testsuite.go (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • test/e2e/testsuite/performance.go
  • test/e2e/testsuite/testsuite.go
  • test/e2e/features/story_openshift.feature
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: build-qe (linux, arm64)
  • GitHub Check: build-qe (darwin, arm64)
  • GitHub Check: build-qe (linux, amd64)
  • GitHub Check: build-qe (windows, amd64)
  • GitHub Check: build-qe (darwin, amd64)
  • GitHub Check: build-installer (windows-2022, 1.24)
  • GitHub Check: Run OKD bundle with crc (1.24)
  • GitHub Check: build (ubuntu-22.04, 1.24)
  • GitHub Check: build (ubuntu-latest, 1.24)
  • GitHub Check: build (macOS-14, 1.24)
  • GitHub Check: build (ubuntu-latest, 1.24)
  • GitHub Check: verify-devcontainer
  • GitHub Check: build (windows-2022, 1.24)
  • GitHub Check: build (windows-2022, 1.24)
  • GitHub Check: build (macOS-14, 1.24)
  • GitHub Check: build (macOS-13, 1.24)
  • GitHub Check: Konflux kflux-prd-rh02 / crc-binary-on-pull-request
🔇 Additional comments (2)
test/e2e/features/story_microshift.feature (2)

20-20: LGTM: @performance tag added for CPU monitoring.

The addition of the @performance tag aligns with the PR objective to monitor CPU usage in E2E tests. This tag triggers the Monitor lifecycle to sample CPU data periodically during scenario execution.


22-22: No issues found. The timestamp labels are correctly handled.

The getTimestamp function implementation (testsuite.go:1357-1366) shows that the timestamp label parameter is only used for string formatting when writing to a fixed log file (../test-results/time-stamp.txt). Spaces in labels like "start again" are handled correctly—they appear as plain text in the timestamp log without any adverse effects. The label is not used in file naming, identifier derivation, or path operations where spaces might cause issues.

Likely an incorrect or invalid review comment.

@lilyLuLiu lilyLuLiu force-pushed the main branch 3 times, most recently from d403855 to 5ae7224 Compare October 27, 2025 06:41
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d403855 and 5ae7224.

📒 Files selected for processing (2)
  • .github/workflows/windows-e2e.yml (1 hunks)
  • .github/workflows/windows-qe-tpl.yml (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/windows-qe-tpl.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
  • GitHub Check: build-qe (linux, arm64)
  • GitHub Check: build-qe (darwin, arm64)
  • GitHub Check: build-qe (windows, amd64)
  • GitHub Check: build-qe (darwin, amd64)
  • GitHub Check: build-qe (linux, amd64)
  • GitHub Check: build (macOS-14, 1.24)
  • GitHub Check: build (macOS-13, 1.24)
  • GitHub Check: build (windows-2022, 1.24)
  • GitHub Check: build (ubuntu-latest, 1.24)
  • GitHub Check: build (macOS-14, 1.24)
  • GitHub Check: build (macOS-13, 1.24)
  • GitHub Check: build (ubuntu-latest, 1.24)
  • GitHub Check: verify-devcontainer
  • GitHub Check: build (ubuntu-22.04, 1.24)
  • GitHub Check: Run OKD bundle with crc (1.24)
  • GitHub Check: Konflux kflux-prd-rh02 / crc-binary-on-pull-request
🔇 Additional comments (1)
.github/workflows/windows-e2e.yml (1)

12-14: Verify: Expanding workflow trigger to all events (not just PRs).

The conditional logic was changed to remove the github.event.workflow_run.event == 'pull_request' check. This means the workflow now runs on all workflow_run completion events, including direct pushes to branches and other non-PR events.

Confirm this is intentional, as it may:

  • Increase CI costs by running more frequently
  • Execute in contexts where PR metadata may not be available
  • Require validation that the template correctly handles non-PR scenarios

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
.github/workflows/windows-e2e.yml (1)

15-15: Regression: Workflow template still references personal fork instead of official repository.

This is a regression of a previous critical issue. Line 15 references lilyLuLiu/crc/, which breaks reproducibility and prevents other contributors from running this workflow without access to the personal fork.

Apply this diff to reference the official repository:

-    uses: lilyLuLiu/crc/.github/workflows/windows-qe-tpl.yml@main
+    uses: crc-org/crc/.github/workflows/windows-qe-tpl.yml@main

Ensure windows-qe-tpl.yml is committed to crc-org/crc before merging this PR.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5ae7224 and de43424.

📒 Files selected for processing (2)
  • .github/workflows/windows-e2e.yml (1 hunks)
  • .github/workflows/windows-qe-tpl.yml (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
  • GitHub Check: build-qe (linux, arm64)
  • GitHub Check: build-qe (darwin, amd64)
  • GitHub Check: build-qe (windows, amd64)
  • GitHub Check: build-qe (linux, amd64)
  • GitHub Check: build (macOS-14, 1.24)
  • GitHub Check: build (ubuntu-22.04, 1.24)
  • GitHub Check: build-installer (windows-2022, 1.24)
  • GitHub Check: build (ubuntu-latest, 1.24)
  • GitHub Check: build (macOS-14, 1.24)
  • GitHub Check: build (macOS-13, 1.24)
  • GitHub Check: build (macOS-13, 1.24)
  • GitHub Check: build (windows-2022, 1.24)
  • GitHub Check: build (ubuntu-latest, 1.24)
  • GitHub Check: Run OKD bundle with crc (1.24)
  • GitHub Check: verify-devcontainer
  • GitHub Check: Konflux kflux-prd-rh02 / crc-binary-on-pull-request
🔇 Additional comments (2)
.github/workflows/windows-qe-tpl.yml (2)

6-18: Workflow dispatch trigger is well-structured.

The workflow_dispatch inputs are clearly defined, properly typed, and mirror the workflow_call inputs below. This enables flexible manual triggering for testing and debugging.


129-131: Remove debug commands (ls -lh, whoami) that add unnecessary output to CI logs.

Lines 129–130 contain diagnostic commands that appear to be troubleshooting artifacts and should be removed. They serve no documented purpose in the workflow.

Line 131's sudo chown runner:runner host username id_rsa is necessary and correct—it ensures the runner user can read the connection details files created by the mapt container in the subsequent step ("Install CRC on host"), which reads these files via $(cat host) and $(cat username).

@openshift-ci
Copy link

openshift-ci bot commented Oct 28, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign gbraad for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
test/e2e/testsuite/testsuite.go (1)

267-279: Fix misleading error message.

Line 273 reports "Failed to get finish getTimestamp" when recording the "start" timestamp, which will confuse debugging efforts.

Apply this diff:

 		if tag.Name == "@performance" {
 			if err := monitor.Start(); err != nil {
 				fmt.Printf("Failed to start monitor: %v\n", err)
 			}
 			err := getTimestamp("start")
 			if err != nil {
-				fmt.Printf("Failed to get finish getTimestamp: %v\n", err)
+				fmt.Printf("Failed to record start timestamp: %v\n", err)
 			}
 			err = getMemoryData("Before start")
 			if err != nil {
 				fmt.Printf("Failed to get memory data: %v\n", err)
 			}
 		}
🧹 Nitpick comments (2)
test/e2e/testsuite/testsuite.go (1)

1357-1366: Suggest early return on os.Getwd error.

While the current implementation will eventually return an error if os.Getwd() fails (via WriteToFile returning an error), it's cleaner and more explicit to return immediately when the working directory cannot be determined.

Apply this diff:

 func getTimestamp(content string) error {
 	data := fmt.Sprintf("[%s], %s\n",
 		time.Now().Format("15:04:05"), content)
 	wd, err := os.Getwd()
 	if err != nil {
-		fmt.Printf("failed to get working directory: %v\n", err)
+		return fmt.Errorf("failed to get working directory: %v", err)
 	}
 	file := filepath.Join(wd, "../test-results/time-stamp.txt")
 	return util.WriteToFile(data, file)
 }
test/e2e/testsuite/performance.go (1)

109-122: Optional: Consider removing commented-out code.

The commented memory collection code suggests it was evaluated but not needed (since getMemoryData handles memory sampling at specific test points). Consider removing it to reduce clutter, or add a comment explaining why it's preserved.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between de43424 and 3a39ca8.

📒 Files selected for processing (4)
  • test/e2e/features/story_microshift.feature (2 hunks)
  • test/e2e/features/story_openshift.feature (2 hunks)
  • test/e2e/testsuite/performance.go (1 hunks)
  • test/e2e/testsuite/testsuite.go (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/e2e/features/story_openshift.feature
🧰 Additional context used
🧬 Code graph analysis (2)
test/e2e/testsuite/performance.go (1)
test/extended/util/fileops.go (1)
  • WriteToFile (93-108)
test/e2e/testsuite/testsuite.go (2)
test/e2e/testsuite/performance.go (1)
  • NewMonitor (23-27)
test/extended/util/fileops.go (1)
  • WriteToFile (93-108)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
  • GitHub Check: build (macOS-13, 1.24)
  • GitHub Check: build (macOS-14, 1.24)
  • GitHub Check: build (ubuntu-latest, 1.24)
  • GitHub Check: Run OKD bundle with crc (1.24)
  • GitHub Check: build-qe (darwin, amd64)
  • GitHub Check: build-qe (linux, amd64)
  • GitHub Check: build-qe (windows, amd64)
  • GitHub Check: build (ubuntu-22.04, 1.24)
  • GitHub Check: build (macOS-13, 1.24)
  • GitHub Check: build-qe (linux, arm64)
  • GitHub Check: verify-devcontainer
  • GitHub Check: build (macOS-14, 1.24)
  • GitHub Check: build-installer (windows-2022, 1.24)
  • GitHub Check: build (ubuntu-latest, 1.24)
  • GitHub Check: build-qe (darwin, arm64)
  • GitHub Check: build (windows-2022, 1.24)
  • GitHub Check: build (windows-2022, 1.24)
  • GitHub Check: Konflux kflux-prd-rh02 / crc-binary-on-pull-request
🔇 Additional comments (6)
test/e2e/features/story_microshift.feature (1)

20-48: LGTM! Performance monitoring integration is well-structured.

The addition of the @performance tag and timestamp recording steps at key points (deployment, stop, start again) aligns well with the new background monitoring approach. The removal of inline CPU data checks in favor of continuous monitoring is a cleaner design.

test/e2e/testsuite/testsuite.go (3)

171-171: LGTM! Appropriate sampling interval.

The 1-second interval for CPU monitoring strikes a good balance between data granularity and overhead.


406-416: LGTM! Proper cleanup sequence.

The After hook correctly records the finish timestamp, stops the monitor, and includes a reasonable grace period for the collector goroutine to complete.


598-599: LGTM! Step binding is correct.

The timestamp recording step is properly bound to the getTimestamp function.

test/e2e/testsuite/performance.go (2)

15-27: LGTM! Well-designed Monitor structure.

The Monitor type includes appropriate synchronization primitives (mutex, wait group) and lifecycle management (context cancellation). The constructor correctly initializes only the interval, relying on zero values for other fields.


52-67: LGTM! Stop method is properly synchronized.

The Stop method correctly uses mutex protection and ensures the collector goroutine has fully terminated before returning. While calling wg.Wait() under the lock is somewhat unusual, it's safe here since collectLoop doesn't acquire the mutex, and it prevents races between Stop and subsequent Start calls.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
test/e2e/testsuite/testsuite.go (2)

267-279: Fix misleading error message for start timestamp.

Line 273 reports "Failed to get finish getTimestamp" when recording the "start" timestamp, making debugging confusing. This was flagged in previous reviews but not fully corrected.

Apply this diff:

 		err := getTimestamp("start")
 		if err != nil {
-			fmt.Printf("Failed to get finish getTimestamp: %v\n", err)
+			fmt.Printf("Failed to record start timestamp: %v\n", err)
 		}

1357-1366: Critical: Handle os.Getwd error properly to prevent invalid file paths.

If os.Getwd() fails at line 1361, the function prints an error but continues executing with an empty wd string, resulting in an invalid file path at line 1364. This can cause silent failures or write data to unexpected locations.

Apply this diff:

 func getTimestamp(content string) error {
 	data := fmt.Sprintf("[%s], %s\n",
 		time.Now().Format("15:04:05"), content)
 	wd, err := os.Getwd()
 	if err != nil {
-		fmt.Printf("failed to get working directory: %v\n", err)
+		return fmt.Errorf("failed to get working directory: %v", err)
 	}
 	file := filepath.Join(wd, "../test-results/time-stamp.txt")
 	return util.WriteToFile(data, file)
 }
🧹 Nitpick comments (3)
test/e2e/testsuite/performance.go (3)

44-44: Fix typo in comment.

"stary" should be "start".

-	// stary goroutine
+	// start goroutine
 	m.wg.Add(1)

76-76: Fix typo in comment.

"cancled" should be "cancelled".

-		// 1. check Context whether be cancled
+		// 1. check Context whether be cancelled
 		select {

52-67: Consider releasing mutex before waiting on WaitGroup.

Lines 53-63: The current implementation calls m.wg.Wait() while holding the mutex lock. While this is safe because collectLoop doesn't acquire the mutex, releasing the lock before waiting is generally better practice and would allow concurrent calls to Stop() to fail-fast rather than block.

 func (m *Monitor) Stop() error {
 	m.mu.Lock()
-	defer m.mu.Unlock()
 
 	if !m.isRunning {
+		m.mu.Unlock()
 		return fmt.Errorf("The collector is not running")
 	}
+	
+	// Copy cancelFunc and update state while holding lock
+	cancelFunc := m.cancelFunc
+	m.isRunning = false
+	m.mu.Unlock()
+	
+	// Call cancel and wait without holding the lock
-	if m.cancelFunc != nil {
-		m.cancelFunc()
+	if cancelFunc != nil {
+		cancelFunc()
 	}
-	m.isRunning = false
 	m.wg.Wait()
+	
 	fmt.Println("CPU collector has sent a stop signal")
-	// may need wait a while to stop
 	return nil
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3a39ca8 and ecf86bd.

📒 Files selected for processing (4)
  • test/e2e/features/story_microshift.feature (2 hunks)
  • test/e2e/features/story_openshift.feature (2 hunks)
  • test/e2e/testsuite/performance.go (1 hunks)
  • test/e2e/testsuite/testsuite.go (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/e2e/features/story_microshift.feature
🧰 Additional context used
🧬 Code graph analysis (2)
test/e2e/testsuite/performance.go (1)
test/extended/util/fileops.go (1)
  • WriteToFile (93-108)
test/e2e/testsuite/testsuite.go (2)
test/e2e/testsuite/performance.go (1)
  • NewMonitor (23-27)
test/extended/util/fileops.go (1)
  • WriteToFile (93-108)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: build (windows-2022, 1.24)
  • GitHub Check: build (macOS-14, 1.24)
  • GitHub Check: build (macOS-13, 1.24)
  • GitHub Check: Konflux kflux-prd-rh02 / crc-binary-on-pull-request
🔇 Additional comments (5)
test/e2e/features/story_openshift.feature (1)

12-41: LGTM! Timestamp-based monitoring approach is well-integrated.

The addition of the @performance tag and replacement of CPU data checks with timestamp recording steps aligns well with the new background Monitor approach. Memory data collection is appropriately retained at key scenario points.

test/e2e/testsuite/testsuite.go (2)

171-171: LGTM! Monitor initialization is appropriate.

Creating the monitor with a 1-second sampling interval provides good granularity for CPU usage tracking in e2e tests.


598-601: LGTM! Step binding correctly updated.

The step binding change from CPU data collection to timestamp recording aligns with the new monitoring architecture.

test/e2e/testsuite/performance.go (2)

15-67: LGTM! Monitor lifecycle management is well-implemented.

The mutex-protected state transitions, context cancellation, and WaitGroup coordination provide proper control over the background collector goroutine. Error handling for invalid state transitions (starting when running, stopping when not running) is appropriate.


69-123: LGTM! CPU data collection loop is robust.

The collectLoop properly handles context cancellation, collection errors, and I/O errors. The use of cpu.Percent(calcInterval, false) with the interval as the measurement duration eliminates the need for explicit sleep calls, and error recovery with 1-second retry delays prevents tight error loops.

@openshift-ci
Copy link

openshift-ci bot commented Nov 7, 2025

@lilyLuLiu: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/security ecf86bd link false /test security
ci/prow/e2e-microshift-crc ecf86bd link true /test e2e-microshift-crc
ci/prow/integration-crc ecf86bd link true /test integration-crc
ci/prow/e2e-crc ecf86bd link true /test e2e-crc

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Comment on lines +414 to +415
fmt.Printf("Collection has stopped. Wait for 5 seconds to confirm that the collection task will no longer output data\n")
time.Sleep(5 * time.Second)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if possible, better avoid any explicit (long) Sleep
maybe something like

while ! monitor.Finished():
  sleep 0.1

or even, assume it will finish safely and just skip the sleep?

fmt.Printf("Error: failed to get working directory: %v\n", err)
continue
}
file := filepath.Join(wd, "../test-results/cpu-consume.txt")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering about this hardcoded path. Won't new calls overwrite the old data?
or add a comment to say that this is safe, because the current directory is guaranteed to be clean

(and I'm surprized by the initial ../, I usually prefer when everything is stored inside the same directory (but it again depends on how . is controlled))

calcInterval := m.interval

for {
// 1. check Context whether be cancled
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

m.cancelFunc = cancel
m.isRunning = true

// stary goroutine
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

Comment on lines +109 to +121
/*
vMem, err := mem.VirtualMemory()
if err != nil {
fmt.Printf("Error: failed to collect memory data: %v", err)
continue
}
memoryused := vMem.Used / 1024 / 1024
data := fmt.Sprintf("[%s], MemoryUsed(MB): %d\n" ,
time.Now().Format("15:04:05"), memoryused)
wd, _ := os.Getwd()
file := filepath.Join(wd, "../test-results/memory-consume.txt")
util.WriteToFile(data, file)
*/
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment should be removed, it's not the write place to measure the memory usage,
this is after the blocking call to cpu.Percent(calcInterval, false), and I think there's usually no need to collect the memory as often as the CPU usage

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Work In Progress

Development

Successfully merging this pull request may close these issues.

2 participants