-
Notifications
You must be signed in to change notification settings - Fork 256
[QE]monitor cpu usage in e2e test case #4977
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughReplaces per-step CPU checks with a background Monitor that samples CPU periodically, adds timestamp recording steps in feature scenarios, tags scenarios with Changes
Sequence Diagram(s)sequenceDiagram
participant Feature
participant Hooks
participant Monitor
participant FS as "File System"
Note over Hooks,Monitor `#DDEBF7`: `@performance` scenario lifecycle
Feature->>Hooks: Scenario with `@performance` begins
Hooks->>Monitor: Start()
Monitor->>Monitor: spawn collectLoop (periodic CPU sample)
Monitor->>FS: append cpu sample -> cpu-consume.txt
Feature->>FS: record timestamp "deployment" -> time-stamp.txt
Feature->>FS: collect memory snapshot -> memory-*.txt
Feature->>FS: record timestamp "stop" -> time-stamp.txt
Feature->>Hooks: Scenario ends
Hooks->>Monitor: Stop()
Monitor->>Monitor: cancel context, wait for loop exit
Monitor->>FS: final samples flushed
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
🧹 Nitpick comments (4)
test/e2e/testsuite/performace.go (3)
15-15: Minor: Fix typos in comments.Multiple comments contain typos:
- Line 15: "cancle" should be "cancel"
- Line 41: "stary" should be "start"
- Line 53: "cancle" should be "cancel"
- Line 70: "cancled" should be "cancelled"
Apply these corrections:
- // cancle flag of monitoring + // cancel flag of monitoring cancelFunc context.CancelFunc - // stary goroutine + // start goroutine go m.collectLoop(ctx) - // call cancle function + // call cancel function if m.cancelFunc != nil { - // 1. check Context whether be cancled + // 1. check Context whether be cancelled select {Also applies to: 41-41, 53-53, 70-70
34-34: Minor: Missing newline characters in fmt.Printf calls.Lines 34 and 84 use
fmt.Printfwithout a trailing newline, which can cause output formatting issues.Add newline characters:
- fmt.Printf("Attempt to start CPU collector, interval: %s", m.interval) + fmt.Printf("Attempt to start CPU collector, interval: %s\n", m.interval) - fmt.Printf("Error: fail to collect CPU data: %v", err) + fmt.Printf("Error: fail to collect CPU data: %v\n", err)Also applies to: 84-84
6-12: Minor: Fix import formatting and comment spacing.Static analysis indicates the file is not properly formatted. Run
goimportsandgofmtto fix import ordering and the missing space in the comment on line 11.Based on static analysis hints, run:
#!/bin/bash goimports -w test/e2e/testsuite/performace.go gofmt -w test/e2e/testsuite/performace.gotest/e2e/testsuite/testsuite.go (1)
1344-1345: Minor: Remove unnecessary blank lines.Lines 1344-1345 contain unnecessary blank lines that violate Go formatting conventions.
Remove the extra blank lines:
return util.WriteToFile(data, file) } - - - func getTimestamp(content string) error {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
test/e2e/features/story_microshift.feature(3 hunks)test/e2e/features/story_openshift.feature(2 hunks)test/e2e/testsuite/performace.go(1 hunks)test/e2e/testsuite/testsuite.go(5 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
test/e2e/testsuite/testsuite.go (2)
test/e2e/testsuite/performace.go (1)
NewMonitor(23-27)test/extended/util/fileops.go (1)
WriteToFile(93-108)
test/e2e/testsuite/performace.go (1)
test/extended/util/fileops.go (1)
WriteToFile(93-108)
🪛 GitHub Check: build (ubuntu-22.04, 1.24)
test/e2e/testsuite/testsuite.go
[failure] 269-269:
printf: fmt.Println call has possible Printf formatting directive %v (govet)
[failure] 400-400:
printf: fmt.Println call has possible Printf formatting directive %v (govet)
[failure] 1352-1352:
printf: fmt.Println call has possible Printf formatting directive %v (govet)
[failure] 1345-1345:
File is not properly formatted (gofmt)
[failure] 1355-1355:
Error return value of util.WriteToFile is not checked (errcheck)
test/e2e/testsuite/performace.go
[failure] 8-8:
File is not properly formatted (goimports)
[failure] 6-6:
File is not properly formatted (gofmt)
[failure] 11-11:
commentFormatting: put a space between // and comment text (gocritic)
[failure] 94-94:
Error return value of util.WriteToFile is not checked (errcheck)
🪛 GitHub Check: build (ubuntu-latest, 1.24)
test/e2e/testsuite/testsuite.go
[failure] 269-269:
printf: fmt.Println call has possible Printf formatting directive %v (govet)
[failure] 400-400:
printf: fmt.Println call has possible Printf formatting directive %v (govet)
[failure] 1352-1352:
printf: fmt.Println call has possible Printf formatting directive %v (govet)
[failure] 1345-1345:
File is not properly formatted (gofmt)
[failure] 1355-1355:
Error return value of util.WriteToFile is not checked (errcheck)
test/e2e/testsuite/performace.go
[failure] 8-8:
File is not properly formatted (goimports)
[failure] 6-6:
File is not properly formatted (gofmt)
[failure] 11-11:
commentFormatting: put a space between // and comment text (gocritic)
[failure] 94-94:
Error return value of util.WriteToFile is not checked (errcheck)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: build (macOS-14, 1.24)
- GitHub Check: build (macOS-13, 1.24)
- GitHub Check: build-qe (windows, amd64)
- GitHub Check: build-qe (linux, arm64)
- GitHub Check: build-installer (windows-2022, 1.24)
- GitHub Check: build (windows-2022, 1.24)
- GitHub Check: Run OKD bundle with crc (1.24)
- GitHub Check: build-qe (darwin, arm64)
- GitHub Check: build-qe (linux, amd64)
- GitHub Check: build-qe (darwin, amd64)
- GitHub Check: build (ubuntu-latest, 1.24)
- GitHub Check: verify-devcontainer
- GitHub Check: build (macOS-13, 1.24)
- GitHub Check: Konflux kflux-prd-rh02 / crc-binary-on-pull-request
🔇 Additional comments (1)
test/e2e/testsuite/testsuite.go (1)
171-171: The monitor lifecycle concerns are addressed—code is correct.Verification confirms:
- All scenarios with
@performancetag properly manage Start/Stop (only 1 scenario uses it, both in conditional hooks)- Monitor cleanup happens even on scenario failure—Godog's After hook executes regardless of outcome
- The 5-second wait is intentional, documented in the code itself: "Wait for 5 seconds to confirm that the collection task will no longer output data"
Note: There is a minor bug at lines 268 and 399 using
fmt.Printlnwith format strings instead offmt.Printf, which means error messages won't display correctly. This is separate from the lifecycle concerns.Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
test/e2e/features/story_microshift.feature (1)
22-41: Remove orphan step: “get cpu data” no longer has a step definition.CPU collection moved to background Monitor; this step will fail at runtime.
Apply:
- When with up to "20" retries with wait period of "5s" http response from "http://httpd-example-testproj.apps.crc.testing" has status code "200" - And get cpu data "After deployment" + When with up to "20" retries with wait period of "5s" http response from "http://httpd-example-testproj.apps.crc.testing" has status code "200" + # CPU data is collected by the background monitor; explicit CPU step removed
♻️ Duplicate comments (2)
test/e2e/testsuite/performance.go (1)
98-104: Handle filesystem errors when writing samples (Getwd/WriteToFile).Errors are ignored; causes silent data loss and CI failures. Same concern was raised earlier.
Apply:
- wd, _ := os.Getwd() - file := filepath.Join(wd, "../test-results/cpu-consume.txt") - util.WriteToFile(data, file) + wd, err := os.Getwd() + if err != nil { + fmt.Printf("Error: failed to get working directory: %v\n", err) + continue + } + file := filepath.Join(wd, "../test-results/cpu-consume.txt") + if err := os.MkdirAll(filepath.Dir(file), 0o755); err != nil { + fmt.Printf("Error: failed to create results dir: %v\n", err) + continue + } + if err := util.WriteToFile(data, file); err != nil { + fmt.Printf("Error: failed to write CPU data: %v\n", err) + }test/e2e/testsuite/testsuite.go (1)
1346-1355: Return error from getTimestamp; fix printf and check WriteToFile (matches prior review).Currently errors are dropped and fmt.Println uses a formatting verb.
Apply:
-func getTimestamp(content string) { +func getTimestamp(content string) error { data := fmt.Sprintf("[%s], %s\n", time.Now().Format("15:04:05"), content) wd, err := os.Getwd() if err != nil { - fmt.Println("failed to get working directory: %v", err) + return fmt.Errorf("failed to get working directory: %v", err) } file := filepath.Join(wd, "../test-results/time-stamp.txt") - util.WriteToFile(data, file) + if err := os.MkdirAll(filepath.Dir(file), 0o755); err != nil { + return fmt.Errorf("failed to create results dir: %v", err) + } + return util.WriteToFile(data, file) }Also update callers to handle the returned error (see Before hook change above).
🧹 Nitpick comments (2)
test/e2e/testsuite/performance.go (1)
38-51: Minor logging/comment polish.Add newline to Printf; fix typos in comments (“start goroutine”, “cancel”, “canceled”). No behavior change.
- fmt.Printf("Attempt to start CPU collector, interval: %s", m.interval) + fmt.Printf("Attempt to start CPU collector, interval: %s\n", m.interval) - // create a context.WithCancel + // create a context.WithCancel - // stary goroutine + // start goroutinetest/e2e/testsuite/testsuite.go (1)
399-405: Nice fix on Printf; consider removing the extra 5s sleep.Stop waits for the collector goroutine (wg.Wait), so an additional fixed 5s delay slows tests without benefit. If you want a guard, sleep for at most one collection interval.
- fmt.Printf("Collection has stopped. Wait for 5 seconds to confirm that the collection task will no longer output data\n") - time.Sleep(5 * time.Second) + // Optional: allow up to one interval for any buffered I/O to flush + // time.Sleep(1 * time.Second)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
test/e2e/features/story_microshift.feature(3 hunks)test/e2e/features/story_openshift.feature(2 hunks)test/e2e/testsuite/performance.go(1 hunks)test/e2e/testsuite/testsuite.go(5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- test/e2e/features/story_openshift.feature
🧰 Additional context used
🧬 Code graph analysis (2)
test/e2e/testsuite/performance.go (1)
test/extended/util/fileops.go (1)
WriteToFile(93-108)
test/e2e/testsuite/testsuite.go (2)
test/e2e/testsuite/performance.go (1)
NewMonitor(24-28)test/extended/util/fileops.go (1)
WriteToFile(93-108)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
- GitHub Check: build (windows-2022, 1.24)
- GitHub Check: Run OKD bundle with crc (1.24)
- GitHub Check: build (ubuntu-latest, 1.24)
- GitHub Check: Konflux kflux-prd-rh02 / crc-binary-on-pull-request
- GitHub Check: build-qe (windows, amd64)
- GitHub Check: build-qe (darwin, amd64)
- GitHub Check: build (ubuntu-22.04, 1.24)
- GitHub Check: build-qe (darwin, arm64)
- GitHub Check: build-qe (linux, arm64)
- GitHub Check: build-qe (linux, amd64)
- GitHub Check: build (ubuntu-latest, 1.24)
- GitHub Check: build (macOS-14, 1.24)
- GitHub Check: build (macOS-13, 1.24)
- GitHub Check: build (macOS-14, 1.24)
- GitHub Check: build-installer (windows-2022, 1.24)
- GitHub Check: verify-devcontainer
🔇 Additional comments (3)
test/e2e/features/story_microshift.feature (1)
11-11: Timestamp step addition looks good.test/e2e/testsuite/testsuite.go (2)
171-171: Monitor instance scope: confirm no parallel scenarios.Single monitor captured by closures; parallel execution would interleave Start/Stop. If parallelism is possible, create one per scenario (in Before) or guard with a package-level singleton and a refcount.
587-589: Step should propagate failures from timestamp writer.After updating getTimestamp to return error (see below), keeping this binding will surface I/O issues to Godog.
776af4f to
7bcaf7c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
test/e2e/testsuite/testsuite.go (1)
1350-1359: Return error immediately when os.Getwd() fails.When
os.Getwd()fails (line 1354), the code prints an error but continues to use the invalid working directory to construct the file path. This will causeutil.WriteToFileto fail. Instead, return the error immediately, consistent with howgetMemoryDatahandles the same error (lines 1342-1345).Apply this diff:
func getTimestamp(content string) error { data := fmt.Sprintf("[%s], %s\n", time.Now().Format("15:04:05"), content) wd, err := os.Getwd() if err != nil { - fmt.Printf("failed to get working directory: %v\n", err) + return fmt.Errorf("failed to get working directory: %v", err) } file := filepath.Join(wd, "../test-results/time-stamp.txt") return util.WriteToFile(data, file) }test/e2e/testsuite/performance.go (1)
95-104: Handle os.Getwd() error properly.When
os.Getwd()fails (line 98), the code continues with an invalid path, which will causeutil.WriteToFileto fail. Instead of ignoring this error, handle it and continue to the next iteration.Apply this diff:
if len(totalPercent) > 0 { data := fmt.Sprintf("[%s], cpu percent: %.2f%%\n", time.Now().Format("15:04:05"), totalPercent[0]) - wd, _ := os.Getwd() + wd, err := os.Getwd() + if err != nil { + fmt.Printf("Error: failed to get working directory: %v\n", err) + continue + } file := filepath.Join(wd, "../test-results/cpu-consume.txt") err := util.WriteToFile(data, file) if err != nil { fmt.Printf("Error: fail to write to %s: %v", file, err) } }
🧹 Nitpick comments (7)
test/e2e/testsuite/performance.go (4)
37-37: Add trailing newline to log message.The log message on line 37 is missing a trailing newline, which will cause the next log output to appear on the same line.
Apply this diff:
- fmt.Printf("Attempt to start CPU collector, interval: %s", m.interval) + fmt.Printf("Attempt to start CPU collector, interval: %s\n", m.interval)
52-67: Release mutex before calling cancelFunc and wg.Wait().While the current implementation is safe, it's better practice to release the mutex before calling
cancelFunc()andwg.Wait(). Holding the mutex during these blocking operations is unnecessary and could cause issues if the cancellation triggers any callbacks that need the mutex.Apply this diff:
func (m *Monitor) Stop() error { m.mu.Lock() - defer m.mu.Unlock() if !m.isRunning { + m.mu.Unlock() return fmt.Errorf("The collector is not running") } + cancel := m.cancelFunc + m.isRunning = false + m.mu.Unlock() + if m.cancelFunc != nil { - m.cancelFunc() + cancel() } - m.isRunning = false m.wg.Wait() fmt.Println("CPU collector has sent a stop signal") - // may need wait a while to stop return nil }
76-76: Fix typo in comment."cancled" should be "cancelled".
Apply this diff:
- // 1. check Context whether be cancled + // 1. check Context whether be cancelled
105-117: Remove or document commented code.The large block of commented-out memory collection code should either be removed if it's no longer needed, or documented with a TODO/NOTE comment explaining why it's kept for future use.
test/e2e/testsuite/testsuite.go (3)
267-272: Consider recording a start timestamp.Based on past review comments, it was suggested to record a "start" timestamp when entering the
@performancescenario. Currently, only a "finish" timestamp is recorded in the After hook (line 400). Recording both start and finish timestamps would provide complete timing data for performance analysis.Apply this diff:
if tag.Name == "@performance" { - + if err := getTimestamp("start"); err != nil { + fmt.Printf("Failed to record start timestamp: %v\n", err) + } if err := monitor.Start(); err != nil { fmt.Printf("Failed to start monitor: %v\n", err) } }
400-403: Improve error message wording.The error message "Failed to get finish getTimestamp" is awkwardly phrased. Consider rephrasing to "Failed to record finish timestamp" for clarity.
Apply this diff:
err := getTimestamp("finish") if err != nil { - fmt.Printf("Failed to get finish getTimestamp: %v\n", err) + fmt.Printf("Failed to record finish timestamp: %v\n", err) }
407-407: Use fmt.Println instead of fmt.Printf for plain strings.When printing a plain string without format specifiers, use
fmt.Printlninstead offmt.Printffor consistency and clarity.Apply this diff:
- fmt.Printf("Collection has stopped. Wait for 5 seconds to confirm that the collection task will no longer output data\n") + fmt.Println("Collection has stopped. Wait for 5 seconds to confirm that the collection task will no longer output data")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
test/e2e/features/story_microshift.feature(3 hunks)test/e2e/features/story_openshift.feature(2 hunks)test/e2e/testsuite/performance.go(1 hunks)test/e2e/testsuite/testsuite.go(5 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- test/e2e/features/story_microshift.feature
- test/e2e/features/story_openshift.feature
🧰 Additional context used
🧬 Code graph analysis (2)
test/e2e/testsuite/performance.go (1)
test/extended/util/fileops.go (1)
WriteToFile(93-108)
test/e2e/testsuite/testsuite.go (2)
test/e2e/testsuite/performance.go (1)
NewMonitor(23-27)test/extended/util/fileops.go (1)
WriteToFile(93-108)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
- GitHub Check: verify-devcontainer
- GitHub Check: build (ubuntu-latest, 1.24)
- GitHub Check: build-qe (linux, amd64)
- GitHub Check: build-qe (linux, arm64)
- GitHub Check: build-qe (windows, amd64)
- GitHub Check: build-qe (darwin, arm64)
- GitHub Check: build-qe (darwin, amd64)
- GitHub Check: build-installer (windows-2022, 1.24)
- GitHub Check: Run OKD bundle with crc (1.24)
- GitHub Check: build (macOS-14, 1.24)
- GitHub Check: build (macOS-13, 1.24)
- GitHub Check: build (windows-2022, 1.24)
- GitHub Check: build (windows-2022, 1.24)
- GitHub Check: build (macOS-14, 1.24)
- GitHub Check: build (ubuntu-22.04, 1.24)
- GitHub Check: build (ubuntu-latest, 1.24)
- GitHub Check: build (macOS-13, 1.24)
- GitHub Check: Konflux kflux-prd-rh02 / crc-binary-on-pull-request
e725ad3 to
f9dd334
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
test/e2e/testsuite/testsuite.go (1)
1354-1363: Critical: Return immediately when os.Getwd() fails.If
os.Getwd()fails at line 1358, the error is logged but the function continues to construct a file path with a potentially emptywdvariable. This leads to an invalid path being passed toutil.WriteToFile, and the root cause error is lost, making debugging difficult.Apply this diff to return the error immediately:
func getTimestamp(content string) error { data := fmt.Sprintf("[%s], %s\n", time.Now().Format("15:04:05"), content) wd, err := os.Getwd() if err != nil { - fmt.Printf("failed to get working directory: %v\n", err) + return fmt.Errorf("failed to get working directory: %v", err) } file := filepath.Join(wd, "../test-results/time-stamp.txt") return util.WriteToFile(data, file) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
test/e2e/features/story_microshift.feature(2 hunks)test/e2e/features/story_openshift.feature(2 hunks)test/e2e/testsuite/performance.go(1 hunks)test/e2e/testsuite/testsuite.go(5 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- test/e2e/testsuite/performance.go
- test/e2e/features/story_openshift.feature
🧰 Additional context used
🧬 Code graph analysis (1)
test/e2e/testsuite/testsuite.go (2)
test/e2e/testsuite/performance.go (1)
NewMonitor(23-27)test/extended/util/fileops.go (1)
WriteToFile(93-108)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
- GitHub Check: verify-devcontainer
- GitHub Check: build (ubuntu-latest, 1.24)
- GitHub Check: build (windows-2022, 1.24)
- GitHub Check: build (macOS-14, 1.24)
- GitHub Check: build (macOS-13, 1.24)
- GitHub Check: build (macOS-13, 1.24)
- GitHub Check: build-qe (linux, arm64)
- GitHub Check: build (ubuntu-22.04, 1.24)
- GitHub Check: build (macOS-14, 1.24)
- GitHub Check: build (ubuntu-latest, 1.24)
- GitHub Check: Run OKD bundle with crc (1.24)
- GitHub Check: build-qe (linux, amd64)
- GitHub Check: build-qe (windows, amd64)
- GitHub Check: build-qe (darwin, arm64)
- GitHub Check: build-qe (darwin, amd64)
- GitHub Check: build-installer (windows-2022, 1.24)
- GitHub Check: Konflux kflux-prd-rh02 / crc-binary-on-pull-request
🔇 Additional comments (4)
test/e2e/features/story_microshift.feature (1)
20-43: LGTM! Timestamp recording steps are well-placed.The addition of the
@performancetag and timestamp recording steps aligns with the PR's objective to implement timestamp-based performance monitoring. The placement of timestamps around deployment, stop, and restart operations is logical and will enable performance analysis of these critical operations.test/e2e/testsuite/testsuite.go (3)
171-171: LGTM! Monitor initialization is appropriate.The monitor is correctly initialized with a 1-second sampling interval and scoped to the scenario lifecycle.
403-413: LGTM! Performance monitoring teardown is correctly implemented.The cleanup sequence properly records the finish timestamp, stops the monitor with appropriate error handling, and includes a verification delay.
595-596: LGTM! Step binding is correctly defined.The regex pattern and function binding are appropriate for capturing timestamp labels from the feature files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
test/e2e/features/story_microshift.feature (1)
44-48: Consider adding memory data collection after restart for consistency.Memory data is collected at key checkpoints: "After start" (line 11), "After deployment" (line 37), and "After stop" (line 42). However, there's no memory data collection after the restart on line 44. For comprehensive performance monitoring and consistency, consider adding a memory checkpoint after the CRC restart.
Apply this diff to add a memory checkpoint after restart:
And record timestamp "start again" And starting CRC with default bundle succeeds + And get memory data "After restart" And checking that CRC is running And with up to "4" retries with wait period of "1m" http response from "http://httpd-example-testproj.apps.crc.testing" has status code "200"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
test/e2e/features/story_microshift.feature(2 hunks)test/e2e/features/story_openshift.feature(2 hunks)test/e2e/testsuite/performance.go(1 hunks)test/e2e/testsuite/testsuite.go(5 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- test/e2e/testsuite/performance.go
- test/e2e/testsuite/testsuite.go
- test/e2e/features/story_openshift.feature
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
- GitHub Check: build-qe (linux, arm64)
- GitHub Check: build-qe (darwin, arm64)
- GitHub Check: build-qe (linux, amd64)
- GitHub Check: build-qe (windows, amd64)
- GitHub Check: build-qe (darwin, amd64)
- GitHub Check: build-installer (windows-2022, 1.24)
- GitHub Check: Run OKD bundle with crc (1.24)
- GitHub Check: build (ubuntu-22.04, 1.24)
- GitHub Check: build (ubuntu-latest, 1.24)
- GitHub Check: build (macOS-14, 1.24)
- GitHub Check: build (ubuntu-latest, 1.24)
- GitHub Check: verify-devcontainer
- GitHub Check: build (windows-2022, 1.24)
- GitHub Check: build (windows-2022, 1.24)
- GitHub Check: build (macOS-14, 1.24)
- GitHub Check: build (macOS-13, 1.24)
- GitHub Check: Konflux kflux-prd-rh02 / crc-binary-on-pull-request
🔇 Additional comments (2)
test/e2e/features/story_microshift.feature (2)
20-20: LGTM:@performancetag added for CPU monitoring.The addition of the
@performancetag aligns with the PR objective to monitor CPU usage in E2E tests. This tag triggers the Monitor lifecycle to sample CPU data periodically during scenario execution.
22-22: No issues found. The timestamp labels are correctly handled.The
getTimestampfunction implementation (testsuite.go:1357-1366) shows that the timestamp label parameter is only used for string formatting when writing to a fixed log file (../test-results/time-stamp.txt). Spaces in labels like"start again"are handled correctly—they appear as plain text in the timestamp log without any adverse effects. The label is not used in file naming, identifier derivation, or path operations where spaces might cause issues.Likely an incorrect or invalid review comment.
d403855 to
5ae7224
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
.github/workflows/windows-e2e.yml(1 hunks).github/workflows/windows-qe-tpl.yml(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- .github/workflows/windows-qe-tpl.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
- GitHub Check: build-qe (linux, arm64)
- GitHub Check: build-qe (darwin, arm64)
- GitHub Check: build-qe (windows, amd64)
- GitHub Check: build-qe (darwin, amd64)
- GitHub Check: build-qe (linux, amd64)
- GitHub Check: build (macOS-14, 1.24)
- GitHub Check: build (macOS-13, 1.24)
- GitHub Check: build (windows-2022, 1.24)
- GitHub Check: build (ubuntu-latest, 1.24)
- GitHub Check: build (macOS-14, 1.24)
- GitHub Check: build (macOS-13, 1.24)
- GitHub Check: build (ubuntu-latest, 1.24)
- GitHub Check: verify-devcontainer
- GitHub Check: build (ubuntu-22.04, 1.24)
- GitHub Check: Run OKD bundle with crc (1.24)
- GitHub Check: Konflux kflux-prd-rh02 / crc-binary-on-pull-request
🔇 Additional comments (1)
.github/workflows/windows-e2e.yml (1)
12-14: Verify: Expanding workflow trigger to all events (not just PRs).The conditional logic was changed to remove the
github.event.workflow_run.event == 'pull_request'check. This means the workflow now runs on all workflow_run completion events, including direct pushes to branches and other non-PR events.Confirm this is intentional, as it may:
- Increase CI costs by running more frequently
- Execute in contexts where PR metadata may not be available
- Require validation that the template correctly handles non-PR scenarios
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
.github/workflows/windows-e2e.yml (1)
15-15: Regression: Workflow template still references personal fork instead of official repository.This is a regression of a previous critical issue. Line 15 references
lilyLuLiu/crc/, which breaks reproducibility and prevents other contributors from running this workflow without access to the personal fork.Apply this diff to reference the official repository:
- uses: lilyLuLiu/crc/.github/workflows/windows-qe-tpl.yml@main + uses: crc-org/crc/.github/workflows/windows-qe-tpl.yml@mainEnsure
windows-qe-tpl.ymlis committed tocrc-org/crcbefore merging this PR.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
.github/workflows/windows-e2e.yml(1 hunks).github/workflows/windows-qe-tpl.yml(3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)
- GitHub Check: build-qe (linux, arm64)
- GitHub Check: build-qe (darwin, amd64)
- GitHub Check: build-qe (windows, amd64)
- GitHub Check: build-qe (linux, amd64)
- GitHub Check: build (macOS-14, 1.24)
- GitHub Check: build (ubuntu-22.04, 1.24)
- GitHub Check: build-installer (windows-2022, 1.24)
- GitHub Check: build (ubuntu-latest, 1.24)
- GitHub Check: build (macOS-14, 1.24)
- GitHub Check: build (macOS-13, 1.24)
- GitHub Check: build (macOS-13, 1.24)
- GitHub Check: build (windows-2022, 1.24)
- GitHub Check: build (ubuntu-latest, 1.24)
- GitHub Check: Run OKD bundle with crc (1.24)
- GitHub Check: verify-devcontainer
- GitHub Check: Konflux kflux-prd-rh02 / crc-binary-on-pull-request
🔇 Additional comments (2)
.github/workflows/windows-qe-tpl.yml (2)
6-18: Workflow dispatch trigger is well-structured.The
workflow_dispatchinputs are clearly defined, properly typed, and mirror theworkflow_callinputs below. This enables flexible manual triggering for testing and debugging.
129-131: Remove debug commands (ls -lh,whoami) that add unnecessary output to CI logs.Lines 129–130 contain diagnostic commands that appear to be troubleshooting artifacts and should be removed. They serve no documented purpose in the workflow.
Line 131's
sudo chown runner:runner host username id_rsais necessary and correct—it ensures the runner user can read the connection details files created by the mapt container in the subsequent step ("Install CRC on host"), which reads these files via$(cat host)and$(cat username).
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
test/e2e/testsuite/testsuite.go (1)
267-279: Fix misleading error message.Line 273 reports "Failed to get finish getTimestamp" when recording the "start" timestamp, which will confuse debugging efforts.
Apply this diff:
if tag.Name == "@performance" { if err := monitor.Start(); err != nil { fmt.Printf("Failed to start monitor: %v\n", err) } err := getTimestamp("start") if err != nil { - fmt.Printf("Failed to get finish getTimestamp: %v\n", err) + fmt.Printf("Failed to record start timestamp: %v\n", err) } err = getMemoryData("Before start") if err != nil { fmt.Printf("Failed to get memory data: %v\n", err) } }
🧹 Nitpick comments (2)
test/e2e/testsuite/testsuite.go (1)
1357-1366: Suggest early return on os.Getwd error.While the current implementation will eventually return an error if
os.Getwd()fails (viaWriteToFilereturning an error), it's cleaner and more explicit to return immediately when the working directory cannot be determined.Apply this diff:
func getTimestamp(content string) error { data := fmt.Sprintf("[%s], %s\n", time.Now().Format("15:04:05"), content) wd, err := os.Getwd() if err != nil { - fmt.Printf("failed to get working directory: %v\n", err) + return fmt.Errorf("failed to get working directory: %v", err) } file := filepath.Join(wd, "../test-results/time-stamp.txt") return util.WriteToFile(data, file) }test/e2e/testsuite/performance.go (1)
109-122: Optional: Consider removing commented-out code.The commented memory collection code suggests it was evaluated but not needed (since
getMemoryDatahandles memory sampling at specific test points). Consider removing it to reduce clutter, or add a comment explaining why it's preserved.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
test/e2e/features/story_microshift.feature(2 hunks)test/e2e/features/story_openshift.feature(2 hunks)test/e2e/testsuite/performance.go(1 hunks)test/e2e/testsuite/testsuite.go(5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- test/e2e/features/story_openshift.feature
🧰 Additional context used
🧬 Code graph analysis (2)
test/e2e/testsuite/performance.go (1)
test/extended/util/fileops.go (1)
WriteToFile(93-108)
test/e2e/testsuite/testsuite.go (2)
test/e2e/testsuite/performance.go (1)
NewMonitor(23-27)test/extended/util/fileops.go (1)
WriteToFile(93-108)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
- GitHub Check: build (macOS-13, 1.24)
- GitHub Check: build (macOS-14, 1.24)
- GitHub Check: build (ubuntu-latest, 1.24)
- GitHub Check: Run OKD bundle with crc (1.24)
- GitHub Check: build-qe (darwin, amd64)
- GitHub Check: build-qe (linux, amd64)
- GitHub Check: build-qe (windows, amd64)
- GitHub Check: build (ubuntu-22.04, 1.24)
- GitHub Check: build (macOS-13, 1.24)
- GitHub Check: build-qe (linux, arm64)
- GitHub Check: verify-devcontainer
- GitHub Check: build (macOS-14, 1.24)
- GitHub Check: build-installer (windows-2022, 1.24)
- GitHub Check: build (ubuntu-latest, 1.24)
- GitHub Check: build-qe (darwin, arm64)
- GitHub Check: build (windows-2022, 1.24)
- GitHub Check: build (windows-2022, 1.24)
- GitHub Check: Konflux kflux-prd-rh02 / crc-binary-on-pull-request
🔇 Additional comments (6)
test/e2e/features/story_microshift.feature (1)
20-48: LGTM! Performance monitoring integration is well-structured.The addition of the
@performancetag and timestamp recording steps at key points (deployment, stop, start again) aligns well with the new background monitoring approach. The removal of inline CPU data checks in favor of continuous monitoring is a cleaner design.test/e2e/testsuite/testsuite.go (3)
171-171: LGTM! Appropriate sampling interval.The 1-second interval for CPU monitoring strikes a good balance between data granularity and overhead.
406-416: LGTM! Proper cleanup sequence.The After hook correctly records the finish timestamp, stops the monitor, and includes a reasonable grace period for the collector goroutine to complete.
598-599: LGTM! Step binding is correct.The timestamp recording step is properly bound to the
getTimestampfunction.test/e2e/testsuite/performance.go (2)
15-27: LGTM! Well-designed Monitor structure.The Monitor type includes appropriate synchronization primitives (mutex, wait group) and lifecycle management (context cancellation). The constructor correctly initializes only the interval, relying on zero values for other fields.
52-67: LGTM! Stop method is properly synchronized.The Stop method correctly uses mutex protection and ensures the collector goroutine has fully terminated before returning. While calling
wg.Wait()under the lock is somewhat unusual, it's safe here sincecollectLoopdoesn't acquire the mutex, and it prevents races between Stop and subsequent Start calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
test/e2e/testsuite/testsuite.go (2)
267-279: Fix misleading error message for start timestamp.Line 273 reports "Failed to get finish getTimestamp" when recording the "start" timestamp, making debugging confusing. This was flagged in previous reviews but not fully corrected.
Apply this diff:
err := getTimestamp("start") if err != nil { - fmt.Printf("Failed to get finish getTimestamp: %v\n", err) + fmt.Printf("Failed to record start timestamp: %v\n", err) }
1357-1366: Critical: Handle os.Getwd error properly to prevent invalid file paths.If
os.Getwd()fails at line 1361, the function prints an error but continues executing with an emptywdstring, resulting in an invalid file path at line 1364. This can cause silent failures or write data to unexpected locations.Apply this diff:
func getTimestamp(content string) error { data := fmt.Sprintf("[%s], %s\n", time.Now().Format("15:04:05"), content) wd, err := os.Getwd() if err != nil { - fmt.Printf("failed to get working directory: %v\n", err) + return fmt.Errorf("failed to get working directory: %v", err) } file := filepath.Join(wd, "../test-results/time-stamp.txt") return util.WriteToFile(data, file) }
🧹 Nitpick comments (3)
test/e2e/testsuite/performance.go (3)
44-44: Fix typo in comment."stary" should be "start".
- // stary goroutine + // start goroutine m.wg.Add(1)
76-76: Fix typo in comment."cancled" should be "cancelled".
- // 1. check Context whether be cancled + // 1. check Context whether be cancelled select {
52-67: Consider releasing mutex before waiting on WaitGroup.Lines 53-63: The current implementation calls
m.wg.Wait()while holding the mutex lock. While this is safe becausecollectLoopdoesn't acquire the mutex, releasing the lock before waiting is generally better practice and would allow concurrent calls to Stop() to fail-fast rather than block.func (m *Monitor) Stop() error { m.mu.Lock() - defer m.mu.Unlock() if !m.isRunning { + m.mu.Unlock() return fmt.Errorf("The collector is not running") } + + // Copy cancelFunc and update state while holding lock + cancelFunc := m.cancelFunc + m.isRunning = false + m.mu.Unlock() + + // Call cancel and wait without holding the lock - if m.cancelFunc != nil { - m.cancelFunc() + if cancelFunc != nil { + cancelFunc() } - m.isRunning = false m.wg.Wait() + fmt.Println("CPU collector has sent a stop signal") - // may need wait a while to stop return nil }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
test/e2e/features/story_microshift.feature(2 hunks)test/e2e/features/story_openshift.feature(2 hunks)test/e2e/testsuite/performance.go(1 hunks)test/e2e/testsuite/testsuite.go(6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- test/e2e/features/story_microshift.feature
🧰 Additional context used
🧬 Code graph analysis (2)
test/e2e/testsuite/performance.go (1)
test/extended/util/fileops.go (1)
WriteToFile(93-108)
test/e2e/testsuite/testsuite.go (2)
test/e2e/testsuite/performance.go (1)
NewMonitor(23-27)test/extended/util/fileops.go (1)
WriteToFile(93-108)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: build (windows-2022, 1.24)
- GitHub Check: build (macOS-14, 1.24)
- GitHub Check: build (macOS-13, 1.24)
- GitHub Check: Konflux kflux-prd-rh02 / crc-binary-on-pull-request
🔇 Additional comments (5)
test/e2e/features/story_openshift.feature (1)
12-41: LGTM! Timestamp-based monitoring approach is well-integrated.The addition of the
@performancetag and replacement of CPU data checks with timestamp recording steps aligns well with the new background Monitor approach. Memory data collection is appropriately retained at key scenario points.test/e2e/testsuite/testsuite.go (2)
171-171: LGTM! Monitor initialization is appropriate.Creating the monitor with a 1-second sampling interval provides good granularity for CPU usage tracking in e2e tests.
598-601: LGTM! Step binding correctly updated.The step binding change from CPU data collection to timestamp recording aligns with the new monitoring architecture.
test/e2e/testsuite/performance.go (2)
15-67: LGTM! Monitor lifecycle management is well-implemented.The mutex-protected state transitions, context cancellation, and WaitGroup coordination provide proper control over the background collector goroutine. Error handling for invalid state transitions (starting when running, stopping when not running) is appropriate.
69-123: LGTM! CPU data collection loop is robust.The collectLoop properly handles context cancellation, collection errors, and I/O errors. The use of
cpu.Percent(calcInterval, false)with the interval as the measurement duration eliminates the need for explicit sleep calls, and error recovery with 1-second retry delays prevents tight error loops.
|
@lilyLuLiu: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
| fmt.Printf("Collection has stopped. Wait for 5 seconds to confirm that the collection task will no longer output data\n") | ||
| time.Sleep(5 * time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if possible, better avoid any explicit (long) Sleep
maybe something like
while ! monitor.Finished():
sleep 0.1
or even, assume it will finish safely and just skip the sleep?
| fmt.Printf("Error: failed to get working directory: %v\n", err) | ||
| continue | ||
| } | ||
| file := filepath.Join(wd, "../test-results/cpu-consume.txt") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering about this hardcoded path. Won't new calls overwrite the old data?
or add a comment to say that this is safe, because the current directory is guaranteed to be clean
(and I'm surprized by the initial ../, I usually prefer when everything is stored inside the same directory (but it again depends on how . is controlled))
| calcInterval := m.interval | ||
|
|
||
| for { | ||
| // 1. check Context whether be cancled |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo
| m.cancelFunc = cancel | ||
| m.isRunning = true | ||
|
|
||
| // stary goroutine |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo
| /* | ||
| vMem, err := mem.VirtualMemory() | ||
| if err != nil { | ||
| fmt.Printf("Error: failed to collect memory data: %v", err) | ||
| continue | ||
| } | ||
| memoryused := vMem.Used / 1024 / 1024 | ||
| data := fmt.Sprintf("[%s], MemoryUsed(MB): %d\n" , | ||
| time.Now().Format("15:04:05"), memoryused) | ||
| wd, _ := os.Getwd() | ||
| file := filepath.Join(wd, "../test-results/memory-consume.txt") | ||
| util.WriteToFile(data, file) | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment should be removed, it's not the write place to measure the memory usage,
this is after the blocking call to cpu.Percent(calcInterval, false), and I think there's usually no need to collect the memory as often as the CPU usage
Description
Fixes: #N
Relates to: #N, PR #N, ...
Type of change
test, version modification, documentation, etc.)
Proposed changes
Testing
Contribution Checklist
Summary by CodeRabbit
Tests
Chores