Summary
Follow-up from the review discussion on #4892. The state-change goroutine in pkg/driver/wsl2/vm_windows.go has three related robustness issues that remain after the hot-loop fix was merged.
Issues
1. Cleanup runs with a canceled context
Both stopVM (vm_windows.go:59) and getWslStatus (vm_windows.go:208) pass ctx into executil.RunUTF16leCommand via WithContext(ctx). By the time the <-ctx.Done() arm runs, that context is already canceled, so the wsl.exe --list --verbose and wsl.exe --terminate invocations will likely fail immediately with context canceled.
The cleanup we kept the goroutine alive for may not actually execute. Using context.WithoutCancel(ctx) (or context.Background() with a fresh timeout) for the cleanup commands would be more robust.
2. getWslStatus errors are silently swallowed
The condition err == nil && status == StatusRunning means any error (including a context canceled error from issue 1) causes the terminate path to be skipped entirely. At minimum, a logrus.Warnf on the error would aid debugging.
3. _ = stopVM(...) discards the terminate error without logging
The return value of stopVM is discarded with _ and no log message is produced if the terminate fails. Adding a logrus.WithError(err).Warn(...) would match the logging style used elsewhere in the codebase.
Suggested fix
- Use
context.WithoutCancel(ctx) or context.Background() with a timeout for cleanup commands in the <-ctx.Done() path
- Log
getWslStatus errors at warn level instead of silently skipping
- Log
stopVM errors at warn level instead of discarding
cc @jandubois (per review feedback on #4892)
Summary
Follow-up from the review discussion on #4892. The state-change goroutine in
pkg/driver/wsl2/vm_windows.gohas three related robustness issues that remain after the hot-loop fix was merged.Issues
1. Cleanup runs with a canceled context
Both
stopVM(vm_windows.go:59) andgetWslStatus(vm_windows.go:208) passctxintoexecutil.RunUTF16leCommandviaWithContext(ctx). By the time the<-ctx.Done()arm runs, that context is already canceled, so thewsl.exe --list --verboseandwsl.exe --terminateinvocations will likely fail immediately withcontext canceled.The cleanup we kept the goroutine alive for may not actually execute. Using
context.WithoutCancel(ctx)(orcontext.Background()with a fresh timeout) for the cleanup commands would be more robust.2.
getWslStatuserrors are silently swallowedThe condition
err == nil && status == StatusRunningmeans any error (including acontext cancelederror from issue 1) causes the terminate path to be skipped entirely. At minimum, alogrus.Warnfon the error would aid debugging.3.
_ = stopVM(...)discards the terminate error without loggingThe return value of
stopVMis discarded with_and no log message is produced if the terminate fails. Adding alogrus.WithError(err).Warn(...)would match the logging style used elsewhere in the codebase.Suggested fix
context.WithoutCancel(ctx)orcontext.Background()with a timeout for cleanup commands in the<-ctx.Done()pathgetWslStatuserrors at warn level instead of silently skippingstopVMerrors at warn level instead of discardingcc @jandubois (per review feedback on #4892)