Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/tests/sunshine: fix test #342813

Closed
wants to merge 1 commit into from

Conversation

ghpzin
Copy link
Contributor

@ghpzin ghpzin commented Sep 18, 2024

  • change text in wait_for_console_text to prevent sending curl with pairing response before moonlight can accept it
  • remove "close moonlight" step as next window stays on top in fullscreen, previous window with pairing result does not interfere with test. Otherwise it needs OCR or some other way to only do send_key("kp_enter") after "Pairing complete" window is visible.
  • add different ways to increase chances of successfull OCR: hide icewm panel, gxmessage window takes full screen without titlebar or buttons, black background, white foreground, consolas font,"ABC" text

Description of changes

Fixes nixosTests.sunshine timing out after 1 hour:
https://hydra.nixos.org/job/nixos/trunk-combined/nixos.tests.sunshine.x86_64-linux
https://hydra.nixos.org/build/272715035
Error log:

(finished: must succeed: curl --insecure -u sunshine:sunshine -d '{"pin": "1234"}' https://localhost:47990/api/pin, in 0.08 seconds)
moonlight: sending key 'kp_enter'
(finished: sending key 'kp_enter', in 0.01 seconds)
moonlight: waiting for hello world to appear on screen
...
moonlight # [ 2497.757190] hrtimer: interrupt took 6012496 ns
timeout reached; test terminating...

curl with pin may execute before moonlight starts accepting it, so there is never a successful pairing.
Changed wait_for_console_text so that does not happen, there is another Executing message that is shown before pairing begins, but it does not have pair word in it:

moonlight # 00:00:05 - Qt Info: Executing request: "http://sunshine:47989/serverinfo?uniqueid=0123456789ABCDEF&uuid=6604d450bc5a442c8d6a2c863d94487e"

As there is a decent control over what is shown to be recognized with OCR added things to make it less flaky (removed other text on screen, changed colors, font and text).

Listed test maintainer:
@devusb

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.11 Release Notes (or backporting 23.11 and 24.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@github-actions github-actions bot added the 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS label Sep 18, 2024
@ghpzin
Copy link
Contributor Author

ghpzin commented Sep 18, 2024

@ofborg test sunshine

@ghpzin ghpzin marked this pull request as draft September 18, 2024 14:39
@ghpzin
Copy link
Contributor Author

ghpzin commented Sep 18, 2024

Something is still wrong with pairing even after this change:
https://logs.ofborg.org/?attempt_id=c883962c-ad3b-4107-9919-089a2d72670a&key=nixos%2Fnixpkgs.342813

(finished: must succeed: curl --fail --insecure -u sunshine:sunshine -d '{"pin": "1234"}' https://localhost:47990/api/pin, in 0.16 seconds)
moonlight: waiting for Executing request.*phrase=pairchallenge to appear on console
moonlight # 00:00:04 - Qt Critical: Incorrect PIN

@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux labels Sep 18, 2024
@devusb
Copy link
Contributor

devusb commented Sep 22, 2024

@ghpzin does this work for you locally as-is? for me to get this to complete, I had to add one additional check to try to ensure that the screen sharing session had begun before looking for ABC -- I did the below but it's probably not the only choice.

@@ -68,6 +68,7 @@ import ./make-test-python.nix ({ pkgs, lib, ... }: {
 
     # connect to sunshine from moonlight and look for the words
     moonlight.execute("moonlight --video-decoder software stream sunshine 'Desktop' >&2 & disown")
+    moonlight.wait_for_console_text("Dropping window event during flush")
     moonlight.wait_for_text("ABC")
   '';
 })

If I add this the test completes for me in ~25 seconds -- if not, it generally still completes, but takes 5 minutes.

Even with this change I've still gotten Incorrect PIN at least once though, but it's not consistent -- I can run it 10 times and have no failures. Wonder if it's a timing issue -- maybe adding a sleep in between moonlight.wait_for_console_text("Executing request.*pair") and sunshine.succeed("curl --fail --insecure -u sunshine:sunshine -d '{\"pin\": \"1234\"}' https://localhost:47990/api/pin") would help? Wish the failure was reproducible though.

Thanks for taking a look at this -- hadn't caught the failure yet.

@ghpzin
Copy link
Contributor Author

ghpzin commented Sep 22, 2024

It does complete, but slowly for me too.
Adding wait_for_console_text to only start doing OCR after window is visible might be a good idea. Will add it in a bit, thank you.
I assume without it just 1 iteration of tesseract with no window visible might block for minutes (my laptop is not that beefy, so I do not have a point to compare how slow/fast it is).


Still can't track down why Incorrect PIN happens either.
Iirc already tried with sunshine.sleep(10) between commands and got that error once.
Will try adding even more time, sleep for both machines or in different places.

I have run test locally with some modification (to fail early) in a loop 132 times and 9 of them had this error, did not manage to find anything in logs indicating why it happens so far.
Just that it seems to come from moonlight-qt here:
https://github.com/moonlight-stream/moonlight-qt/blob/17a25484805ef927ad7c1b2906f885ee26eda928/app/backend/nvpairingmanager.cpp#L319
Might patch it to add some debug prints if nothing else helps.

Thought maybe curl somehow returns something in response with error, but still responds with non-error status-code and current test does not show what it responds with.
But curl response looks the same in both cases - {"status":"true"}.

Error:

(finished: waiting for Executing request.*pair to appear on console, in 15.96 seconds)
sunshine #   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
sunshine #                                  Dload  Upload   Total   Spent    Left  Speed
moonlight # 00:00:14 - Qt Info: Executing request: "http://sunshine:47989/pair?uniqueid=0123456789ABCDEF&u
uid=bdcc558d4d0549769b9c5216460dc66b&devicename=roth&updateState=1&clientchallenge=a04b1311112d3e09dbe603d
6a33b8f65"
sunshine #   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0100    40  100
 25  100    15    357    214 --:--:-- --:--:-- --:--:--   597
sunshine: (0, '{\n    "status": "true"\n}\n')
moonlight # 00:00:14 - Qt Info: Executing request: "http://sunshine:47989/pair?uniqueid=0123456789ABCDEF&u
uid=a3e6fba885cf4a5eb967d34abe53df2f&devicename=roth&updateState=1&serverchallengeresp=eb6ae13ba0a6240f839
3b1083aa35bb0d71ef9cda764e6eb752e863e5698d40d"
moonlight # 00:00:14 - Qt Critical: Incorrect PIN
moonlight # 00:00:14 - Qt Info: Executing request: "http://sunshine:47989/unpair?uniqueid=0123456789ABCDEF&uuid=90b2b839342140a2b13974cfe2d54752"

No error:

(finished: waiting for Executing request.*pair to appear on console, in 22.86 seconds)
sunshine #   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
sunshine #                                  Dload  Upload   Total   Spent    Left  Speed
moonlight # 00:00:20 - Qt Info: Executing request: "http://sunshine:47989/pair?uniqueid=0123456789ABCDEF&u
uid=936d787ec7f343babc111c98f1858099&devicename=roth&updateState=1&clientchallenge=e95483702fe65e2d9c396f2
5af921874"
sunshine #   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0100    40  100
 25  100    15    329    197 --:--:-- --:--:-- --:--:--   547
sunshine: (0, '{\n    "status": "true"\n}\n')
moonlight # 00:00:20 - Qt Info: Executing request: "http://sunshine:47989/pair?uniqueid=0123456789ABCDEF&u
uid=6d6306b9c6af412193dc50a6f320d264&devicename=roth&updateState=1&serverchallengeresp=b88e4fdde15bf3c6f3e
f7795f25e71be5f64e793313f21d32a8332eb57d0b649"
moonlight # 00:00:20 - Qt Info: Executing request: "http://sunshine:47989/pair?uniqueid=0123456789ABCDEF&u
uid=6a986f25918c4d869186e3c912982290&devicename=roth&updateState=1&clientpairingsecret=4cad44d6acf1c7006ce
950b28e52312916ed3737f1c8411b6eabde9bb7fdf4c513245a773fcf746dc82701acb0ae706871e752fef4db9c7d08635bb5f930d
5c531e6f19b9228d729ed6008722e1c22b78ccdefe7cb7be66a398b39c601877cc6a5c2c78b9b84603680de9b091f1dc7fc3de960b
fc7549fa38487a753fd9b839dc7516b07c84100d3e4bd31afa01c48d72bff08db9266cc800fc02507c275774e1c5e607fed0615d71
e2b567ca31525f83056734df611d6cdb00207e414f68ba27af533bddbfc41bd19a23c09b11ebcfaffab11d034bf5fc23cf9e198e88
4ae31715848ff38f8e0d982dee20e73486a49c079bba3219a9bb6fd8008441daf0f2b840b1de7ee7f163d2420b68d84d07ea9"
moonlight # 00:00:20 - Qt Info: Executing request: "https://sunshine:47984/pair?uniqueid=0123456789ABCDEF&uuid=1aae875fd9284534b9a9d252c2d7dbb7&devicename=roth&updateState=1&phrase=pairchallenge"

There might be something in full curl -v output too, I might try before patching moonlight-qt


PS also modifying test to catch that error is quite difficult, because seemingly adding timeout in wait_for_console_text breaks it (it never fires even if message appears in logs).

- change text in `wait_for_console_text` to prevent sending curl with
pairing response before moonlight can accept it
- remove "close moonlight" step as next window stays on top in
fullscreen, previous window with pairing result does not interfere
with test. Otherwise it needs OCR or some other way to only do
`send_key("kp_enter")` after "Pairing complete" window is visible.
- add different ways to increase chances of successfull OCR: hide icewm
panel, gxmessage window takes full screen without titlebar or buttons,
black background, white foreground, consolas font,"ABC" text
- add `wait_for_console_text` to only run OCR after window is visible
@ghpzin ghpzin force-pushed the nixosTests.sunshine/fix branch from ae958f5 to 7fae736 Compare September 22, 2024 21:20
@wegank wegank added the 2.status: merge conflict This PR has merge conflicts with the target branch label Dec 10, 2024
@devusb devusb mentioned this pull request Dec 17, 2024
13 tasks
@devusb
Copy link
Contributor

devusb commented Dec 17, 2024

@ghpzin I cherry-picked and rebased this commit into #364438 as it at least makes the tests mostly work -- still haven't been able to resolve the occasionally stuck problem, but figure that's better than nothing. let me know if you have any issues or anything you'd rather me do differently here.

@ghpzin
Copy link
Contributor Author

ghpzin commented Dec 17, 2024

Thank you, if it reduces flakiness it is all good.
Regarding pin issue I wanted to return to trying to diagnose it, but forgot about it half-way without any more clues as to the reason or reliable reproducer (still fails around 1/10 runs with that error).

@ghpzin ghpzin closed this Dec 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.status: merge conflict This PR has merge conflicts with the target branch 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants