ci: use ubuntu-arm on every PR/push, use macos at night #15314
rmuir
commented
Oct 9, 2025
- Gives us a 10 minute CI system
- Expand test coverage to "arm on linux", which is a real use-case
- Replaces slowest runner (20 minutes) with fastest runner (5 minutes)
- Still tests with macos-arm at night
- Policeman Jenkins still tests macos many times a day (and more reliably)
This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.
I don't think Uwe's mac vm is equivalent to these gh runners. I'd like to believe gh mac runners are actually on apple hardware, while Uwe's is a hackintosh. Not that we have discovered anything interesting so far. I say this because I distinctly remember those mac runners on gh being much faster than anything else. It could have been the number of available cores, or maybe something has changed externally.
@@ -0,0 +1,56 @@
name: "Run checks on MacOS: all modules"
I think we can merge this back into the single run-check-all and have a conditional parameter stating which OSs you want to run on? The point of having this single workflow was to decrease the maintenance of those gradle parameters and checks. I don't mind pulling it out for the time being until the problem with macs is sorted out.
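One way the "single workflow with a conditional OS parameter" idea could look is sketched below. This is only an illustration, not the workflow from this PR: the input name `os_list`, its default, and the Gradle invocation are all assumptions.

```yaml
# Hypothetical reusable workflow; names and defaults are assumptions.
name: "Run checks: all modules (reusable sketch)"

on:
  workflow_call:
    inputs:
      os_list:
        description: "JSON array of runner labels to run the checks on"
        type: string
        required: false
        default: '["ubuntu-24.04-arm"]'

jobs:
  checks:
    strategy:
      matrix:
        # Expand the JSON string into one job per requested runner label.
        os: ${{ fromJSON(inputs.os_list) }}
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      # Placeholder command; the real gradle parameters live in the existing workflow.
      - run: ./gradlew check
```

A caller workflow would then pass, for example, `os_list: '["macos-latest"]'` from a scheduled job, so the gradle parameters stay in one place.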
Yeah, I feel bad copy-pasting the job, I just didn't want strange behavior since I'm adding a cron here. We can probably merge into a single workflow, but it would mean dragging the mac hacks back in too :(
@dweiss does this work better?
I agree, which is why I don't just drop the job, but instead try to reduce the frequency of it. These mac runners are a struggle and also have a high cost. I don't want to lose test coverage, but a nightly job would really take the pressure off, as opposed to invoking these on every PR/push. The ubuntu one gives us more realistic arm testing (IMO) for search engines running on server-side ARM, it is efficient, and that's a real use-case, e.g. deploying to the cloud on graviton CPUs and the like. It is a good one for every PR.
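As a rough sketch of the split described here (macOS nightly, Linux ARM on every PR/push), the nightly side could be triggered as below. The cron time, the `macos-latest` label, and the Gradle command are assumptions, not the contents of the actual workflow file.

```yaml
# Hypothetical nightly macOS workflow; schedule and commands are assumptions.
name: "Run checks on macOS (nightly sketch)"

on:
  schedule:
    # Once a day at 03:00 UTC (arbitrary choice for illustration).
    - cron: "0 3 * * *"
  # Allow manual runs when someone wants to check macOS before the next night.
  workflow_dispatch:

jobs:
  checks:
    runs-on: macos-latest
    steps:
      - uses: actions/checkout@v4
      # Placeholder command standing in for the real check invocation.
      - run: ./gradlew check
```

The per-PR workflow would keep its `pull_request` and `push` triggers and run on `ubuntu-24.04-arm` instead.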
I'll go as far as to say, I think we should change the "Run checks: all modules / checks without tests" to go on ubuntu-24.04-arm. The runner is simply faster than ubuntu-latest for our use-case.
I tried running the linter on ARM with a cached build: it saves a few seconds versus Intel for the linting job (which takes around 3-4 minutes), so it is not worth the trouble. Nice that all linters work correctly on ARM, though!
fancy - I like this.
These mac runners are sometimes quite fast, even approaching 6 minutes (latest). But then sometimes it takes 20 minutes, like the two builds it ran on the PR before it was merged. The variation is crazy unpredictable. Maybe some kind of throttling or old hardware, I have not fully debugged it, nor am I sure I want to. This PR is trying to make the problem less annoying.
We don't see the backstage of how it's organized. It may be that you're hitting a runner on a heavily loaded machine and the vm scheduling isn't fair, or that it's an older piece of hardware. It gets really complicated to trace it back to the root cause these days.
When debugging I just dumped sysctls and it shows a "virtualized cpu" and a list of features, nothing exciting. You can find some other complaints on github around the problem, no clear solution. My gut is that it is probably just not supported as well as other operating system choices, maybe less mature virtualization around it, maybe a struggle to meet load demands too. Also it's a quirky OS that does strange things.
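For anyone who wants to repeat that kind of check, a debugging job along these lines would print what the runner reports about itself. This is a guess at the approach, not the step that was actually used; the job name and the selection of sysctl keys are assumptions.

```yaml
# Hypothetical one-off debugging workflow; not part of this PR.
name: "Dump macOS runner details (sketch)"

on:
  workflow_dispatch:

jobs:
  debug-runner:
    runs-on: macos-latest
    steps:
      - name: Dump CPU and memory details
        run: |
          # CPU model string and basic core/memory counts.
          sysctl -n machdep.cpu.brand_string
          sysctl hw.ncpu hw.memsize
          # Everything else the kernel reports about the CPU; never fail the job on this.
          sysctl -a | grep -i machdep.cpu || true
```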
@dweiss that's awesome: it fills in the missing context in the CI build logs. Thank you