Use PkgBenchmark to detect performance regressions for heaps #531
The Travis CI benchmark failure is expected for this change, because it attempts to re-run the benchmarks against a master branch that does not yet have PkgBenchmark support.
I have two more changes in the pipeline that I'll submit PRs for once this one is accepted; if all these changes were combined into a single PR, the performance differences would be obscured. milesfrain@b9ffc1b expands benchmarking coverage in anticipation of milesfrain#12, which tackles the following:
Updating the docs is still on the todo list, but I figured it would be a good idea to discuss the API change and deprecation first. The benchmarking results show improvements in many areas. A highlight is a 2x gain in both execution time and memory for
Full logs are available here, and you can reproduce on your own machine with:
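For anyone unfamiliar with PkgBenchmark, a local comparison of this kind looks roughly like the sketch below (assuming the suite lives in `benchmark/benchmarks.jl` per PkgBenchmark's convention; the output file name is illustrative, not the exact command from this PR):

```julia
# Sketch: compare the current checkout against master with PkgBenchmark.
# Assumes benchmark/benchmarks.jl defines a `SUITE` BenchmarkGroup.
using PkgBenchmark
import DataStructures

# Run the suite on the working copy and on master, then judge each
# benchmark as an improvement, regression, or invariant.
judgement = judge(DataStructures, "master")

# Write a human-readable summary (illustrative file name).
export_markdown("benchmark_judgement.md", judgement)
```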
An area for follow-up investigation is why these two results are not identical.
For reference, here's the regression if
@oxinabox What are the next steps for merging this?
Let's merge it; we can always fix it more later if, e.g., the benchmarks break too often and we want them not to block merging.
thanks! |
This compares benchmark results of future PRs against master (just for heaps at this point). Note that this new benchmarking feature will not be fully operational until merged. An example of what to expect is in my development branch.
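To give a concrete picture of the shape of such a suite, here is a minimal sketch in PkgBenchmark's convention (the group and key names and input sizes are illustrative, not the ones in the development branch):

```julia
# benchmark/benchmarks.jl -- minimal sketch of a heap benchmark suite.
# Group/key names and input sizes are illustrative only.
using BenchmarkTools
using DataStructures

const SUITE = BenchmarkGroup()
SUITE["heap"] = BenchmarkGroup()

# Building a heap from random input.
SUITE["heap"]["make"] =
    @benchmarkable BinaryMinHeap(xs) setup = (xs = rand(Int, 10_000))

# Pushing everything in, then popping everything out.
SUITE["heap"]["push-pop"] = @benchmarkable begin
    for x in xs
        push!(h, x)
    end
    while !isempty(h)
        pop!(h)
    end
end setup = (h = BinaryMinHeap{Int}(); xs = rand(Int, 10_000))
```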
The results section of the Travis CI log is where to look for regressions (excerpt below). There are no regressions in this case of updating the `readme`. An additional enhancement would be to automatically post a comment with this information to the PR (as is done with code coverage). Perhaps also mention the number of non-overlapping baseline and target cases.

A current limitation is that the `baseline` for comparison must be the repo's `master` branch, which may lead to unexpected results for PRs targeting feature branches. I'm not sure whether this is a common enough use case to be concerned about.

This feature is heavily inspired by work done by @tkf and @ericphanson, and I'm wondering how we should handle licensing of the benchmarking files lifted from Transducers.jl.
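For what it's worth, the "block the build on regression" behavior can be sketched as a small CI helper like the one below. This is a hypothetical script, not the one in this PR, and the `PkgBenchmark.benchmarkgroup` accessor on a judgement is my assumption about the API:

```julia
# benchmark/ci_judge.jl -- hypothetical CI helper, not the exact script in
# this PR: judge HEAD against master and fail the build on any regression.
using PkgBenchmark
using BenchmarkTools: leaves, isregression
import DataStructures

judgement = judge(DataStructures, "master")
export_markdown(stdout, judgement)  # the "results" excerpt in the Travis log

# Assumption: PkgBenchmark.benchmarkgroup extracts the judged group.
group = PkgBenchmark.benchmarkgroup(judgement)
regressed = [key for (key, j) in leaves(group) if isregression(j)]
isempty(regressed) || error("Benchmark regressions detected: ", regressed)
```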
We should also follow through on splitting up DataStructures.jl as proposed in #310. This becomes more important with increased automated testing, since we shouldn't have to re-run tests for every functionally independent section when only one of them changes. I could work on updating Heaps.jl and bring regression testing there instead.