
Conversation

@vanvoorden

@vanvoorden vanvoorden commented Dec 29, 2025

There seems to be an accidental quadratic happening in the MinMax methods.1

We document our worst-case performance as O(n log n). That is our runtime if we sort the entire collection of values, and our current threshold for sorting everything is ten percent of the total count.

If we do not sort the entire collection, our runtime is O(k log k + nk). The problem is the nk term: when k is ten percent of n, that term is n^2/10, which still grows quadratically with large n, so we are not meeting our promise of O(n log n) worst case. For example, with n = 1,000,000 and k = 100,000, nk is 10^11 operations while n log n is only about 2×10^7.

One option is to change the logic that computes our threshold: instead of a threshold that scales linearly with n, compute one that scales logarithmically with n. With k capped at log n, the nk term is bounded by n log n, which meets our promised worst case.
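As a rough sketch of the two policies side by side (the function names here are made up for illustration):

import Foundation

// Current policy: fall back to a full sort once the prefix exceeds 10% of the count.
func linearThreshold(count: Int) -> Int {
  count / 10
}

// Proposed policy: fall back once the prefix exceeds the floored binary logarithm.
func logarithmicThreshold(count: Int) -> Int {
  count > 1 ? Int(log2(Double(count))) : 0
}

// For one million elements the linear threshold admits prefixes of up to
// 100,000 elements, which is what makes the nk term quadratic; the
// logarithmic threshold caps the prefix at 19 elements.
print(linearThreshold(count: 1_000_000))       // 100000
print(logarithmicThreshold(count: 1_000_000))  // 19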

The current implementation presented in this diff uses the underscored _binaryLogarithm method from the Standard Library. If we wanted to avoid the underscored API, we could try something similar to what RandomSample does and switch between methods from Darwin and Numerics.2 Or we could just write our own here in this file.
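A hand-rolled version might look something like this (the extension name is made up for illustration, and it assumes positive counts):

// A minimal sketch of a local replacement for _binaryLogarithm().
extension Int {
  var flooredBinaryLogarithm: Int {
    precondition(self > 0, "the binary logarithm is undefined for non-positive values")
    return Int.bitWidth - 1 - leadingZeroBitCount
  }
}

let count = 1_000_000
print(count.flooredBinaryLogarithm)  // 19, because 1_000_000 needs 20 bits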

Here are some benchmarks sampled locally with this new threshold:

import Algorithms
import Foundation

// Reports the average wall-clock duration of `body` over `cycles` iterations,
// keeping `setUp` and `tearDown` outside the timed region and validating each
// result with `condition`.
func Measure<T>(
  _ label: String,
  _ cycles: Int,
  setUp: () -> () = { },
  body: () -> (T),
  condition: (T) -> Bool = { _ in true },
  tearDown: () -> () = { },
) {
  var duration = Duration.nanoseconds(0)
  
  for _ in (1 ... cycles) {
    setUp()
    let clock = ContinuousClock()
    let instant = clock.now
    let result = body()
    duration += instant.duration(to: clock.now)
    precondition(condition(result))
    tearDown()
  }
  
  duration /= cycles
  
  let microseconds = duration.formatted(
    .units(
      allowed: [.microseconds],
      fractionalPart: .show(length: 3)
    )
  )
  print("\(label): \(microseconds)")
}

// Compares sorting the whole shuffled array against `min(count:)`, checking
// that both find the element at index k of the sorted order, where k is one
// less than the floored binary logarithm of n.
func Benchmark(
  n: Int,
  cycles: Int
) {
  let a = Array(1 ... n)
  let k = a.count._binaryLogarithm() - 1
  do {
    var temp = a
    Measure("Sort", cycles) {
      temp.shuffle()
    } body: {
      temp.sort()
      return temp[k]
    } condition: { result in
      result == a[k]
    }
  }
  do {
    var temp = a
    Measure("Min", cycles) {
      temp.shuffle()
    } body: {
      return temp.min(count: k + 1)[k]
    } condition: { result in
      result == a[k]
    }
  }
}

func main() {
  let sizes = [100, 1_000, 10_000, 100_000, 1_000_000, 10_000_000]
  for size in sizes {
    print("--- Size: \(size) ---")
    Benchmark(
      n: size,
      cycles: 100
    )
    print()
  }
}

main()

//
//  swift run -c release
//
//  --- Size: 100 ---
//  Sort: 1.907 μs
//  Min: 0.594 μs
//
//  --- Size: 1000 ---
//  Sort: 26.847 μs
//  Min: 2.236 μs
//
//  --- Size: 10000 ---
//  Sort: 375.711 μs
//  Min: 12.840 μs
//
//  --- Size: 100000 ---
//  Sort: 4,738.126 μs
//  Min: 109.005 μs
//
//  --- Size: 1000000 ---
//  Sort: 56,317.570 μs
//  Min: 1,045.142 μs
//
//  --- Size: 10000000 ---
//  Sort: 682,109.937 μs
//  Min: 10,462.117 μs
//

Checklist

  • I've added at least one test that validates that my change is working, if appropriate
  • I've followed the code style of the rest of the project
  • I've read the Contribution Guidelines
  • I've updated the documentation if necessary

Footnotes

  1. https://forums.swift.org/t/benchmarks-selection-linear-time-order-statistics/83852

  2. https://github.com/apple/swift-algorithms/blob/1.2.1/Sources/Algorithms/RandomSample.swift#L12-L41

  // If we're attempting to prefix more than log n of the collection, it's
  // faster to sort everything.
- guard prefixCount < (self.count / 10) else {
+ guard prefixCount <= self.count._binaryLogarithm() else {
Contributor

@xwu xwu Dec 30, 2025


You don't need underscored API for this:

Suggested change
- guard prefixCount <= self.count._binaryLogarithm() else {
+ guard prefixCount <= (Int.bitWidth - self.count.leadingZeroBitCount) else {

@xwu
Contributor

xwu commented Dec 30, 2025

The current implementation presented in this diff uses the underscored _binaryLogarithm method from the Standard Library.

For a fixed-width type like Int, the floor of log2(x) is the bit width minus the leading zero bit count and can be written out as such. There is no need to call an underscored API that does the same thing.
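For a concrete illustration of the quantity in the suggested change (assuming a 64-bit Int, with my own example value):

let count = 1_000_000                            // needs 20 significant bits
print(count.leadingZeroBitCount)                 // 44 on a 64-bit platform
print(Int.bitWidth - count.leadingZeroBitCount)  // 20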

  [Soroush Khanlou's research on this matter](https://khanlou.com/2018/12/analyzing-complexity/).
  The total complexity is `O(k log k + nk)`, which will result in a runtime close
- to `O(n)` if *k* is a small amount. If *k* is a large amount (more than 10% of
+ to `O(n)` if *k* is a small amount. If *k* is a large amount (more than log n of
Contributor


Nit: Your implementation considers anything more than log2(n) to be “large” (not ln or log10); log2 isn’t one of the bases abbreviated “log”.
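For a sense of how much the base matters (illustrative arithmetic only, not from the review):

import Foundation

let n = 1_000_000.0
print(log2(n))   // ≈ 19.9, the base this diff uses
print(log(n))    // ≈ 13.8, natural logarithm
print(log10(n))  // ≈ 6.0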

Author

@vanvoorden vanvoorden Dec 30, 2025


@xwu Ahh… that's a good point!

TBH… I'm also open to hearing better ideas for how to compute this new threshold. Taking the binary logarithm guarantees our O(n log n) worst-case complexity but might also be more conservative than necessary. For an Array of one million elements this only gives us the first 19 elements before we sort the whole Array.
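For a sense of how slowly that threshold grows across the benchmark sizes above (illustrative only):

import Foundation

// The floored binary logarithm barely moves while n grows by five orders of magnitude.
for n in [100, 1_000, 10_000, 100_000, 1_000_000, 10_000_000] {
  print(n, Int(log2(Double(n))))
}
// 100 6
// 1000 9
// 10000 13
// 100000 16
// 1000000 19
// 10000000 23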

Adding one to the result of the binary logarithm gives us approximately log1.41 for the first 20 elements. We can keep going down that road and adding more… but I don't have a great answer for how to analytically solve for the optimal threshold: one that guarantees our O(n log n) worst-case complexity but keeps us on the "fast path" as much as possible.

@natecook1000
Member

From those benchmarks it seems like there’s quite a bit of headroom to raise the threshold — are you able to do a bit of experimentation to find where that line actually is?

@vanvoorden
Author

From those benchmarks it seems like there’s quite a bit of headroom to raise the threshold — are you able to do a bit of experimentation to find where that line actually is?

https://gist.github.com/vanvoorden/57a2a9768713907677f9fbb258e2c382

@natecook1000 Here are some experiments running different log bases against different collection sizes.

It looks like log1.25 performs better than sorting across the board.

It looks like log1.000244140625 performs worse than sorting across the board.

Picking something in between like log1.00390625 performs better than sorting on collections of 100K and larger and worse than sorting on collections of 10K and smaller.

BTW these are all collections of Int values. Results could be different for Element types with much more expensive comparisons, for example because their memory footprint is much larger.

I'm not sure how else to analytically solve for the "optimal" threshold… any more ideas about that? Is this sort of thing usually found through trial and error and inductive reasoning or more like analytic deductive reasoning?
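For reference, the thresholds in those experiments can be sketched like this (illustrative only, not the code from the gist):

import Foundation

// Threshold as a logarithm with an arbitrary base, as in the experiments above.
func threshold(count: Int, base: Double) -> Int {
  guard count > 1 else { return 0 }
  return Int(log(Double(count)) / log(base))
}

print(threshold(count: 1_000_000, base: 2.0))   // 19
print(threshold(count: 1_000_000, base: 1.25))  // 61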

@mdznr mdznr changed the title from "solve for accidental quadratic performance from mix max methods" to "solve for accidental quadratic performance from min max methods" on Jan 8, 2026