Quick sort: Replace recursion with custom stack, small improvements #84

Open
wants to merge 1 commit into
base: main
Commits on Dec 12, 2022

  1. Quick sort: Replace recursion with custom stack, small improvements

    Instead of recursing, maintain a custom stack holding the low/high
    bounds of the next region to sort.
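To make the idea concrete, here is a minimal sketch of a quick sort driven by an explicit stack of low/high bounds. It is not the code from this PR: `region_t`, `partition`, `quick_sort_iter`, and `STACK_CAP` are hypothetical names, the element type is fixed to `int64_t`, and the pivot is simply the middle element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; this is an illustration, not the PR's code. */
#define STACK_CAP 64 /* assumed ample: deferred regions grow like log2(n) */

typedef struct {
    size_t lo; /* first index of a pending region (inclusive) */
    size_t hi; /* last index of a pending region (inclusive) */
} region_t;

/* Hoare-style partition around the value of the middle element.
 * Requires lo < hi. Returns j with lo <= j < hi such that every element
 * in [lo, j] is <= every element in [j + 1, hi]. */
static size_t partition(int64_t *a, size_t lo, size_t hi) {
    int64_t pivot = a[lo + (hi - lo) / 2];
    size_t i = lo, j = hi;
    for (;;) {
        while (a[i] < pivot) i++;
        while (a[j] > pivot) j--;
        if (i >= j) return j;
        int64_t t = a[i]; a[i] = a[j]; a[j] = t;
        i++;
        j--;
    }
}

void quick_sort_iter(int64_t *a, size_t n) {
    region_t stack[STACK_CAP];
    size_t top = 0;
    if (n < 2) return;
    stack[top++] = (region_t){0, n - 1};
    while (top > 0) {
        region_t r = stack[--top];
        while (r.lo < r.hi) {
            size_t p = partition(a, r.lo, r.hi);
            size_t left_len = p - r.lo + 1; /* size of [lo, p] */
            size_t right_len = r.hi - p;    /* size of [p + 1, hi] */
            assert(top < STACK_CAP);
            /* Defer the larger side and keep working on the smaller one,
             * so the region in hand at least halves on every push. */
            if (left_len < right_len) {
                stack[top++] = (region_t){p + 1, r.hi};
                r.hi = p;
            } else {
                stack[top++] = (region_t){r.lo, p};
                r.lo = p + 1;
            }
        }
    }
}
```

Deferring the larger side is one common design choice: it bounds the number of pending regions logarithmically in the input size, which is why a small fixed-size array can stand in for the call stack.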
    
    Also tune some of the logic a bit.
        - Simpler (and faster) median + setup for partition
        - Remove some unnecessary branches in hot control flow.
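The commit message does not say how the median and partition setup were simplified. As an illustration only, one widely used approach is median-of-three selection that orders the three samples in place, sketched below with hypothetical names (`order2`, `median3_setup`) and a fixed `int64_t` element type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: make *a <= *b by swapping if needed. */
static void order2(int64_t *a, int64_t *b) {
    if (*b < *a) {
        int64_t t = *a;
        *a = *b;
        *b = t;
    }
}

/* Hypothetical sketch, not the PR's code.
 * Orders a[lo], a[mid], a[hi] in place so that a[lo] <= a[mid] <= a[hi],
 * then returns mid, the index now holding the median of the three. */
static size_t median3_setup(int64_t *a, size_t lo, size_t hi) {
    size_t mid = lo + (hi - lo) / 2;
    order2(&a[lo], &a[mid]);
    order2(&a[mid], &a[hi]); /* a[hi] is now the largest of the three */
    order2(&a[lo], &a[mid]); /* a[lo] is now the smallest of the three */
    return mid;
}
```

With the samples ordered this way, `a[lo]` and `a[hi]` bound the pivot from below and above, so a partition loop can use them as sentinels instead of checking indices on every step. Whether this PR removes its branches that way is an assumption.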
    
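    Results in a roughly 5-10% perf improvement on the project benchmarks
    (see PR for full run data). Each group below lists the two timings and
    their ratio (new / old), followed by the raw output lines in the same
    order.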
    
    4027959.8 / 4258144.7 -> 0.9459
    Quick_sort 100000 x86_64                  249        4027959.8 ns/op
    Quick_sort 100000 x86_64                  235        4258144.7 ns/op
    
    Running tests with random numbers: 902582.0 / 940650.0 -> 0.9595
    sort.h quick_sort             - ok,   902582.0 usec
    sort.h quick_sort             - ok,   940650.0 usec
    
    Running tests with same number: 8986.0 / 9059.0 -> 0.9919
    sort.h quick_sort             - ok,     8986.0 usec
    sort.h quick_sort             - ok,     9059.0 usec
    
    Running tests with sorted numbers: 148790.0 / 160015.0 -> 0.9299
    sort.h quick_sort             - ok,   148790.0 usec
    sort.h quick_sort             - ok,   160015.0 usec
    
    Running tests with sorted blocks of length 10: 872430.0 / 915431.0 -> 0.9530
    sort.h quick_sort             - ok,   872430.0 usec
    sort.h quick_sort             - ok,   915431.0 usec
    
    Running tests with sorted blocks of length 100: 751763.0 / 791987.0 -> 0.9492
    sort.h quick_sort             - ok,   751763.0 usec
    sort.h quick_sort             - ok,   791987.0 usec
    
    Running tests with sorted blocks of length 10000: 461118.0 / 514853.0 -> 0.8956
    sort.h quick_sort             - ok,   461118.0 usec
    sort.h quick_sort             - ok,   514853.0 usec
    
    Running tests with swapped size/2 pairs: 812161.0 / 854230.0 -> 0.9508
    sort.h quick_sort             - ok,   812161.0 usec
    sort.h quick_sort             - ok,   854230.0 usec
    
    Running tests with swapped size/8 pairs: 522638.0 / 575848.0 -> 0.9076
    sort.h quick_sort             - ok,   522638.0 usec
    sort.h quick_sort             - ok,   575848.0 usec
    
    Running tests with known evil data: 146601.0 / 196450.0 -> 0.7463
    sort.h quick_sort             - ok,   146601.0 usec
    sort.h quick_sort             - ok,   196450.0 usec
    
    So roughly a 5-10% improvement for most cases, with the outliers being
    no change for same-number and a 25% improvement for "evil data".
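For readers checking the figures, each ratio is the first timing divided by the second. The sketch below assumes the first value is the run with this change and the second is the baseline, which the message implies but does not label; `time_ratio` and `percent_reduction` are hypothetical helper names.

```c
/* Hypothetical helpers for reading the figures above.
 * Assumption: `first` is the timing with this change, `second` the baseline. */

/* Ratio as printed in the commit message; below 1.0 means `first` is faster. */
static double time_ratio(double first, double second) {
    return first / second;
}

/* Reduction in run time relative to `second`, in percent. */
static double percent_reduction(double first, double second) {
    return (1.0 - first / second) * 100.0;
}
```

For the "known evil data" pair, `time_ratio(146601.0, 196450.0)` is about 0.746, a reduction of roughly 25%. For the `Quick_sort 100000` benchmark pair, the ratio is about 0.946, a reduction of roughly 5%.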
    goldsteinn committed Dec 12, 2022
    5d0068a