Add quick sort manual tail-call optimization #63

Merged (3 commits) on Sep 21, 2019

swenson
Owner

@swenson swenson commented Sep 21, 2019

Before, sorting evil.txt reached a stack depth of over 9,000.
Now the stack depth is only about 30.

Rather than making two recursive calls to quick sort, we now make
a recursive call only on the smaller half. For the larger half,
we replace the left and right values in the current call and loop.
This way, if a poor choice of pivot is made and, for example, we
partition a block of size N into pieces of size 1 and N-2 (plus the
pivot), we don't create a new stack entry for the N-2 piece, but
instead reuse the current one.
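
The idea above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the names `partition_lomuto` and `quick_sort_tco`, the `int` element type, and the last-element pivot are all assumptions made for the example. The function returns the deepest recursion level it reached so the effect on stack depth can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: Lomuto partition of a[lo..hi] (inclusive), using
   a[hi] as the pivot. Returns the final index of the pivot. */
static size_t partition_lomuto(int *a, size_t lo, size_t hi) {
    int pivot = a[hi];
    size_t store = lo;
    for (size_t i = lo; i < hi; i++) {
        if (a[i] < pivot) {
            int t = a[i]; a[i] = a[store]; a[store] = t;
            store++;
        }
    }
    int t = a[hi]; a[hi] = a[store]; a[store] = t;
    return store;
}

/* Sorts the half-open range a[lo..hi). Recurses only on the smaller side
   and loops on the larger side. Returns the deepest recursion level seen. */
static size_t quick_sort_tco(int *a, size_t lo, size_t hi, size_t depth) {
    size_t max_depth = depth;
    assert(lo <= hi);
    while (hi - lo > 1) {
        size_t p = partition_lomuto(a, lo, hi - 1);
        size_t left_n = p - lo;
        size_t right_n = hi - (p + 1);
        size_t d;
        if (left_n < right_n) {
            d = quick_sort_tco(a, lo, p, depth + 1);     /* smaller: recurse */
            lo = p + 1;                                  /* larger: loop */
        } else {
            d = quick_sort_tco(a, p + 1, hi, depth + 1); /* smaller: recurse */
            hi = p;                                      /* larger: loop */
        }
        if (d > max_depth) max_depth = d;
    }
    return max_depth;
}
```

Because each recursive call handles at most half of the current range, the recursion depth stays around lg N even when every pivot is bad. The running time can still be quadratic in that case; only the stack usage is bounded.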

Fixes #62

This also fixes a unary minus on an unsigned integer, which creates
problems for some compilers (notably MSVC).
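
For context, MSVC reports warning C4146 when unary minus is applied to an unsigned value. The exact expression changed in this PR is not shown here, so the following is only a sketch of one common way to avoid the construct; `negate_u64` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: two's-complement negation of an unsigned value
   without using unary minus on an unsigned operand. */
static uint64_t negate_u64(uint64_t x) {
    /* Unsigned subtraction wraps modulo 2^64, so this is well defined. */
    return (uint64_t)0 - x;
}
```

Writing `~x + 1` gives the same result and also avoids the warning.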

@Baobaobear
Contributor

I found a much easier way to create evil data: just set dst[i] = i ^ 1.
Your partition implementation is the slowest one I know of. The fastest one I know of is #43. What this guy said was correct.
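
As a small sketch of the generator suggested above (the function name `fill_evil` and the `int64_t` element type are assumptions for illustration), XOR with 1 swaps each adjacent pair of indices:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fills dst with the pattern dst[i] = i ^ 1,
   i.e. 1, 0, 3, 2, 5, 4, ... */
static void fill_evil(int64_t *dst, size_t n) {
    assert(dst != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        dst[i] = (int64_t)(i ^ (size_t)1);
    }
}
```

The result is an almost-sorted array in which every adjacent pair is swapped.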

@swenson
Owner Author

swenson commented Sep 21, 2019

Oh, good call. I will work on fixing the partitioning and use the simpler "evil" data.

Based on my last comment of #43, it seems like there was no increase in performance with that version. I'll re-evaluate based on the evil data later today.

@swenson
Owner Author

swenson commented Sep 21, 2019

Okay, weirdly enough, the original Hoare partitioning (in #43) is much slower than the current (Lomuto) partitioning algorithm on the original "evil" data, but it is faster in most of the other cases.

I think a solution will be a combination of Hoare partitioning and switching to heap sort if we are looping too many times.

@swenson
Owner Author

swenson commented Sep 21, 2019

By using a combination of:

  • the current partitioning scheme (Lomuto, rather than
    switching to the original Hoare algorithm),
  • choosing the pivot as the median of 5 evenly spaced
    points between left and right,
  • doing manual tail-call optimization rather than the standard doubly
    recursive method, and
  • switching to heap sort if we've exceeded ~lg N loops/calls

we seem to be able to consistently beat the standard C library's qsort, even in the evil cases.
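
A rough sketch of how these four pieces can fit together is below. It is not the merged code: all names (`hybrid_quick_sort`, `sort_range`, `median5_index`, and so on), the `int` element type, and the exact budget of 2 * floor(lg N) iterations are assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

static void swap_int(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Index of the median of five evenly spaced samples in a[lo..hi]
   (inclusive). Requires hi - lo >= 4 so the samples are distinct. */
static size_t median5_index(const int *a, size_t lo, size_t hi) {
    size_t step = (hi - lo) / 4, idx[5];
    for (size_t k = 0; k < 5; k++) idx[k] = lo + k * step;
    for (int i = 1; i < 5; i++)          /* insertion sort by value */
        for (int j = i; j > 0 && a[idx[j]] < a[idx[j - 1]]; j--) {
            size_t t = idx[j]; idx[j] = idx[j - 1]; idx[j - 1] = t;
        }
    return idx[2];
}

/* Lomuto partition of a[lo..hi] (inclusive) around a[pivot_idx]. */
static size_t partition_lomuto(int *a, size_t lo, size_t hi, size_t pivot_idx) {
    swap_int(&a[pivot_idx], &a[hi]);
    int pivot = a[hi];
    size_t store = lo;
    for (size_t i = lo; i < hi; i++)
        if (a[i] < pivot) { swap_int(&a[i], &a[store]); store++; }
    swap_int(&a[store], &a[hi]);
    return store;
}

static void sift_down(int *a, size_t root, size_t n) {
    for (;;) {
        size_t child = 2 * root + 1;
        if (child >= n) return;
        if (child + 1 < n && a[child] < a[child + 1]) child++;
        if (a[root] >= a[child]) return;
        swap_int(&a[root], &a[child]);
        root = child;
    }
}

static void heap_sort(int *a, size_t n) {
    for (size_t i = n / 2; i-- > 0;) sift_down(a, i, n);
    for (size_t end = n; end-- > 1;) {
        swap_int(&a[0], &a[end]);
        sift_down(a, 0, end);
    }
}

/* Sorts a[lo..hi). budget counts the remaining loops/calls on this path. */
static void sort_range(int *a, size_t lo, size_t hi, size_t budget) {
    while (hi - lo > 1) {
        size_t n = hi - lo;
        if (budget == 0) { heap_sort(a + lo, n); return; } /* fallback */
        budget--;
        size_t pi = (n >= 5) ? median5_index(a, lo, hi - 1) : lo + n / 2;
        size_t p = partition_lomuto(a, lo, hi - 1, pi);
        if (p - lo < hi - (p + 1)) {
            sort_range(a, lo, p, budget);      /* recurse on smaller half */
            lo = p + 1;                        /* loop on larger half */
        } else {
            sort_range(a, p + 1, hi, budget);
            hi = p;
        }
    }
}

void hybrid_quick_sort(int *a, size_t n) {
    size_t lg = 0;
    assert(a != NULL || n == 0);
    for (size_t m = n; m > 1; m >>= 1) lg++;   /* floor(lg n) */
    sort_range(a, 0, n, 2 * lg);               /* assumed budget */
}
```

Calling `hybrid_quick_sort(data, count)` on an array filled with `i ^ 1`, or with all-equal values, exercises both the quick sort path and the heap sort fallback.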

I'll merge it once tests pass. Let me know if you have any further feedback, or if you see any other problems.
