Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

quicksort cost too much time in some case #62

Open
Baobaobear opened this issue Sep 17, 2019 · 8 comments · Fixed by #63
Open

quicksort cost too much time in some case #62

Baobaobear opened this issue Sep 17, 2019 · 8 comments · Fixed by #63
Assignees

Comments

@Baobaobear
Copy link
Contributor

qsort_data.txt.gz
Here is the test data. Stack overflow on VS2015

@swenson
Copy link
Owner

swenson commented Sep 20, 2019

Thanks for the report. I agree that our quick sort is unacceptably slow for that data. I'll see if I can reproduce the crash in VS2015 as well, and start working on a fix, probably by implementing some needed improvements to our quick sort.

@swenson swenson self-assigned this Sep 20, 2019
@swenson
Copy link
Owner

swenson commented Sep 20, 2019

Hmm. I'm not able to reproduce in VS2015 in either x64 or x86. Can you give me the command-line options, or any other relevant configuration information for your build?

It appears as though the stacks are smaller in Windows, so I have some ideas for a fix, but I'd like to be able to reproduce it if possible.

@Baobaobear
Copy link
Contributor Author

OK, I made the longer data here.
qsort_data.txt.gz

@Baobaobear
Copy link
Contributor Author

And also, I found a case which std::sort in VS/clang running slower than normal.
I paste code here https://paste.ubuntu.com/p/sVBVdjSnyv/

swenson added a commit that referenced this issue Sep 21, 2019
Before, sorting `evil.txt` would have a massive 9,000+ stack size.
Now it has a stack size of only 30 or so.

We do this by, rather than making two recursive calls to quick sort,
we only do a recursive call on the smaller half. For the larger half,
we replace the left and right values in the current call and loop.
This way, if a poor choice of pivot is made and, for example, we
partition a block of size N into pieces 1 and N-2, we don't create
a new stack entry for N-2, but instead reuse the current one.

Fixes #62

This also fixes a unary minus on an unsigned integer, which creates
problems for some compilers (notably MSVC).
@swenson
Copy link
Owner

swenson commented Sep 21, 2019

I believe I have fixed this issue in #63 -- at least, it should no longer crash by overflowing the stack, as it does a neat manual tail call optimization I figured out, reducing the stack size from 9,000+ to 30 or so.

It looks like Windows uses a default stack size of 1 MB (and Linux uses 8 MB), which is why this cropped up first.

This improvement doesn't fix the performance, but it does limit the stack usage to be O(log N) instead of O(N) in the worst case.

I will work on improving the performance for this test case next.

Thank you for bringing this to my attention!

@Baobaobear
Copy link
Contributor Author

To avoid the worst case, mixing heapsort turn into introsort is necessary

@Baobaobear
Copy link
Contributor Author

Here is my test result on Centos7 x64, build with g++ 8.3.1

sorting 8000000 int

test 1 2 3 4 5 6 7 8 9 10 11 score
bao_qsort 8 129 160 580 590 140 82 167 129 241 557 278
bao_tim 4 59 11 692 694 336 153 11 101 213 673 294
std_sort 110 124 130 565 568 229 210 141 261 231 550 311
bao_merge 9 37 148 726 701 347 171 93 126 226 709 329
std_stable 174 169 184 693 687 364 206 181 202 283 675 381
sw_tim 4 38 310 894 887 524 337 153 119 275 878 441
bao_shell 118 112 178 1002 983 403 146 133 346 232 969 462
sw_merge 254 257 253 823 807 465 307 242 242 381 798 482
sw_sqrt 272 242 266 987 992 547 351 240 372 362 944 557
sw_shell 178 204 263 1247 1252 479 231 190 390 326 1168 592
sw_grail 132 269 333 1008 1036 802 391 270 743 376 967 632
std_qsort 304 313 355 1033 1008 679 470 335 415 483 998 639
sw_qsort 7 245 251 683 671 174 410 299 3413 367 668 718
std_heap 483 454 537 1904 1882 628 584 428 1196 526 1909 1053
bao_heap 31 544 643 1987 1949 577 576 494 1216 630 1960 1060
sw_stable 115 210 564 3570 3646 1124 517 250 775 592 3361 1472
sw_heap 28 842 925 4603 4550 921 944 842 3084 843 4597 2217

perfix with "std_" is std implemention, std_sort == std::sort, std_stable == std::stable, std_qsort == qsort in stdlib.h
perfix with "sw_" is your implemention
perfix with "bao_" is mine, more detail see my new repo

column 1 to 11 are test cases. In test case 9, we can see your qsort face to some issues

I believe this can help you do a huge improvement

@swenson swenson reopened this Sep 22, 2019
@swenson
Copy link
Owner

swenson commented Sep 22, 2019

Awesome. I will take a look sometime soon and see about improving performance even further.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants