quicksort cost too much time in some case #62

Baobaobear · 2019-09-17T16:14:42Z

qsort_data.txt.gz
Here is the test data. Stack overflow on VS2015

swenson · 2019-09-20T03:18:19Z

Thanks for the report. I agree that our quick sort is unacceptably slow for that data. I'll see if I can reproduce the crash in VS2015 as well, and start working on a fix, probably by implementing some needed improvements to our quick sort.

swenson · 2019-09-20T04:38:05Z

Hmm. I'm not able to reproduce in VS2015 in either x64 or x86. Can you give me the command-line options, or any other relevant configuration information for your build?

It appears as though the stacks are smaller in Windows, so I have some ideas for a fix, but I'd like to be able to reproduce it if possible.

Baobaobear · 2019-09-20T04:58:00Z

OK, I made the longer data here.
qsort_data.txt.gz

Baobaobear · 2019-09-20T05:07:21Z

And also, I found a case which std::sort in VS/clang running slower than normal.
I paste code here https://paste.ubuntu.com/p/sVBVdjSnyv/

Before, sorting `evil.txt` would have a massive 9,000+ stack size. Now it has a stack size of only 30 or so. We do this by, rather than making two recursive calls to quick sort, we only do a recursive call on the smaller half. For the larger half, we replace the left and right values in the current call and loop. This way, if a poor choice of pivot is made and, for example, we partition a block of size N into pieces 1 and N-2, we don't create a new stack entry for N-2, but instead reuse the current one. Fixes #62 This also fixes a unary minus on an unsigned integer, which creates problems for some compilers (notably MSVC).

swenson · 2019-09-21T03:50:52Z

I believe I have fixed this issue in #63 -- at least, it should no longer crash by overflowing the stack, as it does a neat manual tail call optimization I figured out, reducing the stack size from 9,000+ to 30 or so.

It looks like Windows uses a default stack size of 1 MB (and Linux uses 8 MB), which is why this cropped up first.

This improvement doesn't fix the performance, but it does limit the stack usage to be O(log N) instead of O(N) in the worst case.

I will work on improving the performance for this test case next.

Thank you for bringing this to my attention!

Baobaobear · 2019-09-21T10:41:53Z

To avoid the worst case, mixing heapsort turn into introsort is necessary

Baobaobear · 2019-09-22T07:26:19Z

Here is my test result on Centos7 x64, build with g++ 8.3.1

sorting 8000000 int

test	1	2	3	4	5	6	7	8	9	10	11	score
bao_qsort	8	129	160	580	590	140	82	167	129	241	557	278
bao_tim	4	59	11	692	694	336	153	11	101	213	673	294
std_sort	110	124	130	565	568	229	210	141	261	231	550	311
bao_merge	9	37	148	726	701	347	171	93	126	226	709	329
std_stable	174	169	184	693	687	364	206	181	202	283	675	381
sw_tim	4	38	310	894	887	524	337	153	119	275	878	441
bao_shell	118	112	178	1002	983	403	146	133	346	232	969	462
sw_merge	254	257	253	823	807	465	307	242	242	381	798	482
sw_sqrt	272	242	266	987	992	547	351	240	372	362	944	557
sw_shell	178	204	263	1247	1252	479	231	190	390	326	1168	592
sw_grail	132	269	333	1008	1036	802	391	270	743	376	967	632
std_qsort	304	313	355	1033	1008	679	470	335	415	483	998	639
sw_qsort	7	245	251	683	671	174	410	299	3413	367	668	718
std_heap	483	454	537	1904	1882	628	584	428	1196	526	1909	1053
bao_heap	31	544	643	1987	1949	577	576	494	1216	630	1960	1060
sw_stable	115	210	564	3570	3646	1124	517	250	775	592	3361	1472
sw_heap	28	842	925	4603	4550	921	944	842	3084	843	4597	2217

perfix with "std_" is std implemention, std_sort == std::sort, std_stable == std::stable, std_qsort == qsort in stdlib.h
perfix with "sw_" is your implemention
perfix with "bao_" is mine, more detail see my new repo

column 1 to 11 are test cases. In test case 9, we can see your qsort face to some issues

I believe this can help you do a huge improvement

swenson · 2019-09-22T15:21:53Z

Awesome. I will take a look sometime soon and see about improving performance even further.

swenson self-assigned this Sep 20, 2019

swenson mentioned this issue Sep 21, 2019

Add quick sort manual tail-call optimization #63

Merged

swenson closed this as completed in #63 Sep 21, 2019

swenson reopened this Sep 22, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

quicksort cost too much time in some case #62

quicksort cost too much time in some case #62

Baobaobear commented Sep 17, 2019

swenson commented Sep 20, 2019

swenson commented Sep 20, 2019

Baobaobear commented Sep 20, 2019

Baobaobear commented Sep 20, 2019

swenson commented Sep 21, 2019

Baobaobear commented Sep 21, 2019

Baobaobear commented Sep 22, 2019

swenson commented Sep 22, 2019

quicksort cost too much time in some case #62

quicksort cost too much time in some case #62

Comments

Baobaobear commented Sep 17, 2019

swenson commented Sep 20, 2019

swenson commented Sep 20, 2019

Baobaobear commented Sep 20, 2019

Baobaobear commented Sep 20, 2019

swenson commented Sep 21, 2019

Baobaobear commented Sep 21, 2019

Baobaobear commented Sep 22, 2019

swenson commented Sep 22, 2019