I was wondering if it would be possible to do an in-place Timsort. I was thinking something along the lines of doing the run finding stage at the beginning followed by an in-place mergesort.
I've yet to write it up (I intend to), but I did some extensive tests using the sorts here against 1+ GB of data (the same data for all tests, fetched once from /dev/urandom). What I found, unsurprisingly, was that you can use multithreading to speed up the process (I suppose for large volumes you could even go Hadoop-style Big Data). When multithreading, nested mergesorts were faster than any other sorts.
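For illustration, the multithreaded approach described above could be sketched roughly like this: split the input into chunks, sort each chunk concurrently, then k-way merge the sorted chunks. This is just my hedged reconstruction, not code from this repository; the function name and chunking scheme are made up for the example.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def parallel_merge_sort(data, workers=4):
    """Sort chunks concurrently, then k-way merge them.

    Note: in CPython, CPU-bound pure-Python sorting would need a
    ProcessPoolExecutor for true parallelism; a thread pool is used
    here only to keep the sketch self-contained and portable.
    """
    n = len(data)
    size = max(1, (n + workers - 1) // workers)
    # Split into roughly equal chunks, one per worker.
    chunks = [data[i:i + size] for i in range(0, n, size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_chunks = list(pool.map(sorted, chunks))
    # heapq.merge does a lazy k-way merge of the sorted chunks.
    return list(heapq.merge(*sorted_chunks))
```

A real benchmark against gigabyte-scale data would of course sort each chunk in a separate process (or natively) rather than a thread.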
What did surprise me, though, was that in the single-threaded case an in-place mergesort was the fastest, with Timsort second and normal mergesort third. My best explanation is cache locality: the in-place variant avoids the auxiliary buffer and keeps its working set hot.
I was wondering, though: if we did the run finding from Timsort and then an in-place mergesort, perhaps that would be the fastest? It could take advantage of cache locality AND pre-existing runs.
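To make the proposal concrete, here is a minimal sketch of the hybrid I have in mind: a Timsort-style run-finding pass (reversing descending runs in place), followed by bottom-up pairwise merging of the runs using a rotation-based in-place merge. All names are illustrative; the in-place merge shown is the simple O(n²)-worst-case rotation variant, not one of the more sophisticated O(n log n) in-place merges a real implementation would want.

```python
def find_runs(a):
    """Timsort-style run detection: collect maximal ascending runs,
    reversing strictly descending runs in place (no minrun extension)."""
    runs, i, n = [], 0, len(a)
    while i < n:
        j = i + 1
        if j < n and a[j] < a[i]:              # strictly descending run
            while j + 1 < n and a[j + 1] < a[j]:
                j += 1
            a[i:j + 1] = a[i:j + 1][::-1]      # reverse so it's ascending
        else:                                   # ascending (or single-element) run
            while j < n and a[j] >= a[j - 1]:
                j += 1
            j -= 1
        runs.append((i, j + 1))                 # half-open [start, end)
        i = j + 1
    return runs

def merge_in_place(a, lo, mid, hi):
    """Merge sorted a[lo:mid] and a[mid:hi] using rotations.
    O(1) extra space, but O(n^2) comparisons/moves in the worst case."""
    while lo < mid and mid < hi:
        if a[lo] <= a[mid]:
            lo += 1
        else:
            # Rotate a[mid] into position lo, shifting a[lo:mid] right by one.
            a[lo:mid + 1] = [a[mid]] + a[lo:mid]
            lo += 1
            mid += 1

def run_merge_sort(a):
    """Find natural runs, then merge adjacent run pairs until one remains."""
    runs = find_runs(a)
    while len(runs) > 1:
        merged = []
        for k in range(0, len(runs) - 1, 2):
            (lo, mid), (_, hi) = runs[k], runs[k + 1]
            merge_in_place(a, lo, mid, hi)
            merged.append((lo, hi))
        if len(runs) % 2:                       # odd run carried to next round
            merged.append(runs[-1])
        runs = merged
    return a
```

On already-sorted or mostly-sorted input, `find_runs` returns very few runs and the merge loop does almost no work, which is exactly the Timsort advantage this hybrid is meant to keep while staying in place.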
I think this is a great idea. If you would like to submit code to try for this, I would be happy to review it. Otherwise, I'll think about it and work on it sometime.