[prev in list] [next in list] [prev in thread] [next in thread] 

List:       glibc-alpha
Subject:    Re: qsort improvements
From:       Adhemerval Zanella via Libc-alpha <libc-alpha () sourceware ! org>
Date:       2021-08-31 18:11:22
Message-ID: f843b497-2619-d87a-0b87-6cb4f4867d0e () linaro ! org
[Download RAW message or body]



On 31/08/2021 14:52, Paul Eggert wrote:
> On 8/31/21 4:55 AM, Adhemerval Zanella wrote:
> 
> > I think about using introsort using current quicksort along with the
> > optimization to optimize the element swaps.   On the patchset I sent
> > I added some benchmarks and although it can't be mergesort as expected,
> > I think the performance tradeoff is acceptable.
> 
> Oh, I wasn't aware of the azanella/qsort-refactor branch.
> 
> Looking at it briefly, I have my doubts about the performance tradeoff, as heapsort \
> (and therefore introsort) can have reasonably bad performance for some important \
> special cases that may not have been benchmarked there.

I haven't update it with my latest work, and I think it would be possible
to simplify the code a bit.

> 
> Memory allocation within the Linux kernel is understandably frowned upon, so it \
> makes sense to avoid memory-hungry algorithms like mergesort there. In contrast, \
> it's generally better to trade space for speed in user programs.

I tend to agree with you, but I still think that for generic library such 
as glibc I strongly prefer we focus on code complexity, a well defined 
semantic, and QoI such lower memory consumption.  As I said, I don't think
having a state-of-the-art, but complex and workload specific is the best way
forward.

> 
> I believe this is why Swift changed from introsort to timsort a couple of years \
> ago. See some benchmarks here, which make timsort look pretty good compared to \
> introsort, at least in the Swift ecosystem: 
> https://gist.github.com/natecook1000/5161e10aeba09408c130284ea6ec4e11
> 

Afaik it is used on other implementation as well on Java (although as far as
I know not for generic implementation).

> Also, I doubt whether async-signal-safety is an important attribute for qsort: \
> portable code can't call qsort from signal handlers anyway, and I doubt whether \
> much significant unportable code needs that feature in glibc.

And yet users do call malloc after fork (which we still have some code to
support it), non async-safe-safe function after vfork(), among other non
portable combinations.  I think for a libc implementation adhering to
memory consumption constraints, along with some constraints like thread
safety and sync-signal-safeness *where* possible is a QoI that we should
adhere.

If user do want a more tuned sorting algorithm, I think we would be better
to either provide is as extension (as FreeBSD does for mergesort,
heapsort [1], and radixsort [2]) or provide through a compat layer such
as gnulib.

[1] https://www.freebsd.org/cgi/man.cgi?query=mergesort&sektion=3
[2] https://www.freebsd.org/cgi/man.cgi?query=radixsort&sektion=3&apropos=0&manpath=FreeBSD+13.0-RELEASE+and+Ports



[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic