
List:       coreutils-bug
Subject:    bug#7597: [coreutils] multi-threaded sort can segfault (unrelated
From:       Jim Meyering <jim () meyering ! net>
Date:       2010-12-09 21:33:42
Message-ID: 87r5dq7hs9.fsf () meyering ! net

Jim Meyering wrote:
...
> With that, I've solved at least part of the problem.
> The segfault (and other strangeness we've witnessed)
> arises because each "node" struct is stored on the stack,
> and its address ends up being used by another thread after
> the thread that owns the stack in question has been "joined".
>
> My solution is to use the heap instead of the stack.
> However, for today I'm out of time and I have not yet found a
> way to free these newly-malloc'd "node" buffers.
>
> To test this, I've done the following:
>
> gensort -a 10000 > gensort-10k
> for i in $(seq 2000); do printf '% 4d\n' $i; valgrind --quiet src/sort -S 100K \
>   --parallel=2 gensort-10k > k; test $(wc -c < k) = 1000000 || break; done
> for i in $(seq 2000); do printf '% 4d\n' $i; src/sort -S 100K \
>   --parallel=2 gensort-10k > j; test $(wc -c < j) = 1000000 || break; done
>
> Without the patch, the first would show errors for more than 50% of
> the runs and the second would rarely get to i=100 without generating
> a core file.  With the patch, both complete error-free (not counting
> leaks).

FYI, while preparing a test, I've found that the latter loop
(the one without valgrind) passes all 2000 iterations when sort is
compiled with -g -O2, yet fails in at least 10 of the 2000 when
compiled with -ggdb3.
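
For illustration, the lifetime problem described in the quoted text
(a "node" on one thread's stack being used after that thread has been
joined) can be sketched in C as follows.  This is a minimal sketch and
not the actual sort.c code or patch: the struct layout and the names
"producer", "shared" and "node_hi_after_join" are hypothetical, chosen
only for this example.  It shows the heap-based variant, with the
unsafe stack-based variant described in a comment.  Build with -pthread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for sort's "node"; the real struct differs. */
struct node
{
  int lo;
  int hi;
};

/* Pointer published by the producer thread and read after it is joined. */
static struct node *shared;

static void *
producer (void *arg)
{
  (void) arg;

  /* Unsafe variant (not compiled here):
       struct node n = { 0, 42 };
       shared = &n;
     &n points into this thread's stack, so it dangles as soon as
     this function returns and the thread is joined.  */

  /* Heap variant: the node outlives the thread that created it. */
  struct node *n = malloc (sizeof *n);
  if (n == NULL)
    return NULL;
  n->lo = 0;
  n->hi = 42;
  shared = n;
  return NULL;
}

/* Create the producer, join it, then read the node it published.
   Returns the node's "hi" field, or -1 on failure.  */
int
node_hi_after_join (void)
{
  pthread_t t;

  shared = NULL;
  if (pthread_create (&t, NULL, producer, NULL) != 0)
    return -1;

  /* pthread_join also orders the producer's writes before our reads. */
  int rc = pthread_join (t, NULL);
  assert (rc == 0);
  (void) rc;

  /* The producer's stack is gone now; the heap node is still valid. */
  int hi = shared ? shared->hi : -1;

  /* Unlike the work-in-progress patch described above, free the node. */
  free (shared);
  shared = NULL;
  return hi;
}
```

Calling node_hi_after_join () from main returns 42 unless thread
creation or allocation fails.  With the stack-based variant the same
read would be undefined behavior, which may appear to work with one
set of compiler flags and fail with another.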


