'Review Request 110262: KRandomSequence::randomize: use the Fisher-Yates Algorithm to achieve O(N) co'

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       kde-core-devel
Subject:    Review Request 110262: KRandomSequence::randomize: use the Fisher-Yates Algorithm to achieve O(N) co
From:       "Frank Reininghaus" <frank78ac () googlemail ! com>
Date:       2013-05-01 20:34:48
Message-ID: 20130501203448.24174.54908 () vidsolbach ! de
[Download RAW message or body]

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/110262/
-----------------------------------------------------------

Review request for kdelibs.


Description
-------

The current algorithm that is used to shuffle lists is rather inefficient. It works \
by removing the first item of the list repeatedly and inserting it at a random \
position in a new list, which is finally used to replace the original list. \
Unfortunately, this results in O(N^2) run time complexity because inserting into a \
list, which is done N itmes, is O(N).

I propose to replace this algorithm by the Fisher-Yates algorithm, which works by \
swapping items N - 1 times. One could modify the entire thing further, like providing \
randomization also for other containers and not only QList, but that would probably \
be frameworks material. 


Diffs
-----

  kdecore/util/krandomsequence.h 46949b4 

Diff: http://git.reviewboard.kde.org/r/110262/diff/


Testing
-------

I wrote a small benchmark: http://paste.kde.org/735914/

I got the following results with the existing algorithm:

RESULT : Benchmark::randomSequenceBenchmark():"n=0":
     0.000015 msecs per iteration (total: 66, iterations: 4194304)
RESULT : Benchmark::randomSequenceBenchmark():"n=1":
     0.000192 msecs per iteration (total: 101, iterations: 524288)
RESULT : Benchmark::randomSequenceBenchmark():"n=3":
     0.00070 msecs per iteration (total: 93, iterations: 131072)
RESULT : Benchmark::randomSequenceBenchmark():"n=10":
     0.0025 msecs per iteration (total: 83, iterations: 32768)
RESULT : Benchmark::randomSequenceBenchmark():"n=30":
     0.0070 msecs per iteration (total: 58, iterations: 8192)
RESULT : Benchmark::randomSequenceBenchmark():"n=100":
     0.023 msecs per iteration (total: 97, iterations: 4096)
RESULT : Benchmark::randomSequenceBenchmark():"n=300":
     0.077 msecs per iteration (total: 79, iterations: 1024)
RESULT : Benchmark::randomSequenceBenchmark():"n=1000":
     0.35 msecs per iteration (total: 90, iterations: 256)
RESULT : Benchmark::randomSequenceBenchmark():"n=3000":
     1.8 msecs per iteration (total: 58, iterations: 32)
RESULT : Benchmark::randomSequenceBenchmark():"n=10000":
     18 msecs per iteration (total: 72, iterations: 4)
RESULT : Benchmark::randomSequenceBenchmark():"n=30000":
     283 msecs per iteration (total: 283, iterations: 1)
RESULT : Benchmark::randomSequenceBenchmark():"n=100000":
     3,823 msecs per iteration (total: 3,823, iterations: 1)

Here are the numbers for the proposed new algorithm:

RESULT : Benchmark::randomSequenceBenchmark():"n=0":
     0.000015 msecs per iteration (total: 65, iterations: 4194304)
RESULT : Benchmark::randomSequenceBenchmark():"n=1":
     0.000015 msecs per iteration (total: 65, iterations: 4194304)
RESULT : Benchmark::randomSequenceBenchmark():"n=3":
     0.00018 msecs per iteration (total: 98, iterations: 524288)
RESULT : Benchmark::randomSequenceBenchmark():"n=10":
     0.00079 msecs per iteration (total: 52, iterations: 65536)
RESULT : Benchmark::randomSequenceBenchmark():"n=30":
     0.0025 msecs per iteration (total: 83, iterations: 32768)
RESULT : Benchmark::randomSequenceBenchmark():"n=100":
     0.0084 msecs per iteration (total: 69, iterations: 8192)
RESULT : Benchmark::randomSequenceBenchmark():"n=300":
     0.025 msecs per iteration (total: 52, iterations: 2048)
RESULT : Benchmark::randomSequenceBenchmark():"n=1000":
     0.085 msecs per iteration (total: 88, iterations: 1024)
RESULT : Benchmark::randomSequenceBenchmark():"n=3000":
     0.25 msecs per iteration (total: 66, iterations: 256)
RESULT : Benchmark::randomSequenceBenchmark():"n=10000":
     0.85 msecs per iteration (total: 55, iterations: 64)
RESULT : Benchmark::randomSequenceBenchmark():"n=30000":
     2.6 msecs per iteration (total: 86, iterations: 32)
RESULT : Benchmark::randomSequenceBenchmark():"n=100000":
     10 msecs per iteration (total: 81, iterations: 8)


Thanks,

Frank Reininghaus


[Attachment #3 (text/html)]

<html>
 <body>
  <div style="font-family: Verdana, Arial, Helvetica, Sans-Serif;">
   <table bgcolor="#f9f3c9" width="100%" cellpadding="8" style="border: 1px #c9c399 \
solid;">  <tr>
     <td>
      This is an automatically generated e-mail. To reply, visit:
      <a href="http://git.reviewboard.kde.org/r/110262/">http://git.reviewboard.kde.org/r/110262/</a>
  </td>
    </tr>
   </table>
   <br />




<table bgcolor="#fefadf" width="100%" cellspacing="0" cellpadding="8" \
style="background-image: \
url('http://git.reviewboard.kde.org/static/rb/images/review_request_box_top_bg.ab6f3b1072c9.png'); \
background-position: left top; background-repeat: repeat-x; border: 1px black \
solid;">  <tr>
  <td>

<div>Review request for kdelibs.</div>
<div>By Frank Reininghaus.</div>







<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Description </h1>
 <table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" \
style="border: 1px solid #b8b5a0">  <tr>
  <td>
   <pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">The current algorithm that is used to shuffle lists is rather \
inefficient. It works by removing the first item of the list repeatedly and inserting \
it at a random position in a new list, which is finally used to replace the original \
list. Unfortunately, this results in O(N^2) run time complexity because inserting \
into a list, which is done N itmes, is O(N).

I propose to replace this algorithm by the Fisher-Yates algorithm, which works by \
swapping items N - 1 times. One could modify the entire thing further, like providing \
randomization also for other containers and not only QList, but that would probably \
be frameworks material. </pre>  </td>
 </tr>
</table>


<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Testing </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: \
1px solid #b8b5a0">  <tr>
  <td>
   <pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">I wrote a small benchmark: http://paste.kde.org/735914/

I got the following results with the existing algorithm:

RESULT : Benchmark::randomSequenceBenchmark():&quot;n=0&quot;:
     0.000015 msecs per iteration (total: 66, iterations: 4194304)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=1&quot;:
     0.000192 msecs per iteration (total: 101, iterations: 524288)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=3&quot;:
     0.00070 msecs per iteration (total: 93, iterations: 131072)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=10&quot;:
     0.0025 msecs per iteration (total: 83, iterations: 32768)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=30&quot;:
     0.0070 msecs per iteration (total: 58, iterations: 8192)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=100&quot;:
     0.023 msecs per iteration (total: 97, iterations: 4096)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=300&quot;:
     0.077 msecs per iteration (total: 79, iterations: 1024)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=1000&quot;:
     0.35 msecs per iteration (total: 90, iterations: 256)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=3000&quot;:
     1.8 msecs per iteration (total: 58, iterations: 32)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=10000&quot;:
     18 msecs per iteration (total: 72, iterations: 4)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=30000&quot;:
     283 msecs per iteration (total: 283, iterations: 1)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=100000&quot;:
     3,823 msecs per iteration (total: 3,823, iterations: 1)

Here are the numbers for the proposed new algorithm:

RESULT : Benchmark::randomSequenceBenchmark():&quot;n=0&quot;:
     0.000015 msecs per iteration (total: 65, iterations: 4194304)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=1&quot;:
     0.000015 msecs per iteration (total: 65, iterations: 4194304)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=3&quot;:
     0.00018 msecs per iteration (total: 98, iterations: 524288)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=10&quot;:
     0.00079 msecs per iteration (total: 52, iterations: 65536)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=30&quot;:
     0.0025 msecs per iteration (total: 83, iterations: 32768)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=100&quot;:
     0.0084 msecs per iteration (total: 69, iterations: 8192)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=300&quot;:
     0.025 msecs per iteration (total: 52, iterations: 2048)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=1000&quot;:
     0.085 msecs per iteration (total: 88, iterations: 1024)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=3000&quot;:
     0.25 msecs per iteration (total: 66, iterations: 256)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=10000&quot;:
     0.85 msecs per iteration (total: 55, iterations: 64)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=30000&quot;:
     2.6 msecs per iteration (total: 86, iterations: 32)
RESULT : Benchmark::randomSequenceBenchmark():&quot;n=100000&quot;:
     10 msecs per iteration (total: 81, iterations: 8)
</pre>
  </td>
 </tr>
</table>




<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Diffs</b> </h1>
<ul style="margin-left: 3em; padding-left: 0;">

 <li>kdecore/util/krandomsequence.h <span style="color: grey">(46949b4)</span></li>

</ul>

<p><a href="http://git.reviewboard.kde.org/r/110262/diff/" style="margin-left: \
3em;">View Diff</a></p>







  </td>
 </tr>
</table>




  </div>
 </body>
</html>



[prev in list] [next in list] [prev in thread] [next in thread]
Configure | About | News | Add a list | Sponsored by KoreLogic