[prev in list] [next in list] [prev in thread] [next in thread] 

List:       kfm-devel
Subject:    Re: Review Request 123620: Fix sorting performance regression by avoiding unnecessary copying of the
From:       "Frank Reininghaus" <frank78ac () googlemail ! com>
Date:       2015-05-07 20:22:20
Message-ID: 20150507202220.16998.31660 () mimi ! kde ! org
[Download RAW message or body]

--===============1751863017328764487==
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/123620/
-----------------------------------------------------------

(Updated May 7, 2015, 8:22 p.m.)


Status
------

This change has been marked as submitted.


Review request for Dolphin.


Changes
-------

Submitted with commit e69d3489751ea15c0477fe1d41fcc252c775d377 by Frank Reininghaus \
to branch master.


Repository: dolphin


Description
-------

I have found that the sorting performance when loading large folders is much worse \
than in the kdelibs 4.x-based Dolphin versions. The reason is the switch from \
KStringHandler::naturalCompare(...) to QCollator, which made creating copies of the \
QCollator object necessary (see https://git.reviewboard.kde.org/r/121817/) in order \
to fix crashes because QCollator's functions are, unlike \
KStringHandler::naturalCompare(...), not thread-safe.

This patch does the following:

1. Simplify the benchmarks in kfileitemmodelbenchmark. Currently, the benchmarks are \
run for many different folder sizes, which was motivated by the O(N^2) complexity \
which earlier Dolphin versions had if many files with some specific patterns were \
added. Since these issues have been fixed a long time ago, I think that it makes \
sense to run the benchmarks for a single folder size only. This makes it much easier \
to understand the benchmark results at first sight and identify any performance \
regressions.

2. Prevent that unnecessary expensive copies of the QCollator object are made. A copy \
is only needed if a new thread is forked. In all other cases, the "lessThan" object, \
which contains the QCollator, can be passed by const reference. I achieved that with \
rather small code changes, but they are a bit ugly - the only way to force the \
std::lower_bound and std::upper_bound functions to take a const reference was to \
specify the template arguments explicitly.

My next plans would be to add a benchmark which actually benchmarks natural sorting \
(the current benchmark disables it because its intention was to test the code that \
inserts items into the model). This could help to investigate if making use of \
QCollator::sortKey() might speed up sorting. Moreover, I think it might sense to move \
some or all of the lessThan-related code out of the KFileItemModel class to make it \
more maintainable.


Diffs
-----

  src/kitemviews/private/kfileitemmodelsortalgorithm.h 50db990 
  src/tests/kfileitemmodelbenchmark.cpp 3ff0405 

Diff: https://git.reviewboard.kde.org/r/123620/diff/


Testing
-------

Unit tests still pass. I have verified (with some debug output in \
KFileItemModelLessThan::operator()) that different threads use different collator \
objects.

Benchmark results before/after this patch (with Qt, KF5 and Dolphin built in release \
mode) show that the performance is improved considerably:

Before:

********* Start testing of KFileItemModelBenchmark *********
Config: Using QtTest library 5.4.1, Qt 5.4.1 (x86_64-little_endian-lp64 shared \
(dynamic) release build; by GCC 4.8.3 20140627 [gcc-4_8-branch revision 212064]) PASS \
: KFileItemModelBenchmark::initTestCase() PASS   : \
KFileItemModelBenchmark::insertAndRemoveManyItems(all--n=100000) RESULT : \
KFileItemModelBenchmark::insertAndRemoveManyItems():"all--n=100000":  2,575 msecs per \
iteration (total: 2,575, iterations: 1) PASS   : \
KFileItemModelBenchmark::insertAndRemoveManyItems(1st half + 2nd half--n=100000) \
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():"1st half + 2nd \
half--n=100000":  2,596 msecs per iteration (total: 2,596, iterations: 1)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(2nd half + 1st \
half--n=100000) RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():"2nd \
half + 1st half--n=100000":  2,594 msecs per iteration (total: 2,594, iterations: 1)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(even + odd--n=100000)
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():"even + odd--n=100000":
     2,609 msecs per iteration (total: 2,609, iterations: 1)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(all - 2nd half--n=100000)
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():"all - 2nd \
half--n=100000":  2,626 msecs per iteration (total: 2,626, iterations: 1)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(all - 1st half--n=100000)
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():"all - 1st \
half--n=100000":  2,610 msecs per iteration (total: 2,610, iterations: 1)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(all - odd--n=100000)
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():"all - odd--n=100000":
     2,636 msecs per iteration (total: 2,636, iterations: 1)
PASS   : KFileItemModelBenchmark::cleanupTestCase()
Totals: 9 passed, 0 failed, 0 skipped, 0 blacklisted
********* Finished testing of KFileItemModelBenchmark *********


After:

********* Start testing of KFileItemModelBenchmark *********
Config: Using QtTest library 5.4.1, Qt 5.4.1 (x86_64-little_endian-lp64 shared \
(dynamic) release build; by GCC 4.8.3 20140627 [gcc-4_8-branch revision 212064]) PASS \
: KFileItemModelBenchmark::initTestCase() PASS   : \
KFileItemModelBenchmark::insertAndRemoveManyItems(all--n=100000) RESULT : \
KFileItemModelBenchmark::insertAndRemoveManyItems():"all--n=100000":  21 msecs per \
iteration (total: 87, iterations: 4) PASS   : \
KFileItemModelBenchmark::insertAndRemoveManyItems(1st half + 2nd half--n=100000) \
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():"1st half + 2nd \
half--n=100000":  46 msecs per iteration (total: 92, iterations: 2)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(2nd half + 1st \
half--n=100000) RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():"2nd \
half + 1st half--n=100000":  41 msecs per iteration (total: 83, iterations: 2)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(even + odd--n=100000)
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():"even + odd--n=100000":
     56.5 msecs per iteration (total: 113, iterations: 2)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(all - 2nd half--n=100000)
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():"all - 2nd \
half--n=100000":  64 msecs per iteration (total: 64, iterations: 1)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(all - 1st half--n=100000)
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():"all - 1st \
half--n=100000":  50.0 msecs per iteration (total: 100, iterations: 2)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(all - odd--n=100000)
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():"all - odd--n=100000":
     68 msecs per iteration (total: 68, iterations: 1)
PASS   : KFileItemModelBenchmark::cleanupTestCase()
Totals: 9 passed, 0 failed, 0 skipped, 0 blacklisted
********* Finished testing of KFileItemModelBenchmark *********


Thanks,

Frank Reininghaus


--===============1751863017328764487==
MIME-Version: 1.0
Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: 7bit




<html>
 <body>
  <div style="font-family: Verdana, Arial, Helvetica, Sans-Serif;">
   <table bgcolor="#f9f3c9" width="100%" cellpadding="12" style="border: 1px #c9c399 \
solid; border-radius: 6px; -moz-border-radius: 6px; -webkit-border-radius: 6px;">  \
<tr>  <td>
      This is an automatically generated e-mail. To reply, visit:
      <a href="https://git.reviewboard.kde.org/r/123620/">https://git.reviewboard.kde.org/r/123620/</a>
  </td>
    </tr>
   </table>
   <br />



<table bgcolor="#e0e0e0" width="100%" cellpadding="12" style="border: 1px gray solid; \
border-radius: 6px; -moz-border-radius: 6px; -webkit-border-radius: 6px;">  <tr>
  <td>
   <h1 style="margin: 0; padding: 0; font-size: 10pt;">This change has been marked as \
submitted.</h1>  </td>
 </tr>
</table>
<br />


<table bgcolor="#fefadf" width="100%" cellspacing="0" cellpadding="12" style="border: \
1px #888a85 solid; border-radius: 6px; -moz-border-radius: 6px; \
-webkit-border-radius: 6px;">  <tr>
  <td>

<div>Review request for Dolphin.</div>
<div>By Frank Reininghaus.</div>


<p style="color: grey;"><i>Updated May 7, 2015, 8:22 p.m.</i></p>



<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Changes</h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: \
1px solid #b8b5a0">  <tr>
  <td>
   <pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">Submitted with commit e69d3489751ea15c0477fe1d41fcc252c775d377 by Frank \
Reininghaus to branch master.</pre>  </td>
 </tr>
</table>







<div style="margin-top: 1.5em;">
 <b style="color: #575012; font-size: 10pt;">Repository: </b>
dolphin
</div>


<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Description </h1>
 <table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" \
style="border: 1px solid #b8b5a0">  <tr>
  <td>
   <pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;"><p style="padding: 0;text-rendering: inherit;margin: 0;line-height: \
inherit;white-space: inherit;">I have found that the sorting performance when loading \
large folders is much worse than in the kdelibs 4.x-based Dolphin versions. The \
reason is the switch from KStringHandler::naturalCompare(...) to QCollator, which \
made creating copies of the QCollator object necessary (see \
https://git.reviewboard.kde.org/r/121817/) in order to fix crashes because \
QCollator's functions are, unlike KStringHandler::naturalCompare(...), not \
thread-safe.</p> <p style="padding: 0;text-rendering: inherit;margin: 0;line-height: \
inherit;white-space: inherit;">This patch does the following:</p> <ol style="padding: \
0;text-rendering: inherit;margin: 0 0 0 2em;line-height: inherit;white-space: \
normal;"> <li style="padding: 0;text-rendering: inherit;margin: 0;line-height: \
inherit;white-space: normal;"> <p style="padding: 0;text-rendering: inherit;margin: \
0;line-height: inherit;white-space: inherit;">Simplify the benchmarks in \
kfileitemmodelbenchmark. Currently, the benchmarks are run for many different folder \
sizes, which was motivated by the O(N^2) complexity which earlier Dolphin versions \
had if many files with some specific patterns were added. Since these issues have \
been fixed a long time ago, I think that it makes sense to run the benchmarks for a \
single folder size only. This makes it much easier to understand the benchmark \
results at first sight and identify any performance regressions.</p> </li>
<li style="padding: 0;text-rendering: inherit;margin: 0;line-height: \
inherit;white-space: normal;"> <p style="padding: 0;text-rendering: inherit;margin: \
0;line-height: inherit;white-space: inherit;">Prevent that unnecessary expensive \
copies of the QCollator object are made. A copy is only needed if a new thread is \
forked. In all other cases, the "lessThan" object, which contains the QCollator, can \
be passed by const reference. I achieved that with rather small code changes, but \
they are a bit ugly - the only way to force the std::lower_bound and std::upper_bound \
functions to take a const reference was to specify the template arguments \
explicitly.</p> </li>
</ol>
<p style="padding: 0;text-rendering: inherit;margin: 0;line-height: \
inherit;white-space: inherit;">My next plans would be to add a benchmark which \
actually benchmarks natural sorting (the current benchmark disables it because its \
intention was to test the code that inserts items into the model). This could help to \
investigate if making use of QCollator::sortKey() might speed up sorting. Moreover, I \
think it might sense to move some or all of the lessThan-related code out of the \
KFileItemModel class to make it more maintainable.</p></pre>  </td>
 </tr>
</table>


<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Testing </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: \
1px solid #b8b5a0">  <tr>
  <td>
   <pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">Unit tests still pass. I have verified (with some debug output in \
KFileItemModelLessThan::operator()) that different threads use different collator \
objects.

Benchmark results before/after this patch (with Qt, KF5 and Dolphin built in release \
mode) show that the performance is improved considerably:

Before:

********* Start testing of KFileItemModelBenchmark *********
Config: Using QtTest library 5.4.1, Qt 5.4.1 (x86_64-little_endian-lp64 shared \
(dynamic) release build; by GCC 4.8.3 20140627 [gcc-4_8-branch revision 212064]) PASS \
: KFileItemModelBenchmark::initTestCase() PASS   : \
KFileItemModelBenchmark::insertAndRemoveManyItems(all--n=100000) RESULT : \
KFileItemModelBenchmark::insertAndRemoveManyItems():&quot;all--n=100000&quot;:  2,575 \
msecs per iteration (total: 2,575, iterations: 1) PASS   : \
KFileItemModelBenchmark::insertAndRemoveManyItems(1st half + 2nd half--n=100000) \
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():&quot;1st half + 2nd \
half--n=100000&quot;:  2,596 msecs per iteration (total: 2,596, iterations: 1)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(2nd half + 1st \
half--n=100000) RESULT : \
KFileItemModelBenchmark::insertAndRemoveManyItems():&quot;2nd half + 1st \
half--n=100000&quot;:  2,594 msecs per iteration (total: 2,594, iterations: 1)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(even + odd--n=100000)
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():&quot;even + \
odd--n=100000&quot;:  2,609 msecs per iteration (total: 2,609, iterations: 1)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(all - 2nd half--n=100000)
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():&quot;all - 2nd \
half--n=100000&quot;:  2,626 msecs per iteration (total: 2,626, iterations: 1)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(all - 1st half--n=100000)
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():&quot;all - 1st \
half--n=100000&quot;:  2,610 msecs per iteration (total: 2,610, iterations: 1)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(all - odd--n=100000)
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():&quot;all - \
odd--n=100000&quot;:  2,636 msecs per iteration (total: 2,636, iterations: 1)
PASS   : KFileItemModelBenchmark::cleanupTestCase()
Totals: 9 passed, 0 failed, 0 skipped, 0 blacklisted
********* Finished testing of KFileItemModelBenchmark *********


After:

********* Start testing of KFileItemModelBenchmark *********
Config: Using QtTest library 5.4.1, Qt 5.4.1 (x86_64-little_endian-lp64 shared \
(dynamic) release build; by GCC 4.8.3 20140627 [gcc-4_8-branch revision 212064]) PASS \
: KFileItemModelBenchmark::initTestCase() PASS   : \
KFileItemModelBenchmark::insertAndRemoveManyItems(all--n=100000) RESULT : \
KFileItemModelBenchmark::insertAndRemoveManyItems():&quot;all--n=100000&quot;:  21 \
msecs per iteration (total: 87, iterations: 4) PASS   : \
KFileItemModelBenchmark::insertAndRemoveManyItems(1st half + 2nd half--n=100000) \
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():&quot;1st half + 2nd \
half--n=100000&quot;:  46 msecs per iteration (total: 92, iterations: 2)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(2nd half + 1st \
half--n=100000) RESULT : \
KFileItemModelBenchmark::insertAndRemoveManyItems():&quot;2nd half + 1st \
half--n=100000&quot;:  41 msecs per iteration (total: 83, iterations: 2)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(even + odd--n=100000)
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():&quot;even + \
odd--n=100000&quot;:  56.5 msecs per iteration (total: 113, iterations: 2)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(all - 2nd half--n=100000)
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():&quot;all - 2nd \
half--n=100000&quot;:  64 msecs per iteration (total: 64, iterations: 1)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(all - 1st half--n=100000)
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():&quot;all - 1st \
half--n=100000&quot;:  50.0 msecs per iteration (total: 100, iterations: 2)
PASS   : KFileItemModelBenchmark::insertAndRemoveManyItems(all - odd--n=100000)
RESULT : KFileItemModelBenchmark::insertAndRemoveManyItems():&quot;all - \
odd--n=100000&quot;:  68 msecs per iteration (total: 68, iterations: 1)
PASS   : KFileItemModelBenchmark::cleanupTestCase()
Totals: 9 passed, 0 failed, 0 skipped, 0 blacklisted
********* Finished testing of KFileItemModelBenchmark *********</pre>
  </td>
 </tr>
</table>


<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Diffs</b> </h1>
<ul style="margin-left: 3em; padding-left: 0;">

 <li>src/kitemviews/private/kfileitemmodelsortalgorithm.h <span style="color: \
grey">(50db990)</span></li>

 <li>src/tests/kfileitemmodelbenchmark.cpp <span style="color: \
grey">(3ff0405)</span></li>

</ul>

<p><a href="https://git.reviewboard.kde.org/r/123620/diff/" style="margin-left: \
3em;">View Diff</a></p>






  </td>
 </tr>
</table>



  </div>
 </body>
</html>


--===============1751863017328764487==--


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic