[prev in list] [next in list] [prev in thread] [next in thread]
List: kde-frameworks-devel
Subject: Re: Review Request 115739: Make UDSEntry a Q_MOVABLE type, and add some benchmarks and tests
From: "Frank Reininghaus" <frank78ac () googlemail ! com>
Date: 2014-02-16 8:31:26
Message-ID: 20140216083126.25204.34499 () probe ! kde ! org
[Download RAW message or body]
[Attachment #2 (multipart/alternative)]
> On Feb. 15, 2014, 8:28 p.m., David Faure wrote:
> > Yeah the description is wrong. Making something Q_MOVABLE means that inserting into the list, or the \
> > copying that happens when reallocating for more space, will be able to memmove() instead of \
> > copy-constructing items. This doesn't change the actual memory layout (which only depends on the size \
> > of the items).
> > Basically, as long as UDSEntry doesn't have a "q" pointer in its private class (which it doesn't), \
> > it's movable. So marking it as movable is correct, but not with this commit message.
> > As to a difference in performance, I would be surprised if it was measurable. The copy ctor for \
> > UDSEntry plays with a refcount.
"Yeah the description is wrong. Making something Q_MOVABLE means that inserting into the list, or the \
copying that happens when reallocating for more space, will be able to memmove() instead of \
copy-constructing items. This doesn't change the actual memory layout (which only depends on the size of \
the items)"
I am afraid this is wrong. A QList<T> is essentially a small wrapper around a QList<void*>, which \
*always* uses memmove() to move around its items. If sizeof(T) <= sizeof(void*), and it's known that \
using memmove() is safe for T (e.g. because it's POD or Qt itself declares it as Q_MOVABLE), then QList \
just reinterprets each void* as an item of type T.
If that is not the case, then QList<T> will effectively become a QList<T*>, and allocate memory for each \
item individually on the heap.
So yes, making something Q_MOVABLE definitely changes the actual memory layout.
Anyone who does not believe me is encouraged to have a quick look at these excellent posts by Marc Mutz:
http://marcmutz.wordpress.com/2010/07/29/sneak-preview-qlist-considered-harmful/
http://marcmutz.wordpress.com/effective-qt/containers/
or look at the source,
http://code.woboq.org/qt5/qtbase/src/corelib/tools/qlist.h.html
Note that void QList<T>::append(const T &t) calls the function
void QList<T>::node_construct(Node *n, const T &t)
which does
n->v = new T(t)
if QTypeInfo<T>::isLarge (which means that T is larger than a void*) or QTypeInfo<T>::isStatic (which \
means that T has not been declared as Q_MOVABLE). This is all explained in Marc's second post.
"As to a difference in performance, I would be surprised if it was measurable. The copy ctor for UDSEntry \
plays with a refcount."
My point is *not* that this change saves us the refcount updates when items are moved around.
What is saves is that a small block (including the overhead that the memory allocator adds to each \
allocatin, usually 32 bytes) of memory is allocated/deallocated every time an item is added to/removed \
from the list.
This might still not affect the performance noticeably in many cases, but allocating memory is a lot more \
expensive than just increasing a refcount.
- Frank
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/115739/#review49873
-----------------------------------------------------------
On Feb. 13, 2014, 8:23 p.m., Frank Reininghaus wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/115739/
> -----------------------------------------------------------
>
> (Updated Feb. 13, 2014, 8:23 p.m.)
>
>
> Review request for KDE Frameworks and David Faure.
>
>
> Repository: kio
>
>
> Description
> -------
>
> I'm continuing my efforts to make UDSEntry more efficient, which were started in \
> https://git.reviewboard.kde.org/r/113591/. This is the second step, and I'll probably do more in the \
> future, for which I will split https://git.reviewboard.kde.org/r/113355/ into independent parts.
> This patch contains the following changes (which are separate commits):
>
> 1. Make UDSEntry a Q_MOVABLE type. This has the effect that a QList<UDSEntry> a.k.a. UDSEntryList will \
> store the actual entries in a single allocated block of memory, and not pointers to UDSEntries which \
> are allocated individually on the heap (this means that this change is binary incompatible). This \
> reduces the memory usage by 32 bytes per UDSEntry in a QList because each memory allocation uses at \
> least 32 bytes on a 64-bit system.
> This idea is similar to what I proposed for KFileItem in https://git.reviewboard.kde.org/r/111789/. \
> That one had to be reverted later though because it turned out that KDirLister does rely on \
> QList<KFileItem> storing only pointers to the KFileItems. I'm confident that no such trouble will \
> happen for UDSEntry - all KIO unit tests still pass.
> One could argue that we could simply use a UDSEntryVector instead of a UDSEntyList to achieve the same \
> memory savings, but considering that the list is used quite frequently ( \
> http://lxr.kde.org/ident?i=UDSEntryList ), this might require a lot of porting work and cause other \
> unexpected problems.
> Note that I could not make the Q_DECLARE_TYPEINFO declaration work if it was inside the KIO namespace, \
> but I still preferred to do it immediately after the class declaration, so I had to interrupt the \
> namespace briefly.
> 2. Add some benchmarks to measure how long typical UDSEntry operations take: add fields to an entry, \
> read fields from an entry, store a UDSEntryList in a QByteArray, and read it back from the QByteArray.
> All measurements are done both for UDSEntries with 8 fields (this is a rather typical use case because \
> kio_file usually creates 8 fields), and for entries with the maximum number of fields.
> 3. Add a simple manual test ('listjobtest') that runs a KIO::ListJob for a given URL. This can be used \
> to easily measure the memory usage of the UDSEntryList that contains an entry for each file at that \
> URL.
>
> Diffs
> -----
>
> src/core/udsentry.h 9550a7e
> tests/CMakeLists.txt dbca6a5
> tests/listjobtest.cpp PRE-CREATION
> tests/udsentrybenchmark.cpp PRE-CREATION
>
> Diff: https://git.reviewboard.kde.org/r/115739/diff/
>
>
> Testing
> -------
>
> 1. KIO unit tests still pass.
>
> 2. I've confirmed with the new 'listjobtest' and KSysGuard that the memory usage is really lowered by \
> 32 bytes per item in a UDSEntryList.
> 3. The benchmark results do not change significantly. I only tested it with a debug build (i.e., with \
> optimizations disabled) though, and I'm afraid I might be lacking the resources to set up an additional \
> build of Qt5 and all of KF5 in release mode. However, since UDSEntry essentially only depends on \
> qtbase, I might be able to just do a release build of qtbase and build a stand-alone version of \
> UDSEntry+benchmarks on top of that. I'll look into this option for getting reliable benchmark results \
> when I work on further improvements in the future.
>
> Thanks,
>
> Frank Reininghaus
>
>
[Attachment #5 (text/html)]
<html>
<body>
<div style="font-family: Verdana, Arial, Helvetica, Sans-Serif;">
<table bgcolor="#f9f3c9" width="100%" cellpadding="8" style="border: 1px #c9c399 solid;">
<tr>
<td>
This is an automatically generated e-mail. To reply, visit:
<a href="https://git.reviewboard.kde.org/r/115739/">https://git.reviewboard.kde.org/r/115739/</a>
</td>
</tr>
</table>
<br />
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<p style="margin-top: 0;">On February 15th, 2014, 8:28 p.m. UTC, <b>David Faure</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: 10px;">
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: \
-o-pre-wrap; word-wrap: break-word;">Yeah the description is wrong. Making something Q_MOVABLE means that \
inserting into the list, or the copying that happens when reallocating for more space, will be able to \
memmove() instead of copy-constructing items. This doesn't change the actual memory layout (which \
only depends on the size of the items).
Basically, as long as UDSEntry doesn't have a "q" pointer in its private class (which it \
doesn't), it's movable. So marking it as movable is correct, but not with this commit message.
As to a difference in performance, I would be surprised if it was measurable. The copy ctor for UDSEntry \
plays with a refcount.</pre> </blockquote>
</blockquote>
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: -pre-wrap; white-space: \
-o-pre-wrap; word-wrap: break-word;">"Yeah the description is wrong. Making something Q_MOVABLE \
means that inserting into the list, or the copying that happens when reallocating for more space, will be \
able to memmove() instead of copy-constructing items. This doesn't change the actual memory layout \
(which only depends on the size of the items)"
I am afraid this is wrong. A QList<T> is essentially a small wrapper around a QList<void*>, \
which *always* uses memmove() to move around its items. If sizeof(T) <= sizeof(void*), and it's \
known that using memmove() is safe for T (e.g. because it's POD or Qt itself declares it as \
Q_MOVABLE), then QList just reinterprets each void* as an item of type T.
If that is not the case, then QList<T> will effectively become a QList<T*>, and allocate \
memory for each item individually on the heap.
So yes, making something Q_MOVABLE definitely changes the actual memory layout.
Anyone who does not believe me is encouraged to have a quick look at these excellent posts by Marc Mutz:
http://marcmutz.wordpress.com/2010/07/29/sneak-preview-qlist-considered-harmful/
http://marcmutz.wordpress.com/effective-qt/containers/
or look at the source,
http://code.woboq.org/qt5/qtbase/src/corelib/tools/qlist.h.html
Note that void QList<T>::append(const T &t) calls the function
void QList<T>::node_construct(Node *n, const T &t)
which does
n->v = new T(t)
if QTypeInfo<T>::isLarge (which means that T is larger than a void*) or \
QTypeInfo<T>::isStatic (which means that T has not been declared as Q_MOVABLE). This is all \
explained in Marc's second post.
"As to a difference in performance, I would be surprised if it was measurable. The copy ctor for \
UDSEntry plays with a refcount."
My point is *not* that this change saves us the refcount updates when items are moved around.
What is saves is that a small block (including the overhead that the memory allocator adds to each \
allocatin, usually 32 bytes) of memory is allocated/deallocated every time an item is added to/removed \
from the list.
This might still not affect the performance noticeably in many cases, but allocating memory is a lot more \
expensive than just increasing a refcount.</pre> <br />
<p>- Frank</p>
<br />
<p>On February 13th, 2014, 8:23 p.m. UTC, Frank Reininghaus wrote:</p>
<table bgcolor="#fefadf" width="100%" cellspacing="0" cellpadding="8" style="background-image: \
url('https://git.reviewboard.kde.org/static/rb/images/review_request_box_top_bg.ab6f3b1072c9.png'); \
background-position: left top; background-repeat: repeat-x; border: 1px black solid;"> <tr>
<td>
<div>Review request for KDE Frameworks and David Faure.</div>
<div>By Frank Reininghaus.</div>
<p style="color: grey;"><i>Updated Feb. 13, 2014, 8:23 p.m.</i></p>
<div style="margin-top: 1.5em;">
<b style="color: #575012; font-size: 10pt;">Repository: </b>
kio
</div>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Description </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid \
#b8b5a0"> <tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">I'm continuing my efforts to make \
UDSEntry more efficient, which were started in https://git.reviewboard.kde.org/r/113591/. This is the \
second step, and I'll probably do more in the future, for which I will split \
https://git.reviewboard.kde.org/r/113355/ into independent parts.
This patch contains the following changes (which are separate commits):
1. Make UDSEntry a Q_MOVABLE type. This has the effect that a QList<UDSEntry> a.k.a. UDSEntryList \
will store the actual entries in a single allocated block of memory, and not pointers to UDSEntries which \
are allocated individually on the heap (this means that this change is binary incompatible). This reduces \
the memory usage by 32 bytes per UDSEntry in a QList because each memory allocation uses at least 32 \
bytes on a 64-bit system.
This idea is similar to what I proposed for KFileItem in https://git.reviewboard.kde.org/r/111789/. That \
one had to be reverted later though because it turned out that KDirLister does rely on \
QList<KFileItem> storing only pointers to the KFileItems. I'm confident that no such trouble \
will happen for UDSEntry - all KIO unit tests still pass.
One could argue that we could simply use a UDSEntryVector instead of a UDSEntyList to achieve the same \
memory savings, but considering that the list is used quite frequently ( \
http://lxr.kde.org/ident?i=UDSEntryList ), this might require a lot of porting work and cause other \
unexpected problems.
Note that I could not make the Q_DECLARE_TYPEINFO declaration work if it was inside the KIO namespace, \
but I still preferred to do it immediately after the class declaration, so I had to interrupt the \
namespace briefly.
2. Add some benchmarks to measure how long typical UDSEntry operations take: add fields to an entry, read \
fields from an entry, store a UDSEntryList in a QByteArray, and read it back from the QByteArray.
All measurements are done both for UDSEntries with 8 fields (this is a rather typical use case because \
kio_file usually creates 8 fields), and for entries with the maximum number of fields.
3. Add a simple manual test ('listjobtest') that runs a KIO::ListJob for a given URL. This can be \
used to easily measure the memory usage of the UDSEntryList that contains an entry for each file at that \
URL. </pre>
</td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Testing </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: 1px solid #b8b5a0">
<tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">1. KIO unit tests still pass.
2. I've confirmed with the new 'listjobtest' and KSysGuard that the memory usage is really \
lowered by 32 bytes per item in a UDSEntryList.
3. The benchmark results do not change significantly. I only tested it with a debug build (i.e., with \
optimizations disabled) though, and I'm afraid I might be lacking the resources to set up an \
additional build of Qt5 and all of KF5 in release mode. However, since UDSEntry essentially only depends \
on qtbase, I might be able to just do a release build of qtbase and build a stand-alone version of \
UDSEntry+benchmarks on top of that. I'll look into this option for getting reliable benchmark results \
when I work on further improvements in the future.</pre> </td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Diffs</b> </h1>
<ul style="margin-left: 3em; padding-left: 0;">
<li>src/core/udsentry.h <span style="color: grey">(9550a7e)</span></li>
<li>tests/CMakeLists.txt <span style="color: grey">(dbca6a5)</span></li>
<li>tests/listjobtest.cpp <span style="color: grey">(PRE-CREATION)</span></li>
<li>tests/udsentrybenchmark.cpp <span style="color: grey">(PRE-CREATION)</span></li>
</ul>
<p><a href="https://git.reviewboard.kde.org/r/115739/diff/" style="margin-left: 3em;">View Diff</a></p>
</td>
</tr>
</table>
</div>
</body>
</html>
_______________________________________________
Kde-frameworks-devel mailing list
Kde-frameworks-devel@kde.org
https://mail.kde.org/mailman/listinfo/kde-frameworks-devel
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic