From kfm-devel Thu Sep 19 22:32:08 2013
From: Frank Reininghaus
Date: Thu, 19 Sep 2013 22:32:08 +0000
To: kfm-devel
Subject: Re: UDSEntry compression ideas
Message-Id:
X-MARC-Message: https://marc.info/?l=kfm-devel&m=137962995805432

Hi,

2013/9/17 Frank Reininghaus:
> Hi Mark,
>
> I'm too tired for a longer reply already, but here is a quick idea
> that I just had and tried immediately.
>
> 2013/9/16 Mark:
>> Hi All,
>>
>> please read this as a brainstorm post with some ideas. I have no
>> code working in this area yet.
>>
>> Lazy loading is awesome, but if you can compress the data enough
>> that you still have all the details without lazy loading, that is
>> even better, I think. For example, doing a directory listing on any
>> given folder gives at least the following UDSEntry values:
>> - Device ID
>> - Inode
>> - User
>> - Group
>>
>> In the "experimental" case where you list the contents of a folder
>> that has just a bunch of files created by one user, you can say
>> that at least the above 4 properties are the same for all files.
>> However, the experimental situation certainly doesn't hold for
>> "real data", where you can have files from multiple users, and
>> folders as well.
>
> Well, I guess that most people have quite a few folders where the
> "User" and "Group" are the same for all files ;-)
>
> The easiest "compression" that I could think of was to force the
> UDSEntries to implicitly share all their QStrings for one field if
> they are all equal:
>
> http://paste.kde.org/p52a24b49/
>
> Reduces the memory consumption in Details View for 100,000 items
> from 160.5 MB to 147 MB here. The patch is a bit hackish though, and
> it might have a small performance penalty. Maybe it's worth
> investigating further though.

OK, so I did some performance measurements.

Short version: my earlier patch does make listing directories slightly
slower, but I found another way to hack UDSEntry that apparently does
not make things slower, and that reduces the memory usage by about 450
bytes per listed file, without any loss of functionality.

Long version:

(a) First of all, I took the current master branches of kdelibs and
kde-baseapps (all built in release mode) and added some profiling
output:

http://paste.kde.org/p9b9b7b68/
http://paste.kde.org/p992af44c/

Then I opened a folder with 100,000 files in Dolphin in Icons View.
According to KSysGuard, the memory usage of the dolphin process is
200.6 MB.

For the performance measurements, I discarded the first run and went
out of the folder and back twice to get reproducible results (the time
spent per item in the first run fluctuated strongly). The time that
SlaveInterface::dispatch(int _cmd, const QByteArray &rawdata) needs
per item was between 2.6 and 2.8 microseconds.

(b) My first patch, which only makes use of implicit sharing of the
QStrings:

http://paste.kde.org/p52a24b49/

Memory usage drops to 187.9 MB, but the time spent per item increases
to 2.9 to 3.1 microseconds (not surprising, I think).

(c) My new patch, which makes more intrusive changes to the way the
data is stored inside UDSEntry (explanations below):

http://paste.kde.org/pc85e6b34/

Memory usage is now 156.8 MB, and the time per item is 2.6 to 2.9
microseconds. This means that we can save more than 450 bytes for each
file (the 43.8 MB saved compared to (a), spread over 100,000 files),
which is IMHO a lot.

How does it work?

The basic observation is that UDSEntry currently stores all data in a
QHash<uint, Field>, and that a QHash is not really space-efficient
(see, e.g., http://doc.qt.digia.com/qq/qq19-containers.html to get an
idea of how much memory is needed to store the data inside a QHash).
However, it has the nice property that the "Field" that a "uint" is
mapped to can be accessed in O(1) time.

My idea to improve this and keep the O(1) access is the following: we
store all Fields in UDSEntryPrivate in a simple QVector<Field> fields,
in the order in which they are read from the stream. We remember which
UDS field comes at which position and store this in a
QHash<uint, int> udsIndexes (uint is the type of the "UDS field", and
a QVector uses "int" for indexing, but there are still a few
signed/unsigned warnings with my patch which need investigation). Then
we can access the "Field" that corresponds to a uint "uds" with
d->fields.at(d->udsIndexes.value(uds)).

Now the obvious question is: "How on earth can this be more efficient
than the current solution? There is still a QHash in each
UDSEntryPrivate after all, and the code gets more complicated!"

The trick is that the mapping from the "uint uds" to the index of the
corresponding "Field" is usually exactly the same for every UDSEntry
if we read many of them in a row, and we can make all UDSEntryPrivates
share this mapping (QHash is implicitly shared). This means that every
UDSEntryPrivate only has a private QVector, which adds only a rather
small overhead compared to the memory that the Fields themselves need.
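To make the description a bit more concrete, here is a rough sketch of
the idea in code. The names and the exact structure are invented for
illustration, and the sharing step shown at the end is only one
possible way to set it up; the actual changes are in the paste above:

#include <QHash>
#include <QString>
#include <QVector>

// One UDS field value. The real Field type in kdelibs stores a string
// and a number; only the members relevant for the layout are shown.
struct Field {
    QString m_str;
    long long m_long;
};

class UDSEntryPrivate
{
public:
    // All Fields, in the order in which they were read from the stream.
    QVector<Field> fields;

    // Maps the uint "UDS field" to the index of its Field in 'fields'.
    // QHash is implicitly shared, so all entries with the same field
    // layout can point to a single copy of this mapping.
    QHash<uint, int> udsIndexes;

    // Access stays O(1), just like with the old QHash<uint, Field>:
    // one hash lookup plus one vector access.
    const Field &field(uint uds) const
    {
        return fields.at(udsIndexes.value(uds));
    }
};

// One possible way to establish the sharing while deserializing a
// batch of entries: if the new entry's mapping equals the previous
// one, replace it by a shallow copy so that both share the same data.
void shareIndexes(UDSEntryPrivate *entry, QHash<uint, int> &lastIndexes)
{
    if (entry->udsIndexes == lastIndexes) {
        entry->udsIndexes = lastIndexes; // shallow copy, shared data
    } else {
        lastIndexes = entry->udsIndexes; // new layout becomes the reference
    }
}

With this, each UDSEntry only pays for its own QVector plus the Fields
themselves, while the QHash exists once per batch of entries that
share a layout.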
The patch definitely needs further testing and polishing. I probably
won't get much work done until next week though.

Best regards,
Frank
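PS: In case it is unclear what "force the UDSEntries to implicitly
share all their QStrings" in (b) means, the trick boils down to
something like the following (an illustrative sketch with an invented
helper, not the actual patch):

#include <QString>

// If the string that was just read from the stream is equal to the one
// at the same position in the previous entry, drop our copy and point
// to the previous entry's data instead (QString is implicitly shared).
void shareIfEqual(QString &value, QString &previous)
{
    if (value == previous) {
        value = previous;   // both now share one allocation
    } else {
        previous = value;   // remember this value for the next entry
    }
}

This way, a field like "User" that has the same value for every file
in a directory is stored only once in memory instead of 100,000 times.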