On Fri, Sep 20, 2013 at 12:32 AM, Frank Reininghaus <frank78ac@googlemail.com> wrote:
Hi,

2013/9/17 Frank Reininghaus:
> Hi Mark,
>
> I'm too tired for a longer reply already, but here is a quick idea
> that I just had and tried immediately.
>
> 2013/9/16 Mark:
>> Hi All,
>>
>> please read this as a brainstorm post with some ideas. I have no code
>> working in this area yet.
>>
>> Lazy loading is awesome, but compressing the data enough that you
>> still keep all details without lazy loading is even better, I think. For
>> example, doing a directory listing on any given folder gives at least the
>> following UDSEntry values:
>> - Device ID
>> - Inode
>> - User
>> - Group
>>
>> In the "experimental" case where you list the folder contents that has just
>> a bunch of files created by one user you can say that at least the above 4
>> properties are the same for all files. However, the experimental situation
>> certainly doesn't work on "real data" where you can have files from multiple
>> users and folders as well.
>
> Well, I guess that most people have quite a few folders where the
> "User" and "Group" are the same for all files ;-)
>
> The easiest "compression" that I could think of was to force the
> UDSEntries to implicitly share all their QStrings for one field if
> they are all equal:
>
> http://paste.kde.org/p52a24b49/
>
> Reduces the memory consumption in Details View for 100,000 items from
> 160.5 MB to 147 MB here. The patch is a bit hackish though, and it
> might have a small performance penalty. Maybe it's worth investigating
> that further though.

OK, so I did some performance measurements. Short version: my earlier
patch does make listing directories slightly slower, but I found
another way to hack UDSEntry that apparently does not make things
slower, and that reduces the memory usage by about 450 bytes per
listed file, without any loss of functionality.

Long version:

(a) First of all, I took the current master branches of kdelibs and
kde-baseapps (all built in release mode) and added some profiling
output:

http://paste.kde.org/p9b9b7b68/
http://paste.kde.org/p992af44c/

Then I opened a folder with 100,000 files in Dolphin in Icons View.

According to KSysGuard, the memory usage of the dolphin process is 200.6 MB.

For the performance measurements, I discarded the first run and went
out of the folder and back twice to get reproducible results (the time
spent per item in the first run fluctuated strongly).

The time that SlaveInterface::dispatch(int _cmd, const QByteArray
&rawdata) needs per item was between 2.6 and 2.8 microseconds.


(b) My first patch that only makes use of implicit sharing of the QStrings:

http://paste.kde.org/p52a24b49/

Memory usage drops to 187.9 MB, but the time spent per item increases
to 2.9 to 3.1 microseconds (not surprising, I think).


(c) My new patch, that makes more intrusive changes to the way the
data is stored inside UDSEntry (explanations below):

http://paste.kde.org/pc85e6b34/

Memory usage is now 156.8 MB, and the time per item 2.6 to 2.9 microseconds.


This means that we can save more than 450 bytes for each file, which
is IMHO a lot.



How does it work?

The basic observation is that UDSEntry currently stores all data in a
QHash<uint, Field>, and that a QHash is not really space-efficient
(see, e.g., http://doc.qt.digia.com/qq/qq19-containers.html to get an
idea of how much memory is needed to store the data inside a QHash).
However, it has the nice property that the "Field" that an "uint" is
mapped to can be accessed in O(1) time.

My idea to improve this and keep the O(1) access is the following: We
store all fields in UDSEntryPrivate in a simple

QVector<Field> fields

in the order in which the fields are read from the stream. We remember
which UDS field comes at which position and store this in a

QHash<uint, int> udsIndexes

(uint is the type of the "UDS field", and a QVector uses "int" for
indexing, but there are still a few signed/unsigned warnings with my
patch that need investigation).

Then we can access the "Field" that corresponds to a uint "uds" with

d->fields.at(d->udsIndexes.value(uds)).

Now the obvious question is: "How on earth can this be more efficient
than the current solution? There is still a QHash in each
UDSEntryPrivate after all, and the code gets more complicated!"

The trick is that the mapping from the "uint uds" to the index of the
corresponding "Field" is usually exactly the same for every UDSEntry
if we read many of them in a row, and we can make all UDSEntryPrivates
share this mapping (QHash is implicitly shared). This means that every
UDSEntryPrivate only has a private QVector<Field>, which only needs a
rather small overhead compared to the memory that the Fields
themselves need.

The patch definitely needs further testing and polishing. I probably
won't get much work done until next week though.

Best regards,
Frank

Hi Frank,

That is really stunning! Sadly, I don't have a proper test case to test it on frameworks.
So I made a very quick and rough test app that simply stores all entries in memory.

Doing that, I get these very impressive results:
PRE:
500,000 files: 236.208 MB

POST (Frank):
500,000 files: 118.148 MB

You can subtract another ~14 MB in both cases, which the app just uses for other stuff going on.

However, looking at your new patch, I can't help but notice that the hash is now only used as a lookup table. Perhaps you can just get rid of it and simply loop through the vector to find the right "Field" object? You would probably have to add a uint member to the Field struct in that case; otherwise you wouldn't know the field key. I guess it would take a bit more CPU time, since you would have to loop through the whole QVector to find a value. On the other hand, you only have to loop over about 8 items. Perhaps that is even faster than a hash lookup: even though the lookup is O(1), it still has to compute the hash and do some other work, as opposed to just iterating over a vector and comparing whether the uint key matches.

Btw, this stuff just shows again how freaking smart you are and how much I still have to learn about programming :) I guess it's a good thing that I ordered this book: "Data Structures and Algorithm Analysis in C++ (4th edition)": http://users.cis.fiu.edu/~weiss/