From kfm-devel Thu Sep 19 22:32:08 2013
From: Frank Reininghaus
Date: Thu, 19 Sep 2013 22:32:08 +0000
To: kfm-devel
Subject: Re: UDSEntry compression ideas
Message-Id:
X-MARC-Message: https://marc.info/?l=kfm-devel&m=137962995805432

Hi,

2013/9/17 Frank Reininghaus:
> Hi Mark,
>
> I'm too tired for a longer reply already, but here is a quick idea
> that I just had and tried immediately.
>
> 2013/9/16 Mark:
>> Hi All,
>>
>> please read this as a brainstorm post with some ideas. I have no
>> code working in this area yet.
>>
>> Lazy loading is awesome, but if you can compress the data enough
>> that you still have all the details without lazy loading, that is
>> even better, I think. For example, doing a directory listing on any
>> given folder gives at least the following UDSEntry values:
>> - Device ID
>> - Inode
>> - User
>> - Group
>>
>> In the "experimental" case where you list the contents of a folder
>> that has just a bunch of files created by one user, you can say
>> that at least the above 4 properties are the same for all files.
>> However, the experimental situation certainly doesn't hold for
>> "real data", where you can have files from multiple users, and
>> folders as well.
>
> Well, I guess that most people have quite a few folders where the
> "User" and "Group" are the same for all files ;-)
>
> The easiest "compression" that I could think of was to force the
> UDSEntries to implicitly share all their QStrings for one field if
> they are all equal:
>
> http://paste.kde.org/p52a24b49/
>
> Reduces the memory consumption in Details View for 100,000 items
> from 160.5 MB to 147 MB here. The patch is a bit hackish though, and
> it might have a small performance penalty. Maybe it's worth
> investigating further though.

OK, so I did some performance measurements.

Short version: my earlier patch does make listing directories slightly
slower, but I found another way to hack UDSEntry that apparently does
not make things slower, and that reduces the memory usage by about 450
bytes per listed file, without any loss of functionality.

Long version:

(a) First of all, I took the current master branches of kdelibs and
kde-baseapps (all built in release mode) and added some profiling
output:

http://paste.kde.org/p9b9b7b68/
http://paste.kde.org/p992af44c/

Then I opened a folder with 100,000 files in Dolphin in Icons View.
According to KSysGuard, the memory usage of the dolphin process is
200.6 MB.

For the performance measurements, I discarded the first run and went
out of the folder and back twice to get reproducible results (the time
spent per item in the first run fluctuated strongly). The time that
SlaveInterface::dispatch(int _cmd, const QByteArray &rawdata) needs
per item was between 2.6 and 2.8 microseconds.

(b) My first patch, which only makes use of implicit sharing of the
QStrings:

http://paste.kde.org/p52a24b49/

Memory usage drops to 187.9 MB, but the time spent per item increases
to 2.9 to 3.1 microseconds (not surprising, I think).

(c) My new patch, which makes more intrusive changes to the way the
data is stored inside UDSEntry (explanations below):

http://paste.kde.org/pc85e6b34/

Memory usage is now 156.8 MB, and the time per item is 2.6 to 2.9
microseconds. This means that we can save more than 450 bytes for each
file (the 43.8 MB saved compared to (a), spread over 100,000 files),
which is IMHO a lot.

How does it work?

The basic observation is that UDSEntry currently stores all data in a
QHash<uint, Field>, and that a QHash is not really space-efficient
(see, e.g., http://doc.qt.digia.com/qq/qq19-containers.html to get an
idea of how much memory is needed to store the data inside a QHash).
However, it has the nice property that the "Field" that a "uint" is
mapped to can be accessed in O(1) time.

My idea to improve this and keep the O(1) access is the following: we
store all Fields in UDSEntryPrivate in a simple QVector<Field> fields,
in the order in which they are read from the stream. We remember which
UDS field comes at which position and store this in a
QHash<uint, int> udsIndexes (uint is the type of the "UDS field", and
a QVector uses "int" for indexing, but there are still a few
signed/unsigned warnings with my patch which need investigation). Then
we can access the "Field" that corresponds to a uint "uds" with
d->fields.at(d->udsIndexes.value(uds)).

Now the obvious question is: "How on earth can this be more efficient
than the current solution? There is still a QHash in each
UDSEntryPrivate after all, and the code gets more complicated!"

The trick is that the mapping from the "uint uds" to the index of the
corresponding "Field" is usually exactly the same for every UDSEntry
if we read many of them in a row, and we can make all UDSEntryPrivates
share this mapping (QHash is implicitly shared). This means that every
UDSEntryPrivate only has a private QVector, which adds only a rather
small overhead compared to the memory that the Fields themselves need.
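To make the description a bit more concrete, here is a rough sketch of
the idea in code. The names and the exact structure are invented for
illustration, and the sharing step shown at the end is only one
possible way to set it up; the actual changes are in the paste above:

#include <QHash>
#include <QString>
#include <QVector>

// One UDS field value. The real Field type in kdelibs stores a string
// and a number; only the members relevant for the layout are shown.
struct Field {
    QString m_str;
    long long m_long;
};

class UDSEntryPrivate
{
public:
    // All Fields, in the order in which they were read from the stream.
    QVector<Field> fields;

    // Maps the uint "UDS field" to the index of its Field in 'fields'.
    // QHash is implicitly shared, so all entries with the same field
    // layout can point to a single copy of this mapping.
    QHash<uint, int> udsIndexes;

    // Access stays O(1), just like with the old QHash<uint, Field>:
    // one hash lookup plus one vector access.
    const Field &field(uint uds) const
    {
        return fields.at(udsIndexes.value(uds));
    }
};

// One possible way to establish the sharing while deserializing a
// batch of entries: if the new entry's mapping equals the previous
// one, replace it by a shallow copy so that both share the same data.
void shareIndexes(UDSEntryPrivate *entry, QHash<uint, int> &lastIndexes)
{
    if (entry->udsIndexes == lastIndexes) {
        entry->udsIndexes = lastIndexes; // shallow copy, shared data
    } else {
        lastIndexes = entry->udsIndexes; // new layout becomes the reference
    }
}

With this, each UDSEntry only pays for its own QVector plus the Fields
themselves, while the QHash exists once per batch of entries that
share a layout.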
The patch definitely needs further testing and polishing. I probably
won't get much work done until next week though.

Best regards,
Frank
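PS: In case it is unclear what "force the UDSEntries to implicitly
share all their QStrings" in (b) means, the trick boils down to
something like the following (an illustrative sketch with an invented
helper, not the actual patch):

#include <QString>

// If the string that was just read from the stream is equal to the one
// at the same position in the previous entry, drop our copy and point
// to the previous entry's data instead (QString is implicitly shared).
void shareIfEqual(QString &value, QString &previous)
{
    if (value == previous) {
        value = previous;   // both now share one allocation
    } else {
        previous = value;   // remember this value for the next entry
    }
}

This way, a field like "User" that has the same value for every file
in a directory is stored only once in memory instead of 100,000 times.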