From kfm-devel Tue Oct 23 12:38:43 2012 From: Mark Date: Tue, 23 Oct 2012 12:38:43 +0000 To: kfm-devel Subject: Ideas for improving sorting. Goal: under 1 second for 200.000 files. Message-Id: X-MARC-Message: https://marc.info/?l=kfm-devel&m=135099596318315 Hi, Right now we have 2 bug reports for sorting and suggestions/patches on how to improve them. [1, 2] I tried a lot already to improve sorting and made it twice as fast in my local experiments [2]. The patch for that is certainly unacceptable and very very hacky! Yesterday i just sat down (had to since i was having a bus ride anyway :p) and just think about the actual issue and how it might be resolved. Doing that i came to a whole different conclusion then before. It's only theoretical at the moment, but i think this might actually work and greatly speed things up. 1. Splitting out sorting for folders and the rest ------------------- This one seemingly simple makes a noticeable dent in the time it takes to sort. The lessThan function [3] (line 1320) is called for the sorting algorithm in O(n log n). Optimizing anything in here will result in a faster sort. File and Folder sorting is split out in "specialized" lessThan functions (lessThanFolder and lessThanFiles) then you can already prevent the "if (m_sortDirsFirst || m_sortRole == SizeRole) {" construction. I've tried this and it speeds up the sortingquite a bit. It still remains slow though. 2. Copy KStringHandler::naturalCompare into dolphin for further in depth optimizations 2.1 Unicode cache for isNull, isPunct, isSpace and is isDigit. ------------------- KStrinHandler::naturalCompare is - by far - the biggest resource user. It needs improvements and thus far i simply didn't know how to do that. Yesterday i figured out a bunch of optimizations for this. What i;d do is make a structure (class): class CharUnicodeCache { public: bool isSpace; bool isPunct; bool isNull; bool isDigit; } Putting that in a list: QList m_chatUnicodeCache; Then in the naturalCompare it needs to fetch the cached values from the list rather then using the function calls. I'm not quite sure about this optimization since it's quite tricky and "might" not improve performance that much. Note: the key is the numeric representation of the hex value of the unicode character thus the lookup is indeed in constant time and should be very rapid. 2.2 Get rid of localeAwareCompare - kind of ------------------- localeAwareCompare is - by far - the biggest resource user in naturalCompare so one way of making it a lot faster is by getting rid of it all together :) Sadly that's not entirely possible, but we can use the stuff that Qt uses to make it faster and prevent string allocations. To follow this, here is the stack when localeAwareCompare is called: - localeAwareCompare - - localeAwareCompare_helper - - - ... system dependent stuff. if nothing is found: ucstrcmp Lets assume we go through ucstrcmp [4] - - - - ucstrncmp [5] The deepest functions for localeAwareCompare are ucstrcmp and ucstrncmp. We can simply copy those 2 functions and make "our own" localeAwareCompare that just passes around QChar pointers without allocating anything. Doing this is likely to greatly speedup the new "localeAwareCompare". Note: the naturalCompare has a comment that complains about insensitive comparison and having to do toLower to get around that. If we copy the Qt functions we might as well copy the case insensitive version [6] and [7] which gets rig of the toLower which is also quite high on the function call cost list. 2.3 Pass around QChar* and get rid of QString ------------------- Right now naturalCompare takes two QString arguments while it internally grabs a QChar* from the string. I'm not sure here, but i think you save some time if you prevent QString passing around and simply pass around QChar* then i imagine that you have a lot less unicode since the string arrays are already in unicode. That might be another (small?) speed improvement i guess. 3. Multithreaded sorting ------------------- I don't know if the current sorting algorithm is fit for multithreading usage, but the issue we have in sorting is CPU bound. If we add in threading it should speed up significantly. Right now i'm implementing 2.1 in a local branch and 2.2 is next on my list. I expect 2.2 to give the biggest speedboost. I will report my findings in this thread once i have something. It would be best if this sorting is greatly improved and added to Dolphin's codebase. The above suggestions do add quite a bit of code to dolphin, but it's all fairly self contained. I hope it's also low maintenance, but that remains to be seen :) If someone else has more suggestions/ideas on improving sorting, please do tell. Having sorting done under 1 second on my CPU (AMD 1100XT) is my goal for now. But i think it can be even faster then that. [1] https://bugs.kde.org/show_bug.cgi?id=306290 [2] https://bugs.kde.org/show_bug.cgi?id=307585 [3] http://quickgit.kde.org/index.php?p=kde-baseapps.git&a=blob&h=61f512a8e5505a110a351525ee6592e8b2093f96&hb=b4c01c00a91bc1a206c9577adca2462845699ff9&f=dolphin%2Fsrc%2Fkitemviews%2Fkfileitemmodel.cpp [4] http://qt.gitorious.org/qt/qtbase/blobs/master/src/corelib/tools/qstring.cpp#line202 [5] http://qt.gitorious.org/qt/qtbase/blobs/master/src/corelib/tools/qstring.cpp#line192 [6] http://qt.gitorious.org/qt/qtbase/blobs/master/src/corelib/tools/qstring.cpp#line128 [7] http://qt.gitorious.org/qt/qtbase/blobs/master/src/corelib/tools/qstring.cpp#line212