From kde-core-devel Fri Aug 02 13:49:18 2013 From: Mark Date: Fri, 02 Aug 2013 13:49:18 +0000 To: kde-core-devel Subject: Re: KDirWatch bug and the analysis. Help is welcome! Message-Id: X-MARC-Message: https://marc.info/?l=kde-core-devel&m=137545148631772 On Fri, Aug 2, 2013 at 3:41 PM, David Faure wrote: > On Friday 02 August 2013 15:07:19 Mark wrote: >> Just for my view on this. What's the difference between stat() and >> kio::stat()? > > Err, you know the answer is in the docs, right? > stat() = unix-only, local files only, sync > KIO::stat() = portable, network-transparent, async > >> >> Or follow your initial approach and not let KDirWatch do these stat >> >> calls for new files and just re-list. >> > >> > You're confusing two layers. KDirLister[Cache] is the one doing the re- >> > listing. But that's one layer above KDirWatch (which has other users than >> > KDirLister). >> >> I actually wrote it down poorly. KDirWatch is sending a signal on >> which KDirLister responds and re-lists the whole folder. > > Right. Which means the re-listing is unrelated to what KDirWatch has to do > before emitting the signal. > >> Funny thing is that KDirWatch already has the proper API for this. It >> just doesn't seem to be used. >> >> It has the signals: >> created(QString) >> deleted(QString) >> dirty(QString) > > I know that. > >> where dirty is a bit abused i guess ;) > > Yes and no. As I said, it's all we know when using other backends than > inotify. > >> > It used to be. Then it moved down to kdecore because, well, it's not only >> > useful in relation to KIO (the network transparency technology). >> > Just like QFileSystemWatcher is also in QtCore. >> >> So where is it going to end up in KF5? > > KCoreAddons. (Doesn't really matter for this discussion, does it?) > > Well, unless you want to work on QFileSystemWatcher, which would be even > better in the long run. But you would start from even further there, it > doesn't even have created/deleted signals (although I thought I saw someone > starting to work on that a year ago or so... clearly that wasn't completed). > >> + a little idea. While monitoring the inotify events i noticed that >> you can have quite a bit of stat overhead just by updating a file in a >> way that dolphin is doing it. The .directory. It is probably the >> safest way with minimal risk of data loss. >> So now i was thinking, if i try to improve this code to actually emit >> the create and delete signals as well then in this case you will end >> up with some needless signals. Lets take my initial example, these are >> send from inotify: >> >> (CREATE signal) /home/mark/massive_folder/.directory.lock >> (MODIFY signal) /home/mark/massive_folder/.directory.lock >> (CREATE signal) /home/mark/massive_folder/.directoryc13357.new >> (MODIFY signal) /home/mark/massive_folder/.directoryc13357.new >> (DELETE signal) /home/mark/massive_folder/.directoryc13357.new >> (CREATE signal) /home/mark/massive_folder/.directory >> (DELETE signal) /home/mark/massive_folder/.directory.lock > > Right. Which shows that inotify has very low-level notifications, while > ideally we'd like KDirWatch to only say created(".directory") as a result of > all this... > >> in a fixed KDirWatch that would become: >> >> (CREATED signal) /home/mark/massive_folder/.directory.lock >> (DIRTY signal) /home/mark/massive_folder/.directory.lock >> (CREATED signal) /home/mark/massive_folder/.directoryc13357.new >> (DIRTY signal) /home/mark/massive_folder/.directoryc13357.new >> (DELETED signal) /home/mark/massive_folder/.directoryc13357.new >> (CREATED signal) /home/mark/massive_folder/.directory >> (DELETED signal) /home/mark/massive_folder/.directory.lock > > We don't want that. See below. > >> That's just in a few miliseconds while the end result is a dirty file: >> /home/mark/massive_folder/.directory >> >> The temporary file and the lock file both get deleted and where not >> there in the first place so there is no need to send any signal about >> those files. >> >> That makes me wonder if we could prevent some signals to be send by >> delaying the signal sending for - lets say - 100ms. If during that >> time another inotify event comes by that makes it pointless to send a >> created signal (for example an inotify that tells us the file is >> deleted again, like the lock file) then we simply don't send any >> signal for that file. This would be somewhat the same like the >> batching i implemented for listDir. It would be sending >> created/deleted/dirty signals if needed for every 100ms as long as >> inotify events stream in. > > That's already a bit like what KDirWatch slotRescan is doing - it's called > after a timer, in order to compress some events. > > Note that this is also why there are additional stat() calls. If I remember > the KDirWatch code correctly, we basically interpret the inotify signals as > "check this path in 500ms" and 500ms later we use stat() to check if the file > is still there, and if anything about it changed, etc. So I think this is > already there to some extent, please look into the signals that are actually > emitted by KDirWatch. > > I agree that we don't want a one-for-one mapping with the inotify events, but > that's not what we do anyway. > >> This will obviously not prevent every case. If a file got added or >> removed just after the 100ms trigger then you'd end up with the same >> signals as in the above example. But it has the potential to prevent >> signals from being send for small file changes. > > This isn't about a recurring timer, but a single-shot timer after the first > event. So as long as the overall operation takes less 100ms the compression > will happen. > > -- > David Faure, faure@kde.org, http://www.davidfaure.fr > Working on KDE, in particular KDE Frameworks 5 > Just responding after all of the above. Thank you very much for all your input, you've helped me tremendously here :) Next up, i'il make a patch (with the unittests still working!) and report back or with a followup discussion if i encounter unforeseen issues (not unlikely in this area).