'Re: [Kde-pim] Nepomukfeeder updates almost ready'


List:       kde-pim
Subject:    Re: [Kde-pim] Nepomukfeeder updates almost ready
From:       Àlex_Fiestas <afiestas () kde ! org>
Date:       2012-12-28 9:22:08
Message-ID: CAJVyKoHuG0RXGxVB7S3ge3Wj+03aJNzMKJM6xDW3gr6R=m5O-Q () mail ! gmail ! com

I'm all for putting developers' decisions above release cycles, so if the
maintainer thinks that the new code is better, go with it and be ready to
fix possible bugs fast :p

Cheers
On Dec 27, 2012 10:06 PM, "Allen Winter" <winter@kde.org> wrote:

> On Wednesday 26 December 2012 04:44:53 PM Christian Mollekopf wrote:
> > Hey,
> >
> > I made another bunch of fixes, turned the finding of skipped items into a
> > recurring task, and turn the change-recorder off now if the feeder is
> > disabled entirely. In my testing so far this system behaves much better
> > than what we used to have.
> >
> > I plan on committing this to 4.10 if no one objects within the next few
> > days (I'll write a mail to release-team first).
> >
> > The code is here:
> > http://quickgit.kde.org/?p=clones%2Fkdepim-runtime%2Fcmollekopf%2FpimRuntimeClone.git&a=shortlog&h=c2ca91566953c57af119634f65b5bd73bac7e7fa
> >
> > Cheers,
> > Christian
> >
> >
> > On Sunday 23 December 2012 17.54:18 Christian Mollekopf wrote:
> > > Heya,
> > >
> > > To cut right to the chase: I revamped the feeders a bit, think it's much
> > > better than what we had before, and would like to get it into 4.10. So
> > > feel free to skip this if you don't care.
> > >
> > > I moved to a recurring, query-based approach for the initial-indexing.
> > > That means that instead of doing a single initial-indexing when the
> > > feeder is executed the first time and relying purely on updates from the
> > > change-recorder afterwards, the initial-indexing is now more of a
> > > maintenance task (which currently runs on every start) that queries for
> > > all not-yet-indexed items.
> > >
> > > That is necessary, as the initial assumption that we can index items
> > > faster than notifications come in didn't hold true, which resulted in the
> > > feeder regularly being overloaded with stuff to index.
> > >
> > > The initial query approach resulted in n queries for n items, which is
> > > way too slow to be feasible for all items (it is taking ages, literally).
> > > The only alternative approach I found is this: we run two queries, one in
> > > akonadi and one in nepomuk, each querying for *all* available items.
> > > Comparing the two lists yields the list of items which have not been
> > > indexed yet. Of course, that misses any changes to items which were
> > > indexed before but have been modified since then, so it's not ideal
> > > either.
> > > These queries are fairly efficient, as they result in a single SQL query
> > > per db (as opposed to n), although with a huge result set. I could query
> > > my db of ~100'000 items in ~20s (i7 processor).
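
The list comparison described above amounts to a set difference. A minimal
Python sketch, assuming each store can return the IDs of all its items in a
single query; `fetch_akonadi_ids` and `fetch_indexed_ids` are hypothetical
stand-ins, not real Akonadi or Nepomuk calls:

```python
from typing import Callable, Iterable, Set


def find_unindexed(
    fetch_akonadi_ids: Callable[[], Iterable[int]],
    fetch_indexed_ids: Callable[[], Iterable[int]],
) -> Set[int]:
    """Return IDs present in the item store but missing from the index.

    Both callables are hypothetical: each is assumed to run one bulk
    query and yield every item ID known to its backend.
    """
    all_items = set(fetch_akonadi_ids())   # one bulk query against the item store
    indexed = set(fetch_indexed_ids())     # one bulk query against the index
    # Items that exist but were never indexed. Items that were indexed and
    # later modified are NOT detected by this comparison.
    return all_items - indexed


# Made-up data for illustration only.
unindexed = find_unindexed(lambda: range(1, 11), lambda: [1, 2, 3, 5, 8])
print(sorted(unindexed))  # [4, 6, 7, 9, 10]
```

Holding both ID lists in memory is the price of replacing n lookups with two
bulk queries, which matches the "huge result set" caveat above.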
> > >
> > > Since I figured changes on emails, which are mostly just flags, are
> > > negligible, I switched the email initial-indexing to that new approach.
> > >
> > > Non-email items continue to be indexed as usual, meaning there is one
> > > query per item, which allows us to detect modifications as well. That is
> > > slow as usual, but since we usually have a lot more email items than
> > > non-email items, it works well enough.
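
For contrast, the per-item approach kept for non-email items could look
roughly like the sketch below. The mail does not say how modifications are
detected; comparing modification times is an assumption made here for
illustration, and `lookup_indexed_mtime` is a hypothetical helper standing in
for the per-item index query:

```python
from typing import Callable, Iterable, List, Optional, Tuple


def find_stale_items(
    items: Iterable[Tuple[int, float]],
    lookup_indexed_mtime: Callable[[int], Optional[float]],
) -> List[int]:
    """Return IDs of items that are unindexed or modified since indexing.

    `items` yields (item_id, modification_time) pairs. The lookup is a
    hypothetical per-item query returning the modification time recorded
    at indexing, or None if the item was never indexed.
    """
    stale = []
    for item_id, mtime in items:
        indexed_mtime = lookup_indexed_mtime(item_id)  # one query per item
        if indexed_mtime is None or indexed_mtime < mtime:
            stale.append(item_id)
    return stale


# Made-up data: item 2 changed after indexing, item 3 was never indexed.
index = {1: 10.0, 2: 5.0}
print(find_stale_items([(1, 10.0), (2, 7.0), (3, 1.0)], index.get))  # [2, 3]
```

This costs one lookup per item, which is why it only stays practical for the
smaller non-email collections.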
> > >
> > > Another important advantage is that we can now also skip large batches
> > > of new/changed items, knowing they will eventually be picked up by the
> > > initial-indexing. That also allows us to turn off the change-recorder
> > > when the feeder is turned off (which is another problem if we rely on
> > > the change-recorder too much).
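
The batch-skipping idea can be sketched as follows. This is only an
illustration under assumptions: the threshold value, the function name, and
the `index_item` callback are all hypothetical and not taken from the feeder
code:

```python
from typing import Callable, Sequence


def handle_notification_batch(
    item_ids: Sequence[int],
    index_item: Callable[[int], None],
    max_batch_size: int = 100,  # hypothetical threshold
) -> int:
    """Index a batch of changed items, or skip it if it is too large.

    Returns the number of items indexed. Skipped batches are not lost:
    the recurring initial-indexing run is expected to find them later.
    """
    if len(item_ids) > max_batch_size:
        return 0  # leave the whole batch to the next maintenance run
    for item_id in item_ids:
        index_item(item_id)
    return len(item_ids)


# Made-up usage: a small batch is indexed, an oversized one is skipped.
indexed = []
print(handle_notification_batch([1, 2, 3], indexed.append, max_batch_size=5))
print(handle_notification_batch(list(range(10)), indexed.append, max_batch_size=5))
```

Skipping is safe only because the maintenance query acts as a backstop; with
the email set-difference approach, skipped modifications to already indexed
items would still be missed.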
> > >
> > > One remaining problem is that we get loads of notifications of
> > > changed/added items, which I think are mostly due to sync-on-demand
> > > updates updating the cache (and not actual new emails or whatnot). I
> > > also often get flag change notifications on my offline imap accounts,
> > > and I don't really know why yet. That of course would lead to loads of
> > > items being indexed over and over again, but that can be mitigated
> > > somewhat since we can now skip larger batches of items.
> > >
> > > Besides, I made some performance improvements, such as the cache I
> > > mentioned previously (200% performance boost), and new items are now
> > > indexed without any queries, which gives another boost of 10%-20% or so.
> > >
> > > Overall, I think we should get this into 4.10 as fast as possible. The
> > > patch is somewhat large (and way too late in the process), but IMO the
> > > previous feeders are broken enough to justify this. So what do you
> > > think? Should I commit this to 4.10 in a couple of commits, or only to
> > > master and then backport it for 4.10.1?
> > >
>
> Are there any objections to getting this work committed for 4.10?
> It's awfully late in the release cycle to be pushing for this, but I will
> do so if I get warm-fuzzies from a couple more folks that we need it.
>
> Anyone want to chime in here?
> Please do so ASAP.
>
> _______________________________________________
> KDE PIM mailing list kde-pim@kde.org
> https://mail.kde.org/mailman/listinfo/kde-pim
> KDE PIM home page at http://pim.kde.org/
>