[prev in list] [next in list] [prev in thread] [next in thread]
List: kde-devel
Subject: Re: Review Request 118231: Scheduler for the akonadi indexer.
From: "Christian Mollekopf" <chrigi_1 () fastmail ! fm>
Date: 2014-05-22 17:06:31
Message-ID: 20140522170631.1196.29165 () probe ! kde ! org
[Download RAW message or body]
[Attachment #2 (multipart/alternative)]
> On May 22, 2014, 4 p.m., Vishesh Handa wrote:
> > src/pim/agent/collectionindexingjob.h, line 66
> > <https://git.reviewboard.kde.org/r/118231/diff/1/?file=273888#file273888line66>
> >
> > Whoa. This variable name is very very confusing. It's true, if there are indexed \
> > items which are unindexed? :O
Right, it's a lock to not repeatedly index not-yet-indexed items. It's necessary so \
we don't end up in an endless loop should we fail to index some items (for whatever \
reason). I'll think of a better name.
> On May 22, 2014, 4 p.m., Vishesh Handa wrote:
> > src/pim/agent/collectionindexingjob.cpp, line 200
> > <https://git.reviewboard.kde.org/r/118231/diff/1/?file=273889#file273889line200>
> >
> > I'm sorry. I'm a little confused as to how this could ever happen unless Akonadi \
> > is messed up and does not reliably send item removed notifications?
The whole point of this codepath is to recover when we miss something. There is no \
guarantee that we manage to process all signals (the akonadi server just fires a dbus \
signal and hopes for the best), so it's entirely possible that we end up in this \
state (and it has in fact happened to me). And if you clear the cache (for whatever \
reason), you definitely end up in this state.
> On May 22, 2014, 4 p.m., Vishesh Handa wrote:
> > src/pim/agent/index.h, line 33
> > <https://git.reviewboard.kde.org/r/118231/diff/1/?file=273890#file273890line33>
> >
> > I'm not too happy with this classes name. How about "Indexer" instead? But then \
> > it would clash with "AbstractIndexer".
I'm open to other suggestions, but IMO Index is quite apt. The Index class \
encapsulates the "index" we're working against (which is actually consisting of many \
small indexes, but that's hidden). Indexer is imo suitable for the code we have that \
processes an item/collection. The idea with the "index" is that you "put an item in \
the index", and what internally happens or where exactly this information ends up is \
irrelevant. Or the other way around, you query the index (not the indexer). So, \
what's wrong with Index? ;-)
> On May 22, 2014, 4 p.m., Vishesh Handa wrote:
> > src/pim/agent/index.cpp, line 128
> > <https://git.reviewboard.kde.org/r/118231/diff/1/?file=273891#file273891line128>
> >
> > Is this required? From a Xapian point of view you're spending an extra amount of \
> > time first removing the data and then adding it back again.
> > If you just index the item, Xapian will internally do a diff on the terms that \
> > have changed, and then just update those.
The original code was like that AFAIK so I kept it that way, but I'll change it then.
> On May 22, 2014, 4 p.m., Vishesh Handa wrote:
> > src/pim/agent/scheduler.cpp, line 36
> > <https://git.reviewboard.kde.org/r/118231/diff/1/?file=273893#file273893line36>
> >
> > Why 100?
Random value for event compression.
> On May 22, 2014, 4 p.m., Vishesh Handa wrote:
> > src/pim/agent/scheduler.h, line 66
> > <https://git.reviewboard.kde.org/r/118231/diff/1/?file=273892#file273892line66>
> >
> > This is confusing. Perhaps some more documentation?
> >
> > Also, couldn't you just directly do a
> >
> > QMap<Akonadi::Collection::ID, QQueue<Akonadi::Item::ID>>
> >
> > I'm not sure what the shared pointer is doing.
I think I overlooked the [] operator that allows me to access the queue by reference, \
and I wasn't sure whether QMap copies the values internally, but I suppose it \
shouldn't. The shared pointer is just for automatic lifetime management (no leaks \
guaranteed).
I suppose I can change that to just use a QQueue as you suggested.
> On May 22, 2014, 4 p.m., Vishesh Handa wrote:
> > src/pim/agent/scheduler.cpp, line 46
> > <https://git.reviewboard.kde.org/r/118231/diff/1/?file=273893#file273893line46>
> >
> > This is quite dangerous. If an email is not indexed it results in the entire \
> > collection being sync and all a full collection fetch job going on.
> > I'm very much against this. I get enough angry emails about how the baloo indexer \
> > is sucking all their cpu.
Either this or you only have half your emails indexed.
Note that this codepath is only triggered if we fail to index items in a timely \
fashion, and since indexing is fairly fast that should seldomly be the case, right?
> On May 22, 2014, 4 p.m., Vishesh Handa wrote:
> > src/pim/agent/scheduler.cpp, line 53
> > <https://git.reviewboard.kde.org/r/118231/diff/1/?file=273893#file273893line53>
> >
> > So, even if someone switched off their system before the initial indexing was \
> > done, we mark the initial indexing as completed?
The not-yet completed collections are remembered as part of the "dirty collections" \
which should take care of this.
> On May 22, 2014, 4 p.m., Vishesh Handa wrote:
> > src/pim/agent/scheduler.cpp, line 97
> > <https://git.reviewboard.kde.org/r/118231/diff/1/?file=273893#file273893line97>
> >
> > So each time an item is added, you pass the parent collection to the \
> > CollectionIndexingJob and there you fetch the entire collection (+statistics) and \
> > do a query on Xapian to check how many items are already indexed?
> > Arguably, the xapian query would be quite fast, but still.
That could be optimized as we only require up-to date statistics for a full sync, but \
I really doubt that fetching a collection will have any noticeable performance \
impact.
- Christian
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118231/#review58317
-----------------------------------------------------------
On May 20, 2014, 10:22 p.m., Christian Mollekopf wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118231/
> -----------------------------------------------------------
>
> (Updated May 20, 2014, 10:22 p.m.)
>
>
> Review request for Baloo and Vishesh Handa.
>
>
> Repository: baloo
>
>
> Description
> -------
>
> A scheduler for baloo:
> * delays the indexing until no new item has been added for at least 5 seconds to \
> avoid indexing during a collection sync.
> * remembers if it failed to index something and triggers recovery path on next \
> start.
> * supports manual triggering of recovery path if required.
>
>
> Diffs
> -----
>
> src/pim/agent/CMakeLists.txt e917915a3414738595caea5497859ef4810ec44c
> src/pim/agent/agent.h 1dbf0fc0a16d0615dbfa4878706359bb687facd0
> src/pim/agent/agent.cpp 8904d49d3579b58b634d2570fbcc8007e5ee41ed
> src/pim/agent/collectionindexingjob.h PRE-CREATION
> src/pim/agent/collectionindexingjob.cpp PRE-CREATION
> src/pim/agent/index.h PRE-CREATION
> src/pim/agent/index.cpp PRE-CREATION
> src/pim/agent/scheduler.h PRE-CREATION
> src/pim/agent/scheduler.cpp PRE-CREATION
>
> Diff: https://git.reviewboard.kde.org/r/118231/diff/
>
>
> Testing
> -------
>
> I'm running it for a while, and it reduced the stress that baloo imposed on my \
> system and all my mails are indexed since I'm using it (wasn't the case before).
>
> Thanks,
>
> Christian Mollekopf
>
>
[Attachment #5 (text/html)]
<html>
<body>
<div style="font-family: Verdana, Arial, Helvetica, Sans-Serif;">
<table bgcolor="#f9f3c9" width="100%" cellpadding="8" style="border: 1px #c9c399 \
solid;"> <tr>
<td>
This is an automatically generated e-mail. To reply, visit:
<a href="https://git.reviewboard.kde.org/r/118231/">https://git.reviewboard.kde.org/r/118231/</a>
</td>
</tr>
</table>
<br />
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: \
10px;"> <p style="margin-top: 0;">On May 22nd, 2014, 4 p.m. UTC, <b>Vishesh \
Handa</b> wrote:</p> <blockquote style="margin-left: 1em; border-left: 2px solid \
#d0d0d0; padding-left: 10px;">
<table width="100%" border="0" bgcolor="white" style="border: 1px solid #C0C0C0; \
border-collapse: collapse; margin: 2px padding: 2px;"> <thead>
<tr>
<th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid #C0C0C0; \
font-size: 9pt; padding: 4px 8px; text-align: left;"> <a \
href="https://git.reviewboard.kde.org/r/118231/diff/1/?file=273888#file273888line66" \
style="color: black; font-weight: bold; text-decoration: \
underline;">src/pim/agent/collectionindexingjob.h</a> <span style="font-weight: \
normal;">
(Diff revision 1)
</span>
</th>
</tr>
</thead>
<tbody>
<tr>
<th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" \
align="right"><font size="2"></font></th> <td bgcolor="#c5ffc4" width="50%"><pre \
style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td> <th \
bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid \
#C0C0C0;" align="right"><font size="2">66</font></th> <td bgcolor="#c5ffc4" \
width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "> <span \
class="kt">bool</span> <span class="n">m_indexedUnindexed</span><span \
class="p">;</span></pre></td> </tr>
</tbody>
</table>
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Whoa. This variable name \
is very very confusing. It's true, if there are indexed items which are \
unindexed? :O</pre> </blockquote>
</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: -moz-pre-wrap; \
white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Right, \
it's a lock to not repeatedly index not-yet-indexed items. It's necessary so \
we don't end up in an endless loop should we fail to index some items (for \
whatever reason). I'll think of a better name.</pre>
<br />
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: \
10px;"> <p style="margin-top: 0;">On May 22nd, 2014, 4 p.m. UTC, <b>Vishesh \
Handa</b> wrote:</p> <blockquote style="margin-left: 1em; border-left: 2px solid \
#d0d0d0; padding-left: 10px;">
<table width="100%" border="0" bgcolor="white" style="border: 1px solid #C0C0C0; \
border-collapse: collapse; margin: 2px padding: 2px;"> <thead>
<tr>
<th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid #C0C0C0; \
font-size: 9pt; padding: 4px 8px; text-align: left;"> <a \
href="https://git.reviewboard.kde.org/r/118231/diff/1/?file=273889#file273889line200" \
style="color: black; font-weight: bold; text-decoration: \
underline;">src/pim/agent/collectionindexingjob.cpp</a> <span style="font-weight: \
normal;">
(Diff revision 1)
</span>
</th>
</tr>
</thead>
<tbody>
<tr>
<th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" \
align="right"><font size="2"></font></th> <td bgcolor="#c5ffc4" width="50%"><pre \
style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td> <th \
bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid \
#C0C0C0;" align="right"><font size="2">200</font></th> <td bgcolor="#c5ffc4" \
width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "> <span \
class="k">if</span> <span class="p">(</span><span class="o">!</span><span \
class="n">m_indexedItems</span><span class="p">.</span><span \
class="n">isEmpty</span><span class="p">())</span> <span \
class="p">{</span></pre></td> </tr>
</tbody>
</table>
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">I'm sorry. I'm a \
little confused as to how this could ever happen unless Akonadi is messed up and does \
not reliably send item removed notifications?</pre> </blockquote>
</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: -moz-pre-wrap; \
white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">The whole \
point of this codepath is to recover when we miss something. There is no guarantee \
that we manage to process all signals (the akonadi server just fires a dbus signal \
and hopes for the best), so it's entirely possible that we end up in this state \
(and it has in fact happened to me). And if you clear the cache (for whatever \
reason), you definitely end up in this state.</pre> <br />
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: \
10px;"> <p style="margin-top: 0;">On May 22nd, 2014, 4 p.m. UTC, <b>Vishesh \
Handa</b> wrote:</p> <blockquote style="margin-left: 1em; border-left: 2px solid \
#d0d0d0; padding-left: 10px;">
<table width="100%" border="0" bgcolor="white" style="border: 1px solid #C0C0C0; \
border-collapse: collapse; margin: 2px padding: 2px;"> <thead>
<tr>
<th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid #C0C0C0; \
font-size: 9pt; padding: 4px 8px; text-align: left;"> <a \
href="https://git.reviewboard.kde.org/r/118231/diff/1/?file=273890#file273890line33" \
style="color: black; font-weight: bold; text-decoration: \
underline;">src/pim/agent/index.h</a> <span style="font-weight: normal;">
(Diff revision 1)
</span>
</th>
</tr>
</thead>
<tbody>
<tr>
<th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" \
align="right"><font size="2"></font></th> <td bgcolor="#c5ffc4" width="50%"><pre \
style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td> <th \
bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid \
#C0C0C0;" align="right"><font size="2">33</font></th> <td bgcolor="#c5ffc4" \
width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "><span \
class="n">class</span> <span class="n">Index</span> <span class="o">:</span> <span \
class="n">public</span> <span class="n">QObject</span></pre></td> </tr>
</tbody>
</table>
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">I'm not too happy \
with this classes name. How about "Indexer" instead? But then it would \
clash with "AbstractIndexer".</pre> </blockquote>
</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: -moz-pre-wrap; \
white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">I'm \
open to other suggestions, but IMO Index is quite apt. The Index class encapsulates \
the "index" we're working against (which is actually consisting of many \
small indexes, but that's hidden). Indexer is imo suitable for the code we have \
that processes an item/collection. The idea with the "index" is that you \
"put an item in the index", and what internally happens or where exactly \
this information ends up is irrelevant. Or the other way around, you query the index \
(not the indexer). So, what's wrong with Index? ;-)</pre>
<br />
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: \
10px;"> <p style="margin-top: 0;">On May 22nd, 2014, 4 p.m. UTC, <b>Vishesh \
Handa</b> wrote:</p> <blockquote style="margin-left: 1em; border-left: 2px solid \
#d0d0d0; padding-left: 10px;">
<table width="100%" border="0" bgcolor="white" style="border: 1px solid #C0C0C0; \
border-collapse: collapse; margin: 2px padding: 2px;"> <thead>
<tr>
<th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid #C0C0C0; \
font-size: 9pt; padding: 4px 8px; text-align: left;"> <a \
href="https://git.reviewboard.kde.org/r/118231/diff/1/?file=273891#file273891line128" \
style="color: black; font-weight: bold; text-decoration: \
underline;">src/pim/agent/index.cpp</a> <span style="font-weight: normal;">
(Diff revision 1)
</span>
</th>
</tr>
</thead>
<tbody>
<tr>
<th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" \
align="right"><font size="2"></font></th> <td bgcolor="#c5ffc4" width="50%"><pre \
style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td> <th \
bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid \
#C0C0C0;" align="right"><font size="2">128</font></th> <td bgcolor="#c5ffc4" \
width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "><span \
class="kt">void</span> <span class="n">Index</span><span class="o">::</span><span \
class="n">reindex</span><span class="p">(</span><span class="k">const</span> <span \
class="n">Akonadi</span><span class="o">::</span><span class="n">Item</span><span \
class="o">&</span> <span class="n">item</span><span class="p">)</span></pre></td> \
</tr>
</tbody>
</table>
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Is this required? From a \
Xapian point of view you're spending an extra amount of time first removing the \
data and then adding it back again.
If you just index the item, Xapian will internally do a diff on the terms that have \
changed, and then just update those.</pre> </blockquote>
</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: -moz-pre-wrap; \
white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">The \
original code was like that AFAIK so I kept it that way, but I'll change it \
then.</pre> <br />
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: \
10px;"> <p style="margin-top: 0;">On May 22nd, 2014, 4 p.m. UTC, <b>Vishesh \
Handa</b> wrote:</p> <blockquote style="margin-left: 1em; border-left: 2px solid \
#d0d0d0; padding-left: 10px;">
<table width="100%" border="0" bgcolor="white" style="border: 1px solid #C0C0C0; \
border-collapse: collapse; margin: 2px padding: 2px;"> <thead>
<tr>
<th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid #C0C0C0; \
font-size: 9pt; padding: 4px 8px; text-align: left;"> <a \
href="https://git.reviewboard.kde.org/r/118231/diff/1/?file=273892#file273892line66" \
style="color: black; font-weight: bold; text-decoration: \
underline;">src/pim/agent/scheduler.h</a> <span style="font-weight: normal;">
(Diff revision 1)
</span>
</th>
</tr>
</thead>
<tbody>
<tr>
<th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" \
align="right"><font size="2"></font></th> <td bgcolor="#c5ffc4" width="50%"><pre \
style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td> <th \
bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid \
#C0C0C0;" align="right"><font size="2">66</font></th> <td bgcolor="#c5ffc4" \
width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "> <span \
class="n">QMap</span><span class="o"><</span><span class="n">Akonadi</span><span \
class="o">::</span><span class="n">Collection</span><span class="o">::</span><span \
class="n">Id</span><span class="p">,</span> <span \
class="n">QSharedPointer</span><span class="o"><</span><span \
class="n">QQueue</span><span class="o"><</span><span class="n">Akonadi</span><span \
class="o">::</span><span class="n">Item</span><span class="o">::</span><span \
class="n">Id</span><span class="o">></span> <span class="o">></span> <span \
class="o">></span> <span class="n">m_queues</span><span \
class="p">;</span></pre></td> </tr>
</tbody>
</table>
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">This is confusing. \
Perhaps some more documentation?
Also, couldn't you just directly do a
QMap<Akonadi::Collection::ID, QQueue<Akonadi::Item::ID>>
I'm not sure what the shared pointer is doing.</pre>
</blockquote>
</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: -moz-pre-wrap; \
white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">I think I \
overlooked the [] operator that allows me to access the queue by reference, and I \
wasn't sure whether QMap copies the values internally, but I suppose it \
shouldn't. The shared pointer is just for automatic lifetime management (no leaks \
guaranteed).
I suppose I can change that to just use a QQueue as you suggested.</pre>
<br />
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: \
10px;"> <p style="margin-top: 0;">On May 22nd, 2014, 4 p.m. UTC, <b>Vishesh \
Handa</b> wrote:</p> <blockquote style="margin-left: 1em; border-left: 2px solid \
#d0d0d0; padding-left: 10px;">
<table width="100%" border="0" bgcolor="white" style="border: 1px solid #C0C0C0; \
border-collapse: collapse; margin: 2px padding: 2px;"> <thead>
<tr>
<th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid #C0C0C0; \
font-size: 9pt; padding: 4px 8px; text-align: left;"> <a \
href="https://git.reviewboard.kde.org/r/118231/diff/1/?file=273893#file273893line36" \
style="color: black; font-weight: bold; text-decoration: \
underline;">src/pim/agent/scheduler.cpp</a> <span style="font-weight: normal;">
(Diff revision 1)
</span>
</th>
</tr>
</thead>
<tbody>
<tr>
<th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" \
align="right"><font size="2"></font></th> <td bgcolor="#c5ffc4" width="50%"><pre \
style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td> <th \
bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid \
#C0C0C0;" align="right"><font size="2">36</font></th> <td bgcolor="#c5ffc4" \
width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "> <span \
class="n">m_processTimer</span><span class="p">.</span><span \
class="n">setInterval</span><span class="p">(</span><span class="mi">100</span><span \
class="p">);</span></pre></td> </tr>
</tbody>
</table>
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Why 100?</pre> \
</blockquote>
</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: -moz-pre-wrap; \
white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Random \
value for event compression.</pre> <br />
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: \
10px;"> <p style="margin-top: 0;">On May 22nd, 2014, 4 p.m. UTC, <b>Vishesh \
Handa</b> wrote:</p> <blockquote style="margin-left: 1em; border-left: 2px solid \
#d0d0d0; padding-left: 10px;">
<table width="100%" border="0" bgcolor="white" style="border: 1px solid #C0C0C0; \
border-collapse: collapse; margin: 2px padding: 2px;"> <thead>
<tr>
<th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid #C0C0C0; \
font-size: 9pt; padding: 4px 8px; text-align: left;"> <a \
href="https://git.reviewboard.kde.org/r/118231/diff/1/?file=273893#file273893line46" \
style="color: black; font-weight: bold; text-decoration: \
underline;">src/pim/agent/scheduler.cpp</a> <span style="font-weight: normal;">
(Diff revision 1)
</span>
</th>
</tr>
</thead>
<tbody>
<tr>
<th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" \
align="right"><font size="2"></font></th> <td bgcolor="#c5ffc4" width="50%"><pre \
style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td> <th \
bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid \
#C0C0C0;" align="right"><font size="2">46</font></th> <td bgcolor="#c5ffc4" \
width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "> <span \
class="n">scheduleCollection</span><span class="p">(</span><span \
class="n">Akonadi</span><span class="o">::</span><span \
class="n">Collection</span><span class="p">(</span><span class="n">col</span><span \
class="p">),</span> <span class="nb">true</span><span class="p">);</span></pre></td> \
</tr>
</tbody>
</table>
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">This is quite dangerous. \
If an email is not indexed it results in the entire collection being sync and all a \
full collection fetch job going on.
I'm very much against this. I get enough angry emails about how the baloo indexer \
is sucking all their cpu.</pre> </blockquote>
</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: -moz-pre-wrap; \
white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Either this \
or you only have half your emails indexed. Note that this codepath is only triggered \
if we fail to index items in a timely fashion, and since indexing is fairly fast that \
should seldomly be the case, right?</pre> <br />
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: \
10px;"> <p style="margin-top: 0;">On May 22nd, 2014, 4 p.m. UTC, <b>Vishesh \
Handa</b> wrote:</p> <blockquote style="margin-left: 1em; border-left: 2px solid \
#d0d0d0; padding-left: 10px;">
<table width="100%" border="0" bgcolor="white" style="border: 1px solid #C0C0C0; \
border-collapse: collapse; margin: 2px padding: 2px;"> <thead>
<tr>
<th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid #C0C0C0; \
font-size: 9pt; padding: 4px 8px; text-align: left;"> <a \
href="https://git.reviewboard.kde.org/r/118231/diff/1/?file=273893#file273893line53" \
style="color: black; font-weight: bold; text-decoration: \
underline;">src/pim/agent/scheduler.cpp</a> <span style="font-weight: normal;">
(Diff revision 1)
</span>
</th>
</tr>
</thead>
<tbody>
<tr>
<th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" \
align="right"><font size="2"></font></th> <td bgcolor="#c5ffc4" width="50%"><pre \
style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td> <th \
bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid \
#C0C0C0;" align="right"><font size="2">53</font></th> <td bgcolor="#c5ffc4" \
width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "> <span \
class="n">group</span><span class="p">.</span><span class="n">writeEntry</span><span \
class="p">(</span><span class="s">"initialIndexing"</span><span \
class="p">,</span> <span class="nb">true</span><span class="p">);</span></pre></td> \
</tr>
</tbody>
</table>
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">So, even if someone \
switched off their system before the initial indexing was done, we mark the initial \
indexing as completed?</pre> </blockquote>
</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: -moz-pre-wrap; \
white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">The not-yet \
completed collections are remembered as part of the "dirty collections" \
which should take care of this.</pre> <br />
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: \
10px;"> <p style="margin-top: 0;">On May 22nd, 2014, 4 p.m. UTC, <b>Vishesh \
Handa</b> wrote:</p> <blockquote style="margin-left: 1em; border-left: 2px solid \
#d0d0d0; padding-left: 10px;">
<table width="100%" border="0" bgcolor="white" style="border: 1px solid #C0C0C0; \
border-collapse: collapse; margin: 2px padding: 2px;"> <thead>
<tr>
<th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid #C0C0C0; \
font-size: 9pt; padding: 4px 8px; text-align: left;"> <a \
href="https://git.reviewboard.kde.org/r/118231/diff/1/?file=273893#file273893line97" \
style="color: black; font-weight: bold; text-decoration: \
underline;">src/pim/agent/scheduler.cpp</a> <span style="font-weight: normal;">
(Diff revision 1)
</span>
</th>
</tr>
</thead>
<tbody>
<tr>
<th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" \
align="right"><font size="2"></font></th> <td bgcolor="#c5ffc4" width="50%"><pre \
style="font-size: 8pt; line-height: 140%; margin: 0; "></pre></td> <th \
bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; border-right: 1px solid \
#C0C0C0;" align="right"><font size="2">97</font></th> <td bgcolor="#c5ffc4" \
width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; "><span \
class="kt">void</span> <span class="n">Scheduler</span><span class="o">::</span><span \
class="n">addItem</span><span class="p">(</span><span class="k">const</span> <span \
class="n">Akonadi</span><span class="o">::</span><span class="n">Item</span> <span \
class="o">&</span><span class="n">item</span><span class="p">)</span></pre></td> \
</tr>
</tbody>
</table>
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">So each time an item is \
added, you pass the parent collection to the CollectionIndexingJob and there you \
fetch the entire collection (+statistics) and do a query on Xapian to check how many \
items are already indexed?
Arguably, the xapian query would be quite fast, but still.</pre>
</blockquote>
</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: -moz-pre-wrap; \
white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">That could \
be optimized as we only require up-to date statistics for a full sync, but I really \
doubt that fetching a collection will have any noticeable performance impact. </pre> \
<br />
<p>- Christian</p>
<br />
<p>On May 20th, 2014, 10:22 p.m. UTC, Christian Mollekopf wrote:</p>
<table bgcolor="#fefadf" width="100%" cellspacing="0" cellpadding="8" \
style="background-image: \
url('https://git.reviewboard.kde.org/static/rb/images/review_request_box_top_bg.ab6f3b1072c9.png'); \
background-position: left top; background-repeat: repeat-x; border: 1px black \
solid;"> <tr>
<td>
<div>Review request for Baloo and Vishesh Handa.</div>
<div>By Christian Mollekopf.</div>
<p style="color: grey;"><i>Updated May 20, 2014, 10:22 p.m.</i></p>
<div style="margin-top: 1.5em;">
<b style="color: #575012; font-size: 10pt;">Repository: </b>
baloo
</div>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Description </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" \
style="border: 1px solid #b8b5a0"> <tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">A scheduler for baloo:
* delays the indexing until no new item has been added for at least 5 seconds to \
avoid indexing during a collection sync.
* remembers if it failed to index something and triggers recovery path on next start.
* supports manual triggering of recovery path if required.</pre>
</td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Testing </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: \
1px solid #b8b5a0"> <tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">I'm running it for a while, and it reduced the stress that baloo \
imposed on my system and all my mails are indexed since I'm using it (wasn't \
the case before).</pre> </td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Diffs</b> </h1>
<ul style="margin-left: 3em; padding-left: 0;">
<li>src/pim/agent/CMakeLists.txt <span style="color: \
grey">(e917915a3414738595caea5497859ef4810ec44c)</span></li>
<li>src/pim/agent/agent.h <span style="color: \
grey">(1dbf0fc0a16d0615dbfa4878706359bb687facd0)</span></li>
<li>src/pim/agent/agent.cpp <span style="color: \
grey">(8904d49d3579b58b634d2570fbcc8007e5ee41ed)</span></li>
<li>src/pim/agent/collectionindexingjob.h <span style="color: \
grey">(PRE-CREATION)</span></li>
<li>src/pim/agent/collectionindexingjob.cpp <span style="color: \
grey">(PRE-CREATION)</span></li>
<li>src/pim/agent/index.h <span style="color: grey">(PRE-CREATION)</span></li>
<li>src/pim/agent/index.cpp <span style="color: grey">(PRE-CREATION)</span></li>
<li>src/pim/agent/scheduler.h <span style="color: grey">(PRE-CREATION)</span></li>
<li>src/pim/agent/scheduler.cpp <span style="color: grey">(PRE-CREATION)</span></li>
</ul>
<p><a href="https://git.reviewboard.kde.org/r/118231/diff/" style="margin-left: \
3em;">View Diff</a></p>
</td>
</tr>
</table>
</div>
</body>
</html>
>> Visit http://mail.kde.org/mailman/listinfo/kde-devel#unsub to unsubscribe <<
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic