[prev in list] [next in list] [prev in thread] [next in thread] 

List:       kde-devel
Subject:    Re: Review Request 118231: Scheduler for the akonadi indexer.
From:       "Christian Mollekopf" <chrigi_1 () fastmail ! fm>
Date:       2014-05-22 17:06:31
Message-ID: 20140522170631.1196.29165 () probe ! kde ! org
[Download RAW message or body]

[Attachment #2 (multipart/alternative)]


> On May 22, 2014, 4 p.m., Vishesh Handa wrote:
> > src/pim/agent/collectionindexingjob.h, line 66
> > <https://git.reviewboard.kde.org/r/118231/diff/1/?file=273888#file273888line66>
> >  
> > Whoa. This variable name is very very confusing. It's true, if there \
> > are indexed items which are unindexed? :O

Right, it's a lock to not repeatedly index not-yet-indexed items. It's \
necessary so we don't end up in an endless loop should we fail to index \
some items (for whatever reason). I'll think of a better name.


> On May 22, 2014, 4 p.m., Vishesh Handa wrote:
> > src/pim/agent/collectionindexingjob.cpp, line 200
> > <https://git.reviewboard.kde.org/r/118231/diff/1/?file=273889#file273889line200>
> >  
> > I'm sorry. I'm a little confused as to how this could ever happen \
> > unless Akonadi is messed up and does not reliably send item removed \
> > notifications?

The whole point of this codepath is to recover when we miss something. \
There is no guarantee that we manage to process all signals (the akonadi \
server just fires a dbus signal and hopes for the best), so it's entirely \
possible that we end up in this state (and it has in fact happened to me). \
And if you clear the cache (for whatever reason), you definitely end up in \
this state.


> On May 22, 2014, 4 p.m., Vishesh Handa wrote:
> > src/pim/agent/index.h, line 33
> > <https://git.reviewboard.kde.org/r/118231/diff/1/?file=273890#file273890line33>
> >  
> > I'm not too happy with this classes name. How about "Indexer" instead? \
> > But then it would clash with "AbstractIndexer".

I'm open to other suggestions, but IMO Index is quite apt. The Index class \
encapsulates the "index" we're working against (which is actually \
consisting of many small indexes, but that's hidden). Indexer is imo \
suitable for the code we have that processes an item/collection. The idea \
with the "index" is that you "put an item in the index", and what \
internally happens or where exactly this information ends up is irrelevant. \
Or the other way around, you query the index (not the indexer). So, what's \
wrong with Index? ;-)


> On May 22, 2014, 4 p.m., Vishesh Handa wrote:
> > src/pim/agent/index.cpp, line 128
> > <https://git.reviewboard.kde.org/r/118231/diff/1/?file=273891#file273891line128>
> >  
> > Is this required? From a Xapian point of view you're spending an extra \
> > amount of time first removing the data and then adding it back again. 
> > If you just index the item, Xapian will internally do a diff on the \
> > terms that have changed, and then just update those.

The original code was like that AFAIK so I kept it that way, but I'll \
change it then.


> On May 22, 2014, 4 p.m., Vishesh Handa wrote:
> > src/pim/agent/scheduler.cpp, line 36
> > <https://git.reviewboard.kde.org/r/118231/diff/1/?file=273893#file273893line36>
> >  
> > Why 100?

Random value for event compression.


> On May 22, 2014, 4 p.m., Vishesh Handa wrote:
> > src/pim/agent/scheduler.h, line 66
> > <https://git.reviewboard.kde.org/r/118231/diff/1/?file=273892#file273892line66>
> >  
> > This is confusing. Perhaps some more documentation?
> > 
> > Also, couldn't you just directly do a
> > 
> > QMap<Akonadi::Collection::ID, QQueue<Akonadi::Item::ID>>
> > 
> > I'm not sure what the shared pointer is doing.

I think I overlooked the [] operator that allows me to access the queue by \
reference, and I wasn't sure whether QMap copies the values internally, but \
I suppose it shouldn't. The shared pointer is just for automatic lifetime \
management (no leaks guaranteed).

I suppose I can change that to just use a QQueue as you suggested.


> On May 22, 2014, 4 p.m., Vishesh Handa wrote:
> > src/pim/agent/scheduler.cpp, line 46
> > <https://git.reviewboard.kde.org/r/118231/diff/1/?file=273893#file273893line46>
> >  
> > This is quite dangerous. If an email is not indexed it results in the \
> > entire collection being sync and all a full collection fetch job going \
> > on. 
> > I'm very much against this. I get enough angry emails about how the \
> > baloo indexer is sucking all their cpu.

Either this or you only have half your emails indexed.
Note that this codepath is only triggered if we fail to index items in a \
timely fashion, and since indexing is fairly fast that should seldomly be \
the case, right?


> On May 22, 2014, 4 p.m., Vishesh Handa wrote:
> > src/pim/agent/scheduler.cpp, line 53
> > <https://git.reviewboard.kde.org/r/118231/diff/1/?file=273893#file273893line53>
> >  
> > So, even if someone switched off their system before the initial \
> > indexing was done, we mark the initial indexing as completed?

The not-yet completed collections are remembered as part of the "dirty \
collections" which should take care of this.


> On May 22, 2014, 4 p.m., Vishesh Handa wrote:
> > src/pim/agent/scheduler.cpp, line 97
> > <https://git.reviewboard.kde.org/r/118231/diff/1/?file=273893#file273893line97>
> >  
> > So each time an item is added, you pass the parent collection to the \
> > CollectionIndexingJob and there you fetch the entire collection \
> > (+statistics) and do a query on Xapian to check how many items are \
> > already indexed? 
> > Arguably, the xapian query would be quite fast, but still.

That could be optimized as we only require up-to date statistics for a full \
sync, but I really doubt that fetching a collection will have any \
noticeable performance impact. 


- Christian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118231/#review58317
-----------------------------------------------------------


On May 20, 2014, 10:22 p.m., Christian Mollekopf wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118231/
> -----------------------------------------------------------
> 
> (Updated May 20, 2014, 10:22 p.m.)
> 
> 
> Review request for Baloo and Vishesh Handa.
> 
> 
> Repository: baloo
> 
> 
> Description
> -------
> 
> A scheduler for baloo: 
> * delays the indexing until no new item has been added for at least 5 \
>                 seconds to avoid indexing during a collection sync.
> * remembers if it failed to index something and triggers recovery path on \
>                 next start.
> * supports manual triggering of recovery path if required.
> 
> 
> Diffs
> -----
> 
> src/pim/agent/CMakeLists.txt e917915a3414738595caea5497859ef4810ec44c 
> src/pim/agent/agent.h 1dbf0fc0a16d0615dbfa4878706359bb687facd0 
> src/pim/agent/agent.cpp 8904d49d3579b58b634d2570fbcc8007e5ee41ed 
> src/pim/agent/collectionindexingjob.h PRE-CREATION 
> src/pim/agent/collectionindexingjob.cpp PRE-CREATION 
> src/pim/agent/index.h PRE-CREATION 
> src/pim/agent/index.cpp PRE-CREATION 
> src/pim/agent/scheduler.h PRE-CREATION 
> src/pim/agent/scheduler.cpp PRE-CREATION 
> 
> Diff: https://git.reviewboard.kde.org/r/118231/diff/
> 
> 
> Testing
> -------
> 
> I'm running it for a while, and it reduced the stress that baloo imposed \
> on my system and all my mails are indexed since I'm using it (wasn't the \
> case before). 
> 
> Thanks,
> 
> Christian Mollekopf
> 
> 


[Attachment #5 (text/html)]

<html>
 <body>
  <div style="font-family: Verdana, Arial, Helvetica, Sans-Serif;">
   <table bgcolor="#f9f3c9" width="100%" cellpadding="8" style="border: 1px \
#c9c399 solid;">  <tr>
     <td>
      This is an automatically generated e-mail. To reply, visit:
      <a href="https://git.reviewboard.kde.org/r/118231/">https://git.reviewboard.kde.org/r/118231/</a>
  </td>
    </tr>
   </table>
   <br />










<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; \
padding-left: 10px;">  <p style="margin-top: 0;">On May 22nd, 2014, 4 p.m. \
UTC, <b>Vishesh Handa</b> wrote:</p>  <blockquote style="margin-left: 1em; \
border-left: 2px solid #d0d0d0; padding-left: 10px;">  



<table width="100%" border="0" bgcolor="white" style="border: 1px solid \
#C0C0C0; border-collapse: collapse; margin: 2px padding: 2px;">  <thead>
  <tr>
   <th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid \
#C0C0C0; font-size: 9pt; padding: 4px 8px; text-align: left;">  <a \
href="https://git.reviewboard.kde.org/r/118231/diff/1/?file=273888#file273888line66" \
style="color: black; font-weight: bold; text-decoration: \
underline;">src/pim/agent/collectionindexingjob.h</a>  <span \
style="font-weight: normal;">

     (Diff revision 1)

    </span>
   </th>
  </tr>
 </thead>



 
 

 <tbody>

  <tr>
    <th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" \
align="right"><font size="2"></font></th>  <td bgcolor="#c5ffc4" \
width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; \
"></pre></td>  <th bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; \
border-right: 1px solid #C0C0C0;" align="right"><font \
size="2">66</font></th>  <td bgcolor="#c5ffc4" width="50%"><pre \
style="font-size: 8pt; line-height: 140%; margin: 0; ">    <span \
class="kt">bool</span> <span class="n">m_indexedUnindexed</span><span \
class="p">;</span></pre></td>  </tr>

 </tbody>

</table>

  <pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; \
white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">Whoa. This variable name is very very confusing. It&#39;s \
true, if there are indexed items which are unindexed? :O</pre>  \
</blockquote>





</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">Right, it&#39;s a lock to not repeatedly index not-yet-indexed \
items. It&#39;s necessary so we don&#39;t end up in an endless loop should \
we fail to index some items (for whatever reason). I&#39;ll think of a \
better name.</pre> <br />

<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; \
padding-left: 10px;">  <p style="margin-top: 0;">On May 22nd, 2014, 4 p.m. \
UTC, <b>Vishesh Handa</b> wrote:</p>  <blockquote style="margin-left: 1em; \
border-left: 2px solid #d0d0d0; padding-left: 10px;">  



<table width="100%" border="0" bgcolor="white" style="border: 1px solid \
#C0C0C0; border-collapse: collapse; margin: 2px padding: 2px;">  <thead>
  <tr>
   <th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid \
#C0C0C0; font-size: 9pt; padding: 4px 8px; text-align: left;">  <a \
href="https://git.reviewboard.kde.org/r/118231/diff/1/?file=273889#file273889line200" \
style="color: black; font-weight: bold; text-decoration: \
underline;">src/pim/agent/collectionindexingjob.cpp</a>  <span \
style="font-weight: normal;">

     (Diff revision 1)

    </span>
   </th>
  </tr>
 </thead>



 
 

 <tbody>

  <tr>
    <th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" \
align="right"><font size="2"></font></th>  <td bgcolor="#c5ffc4" \
width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; \
"></pre></td>  <th bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; \
border-right: 1px solid #C0C0C0;" align="right"><font \
size="2">200</font></th>  <td bgcolor="#c5ffc4" width="50%"><pre \
style="font-size: 8pt; line-height: 140%; margin: 0; ">    <span \
class="k">if</span> <span class="p">(</span><span class="o">!</span><span \
class="n">m_indexedItems</span><span class="p">.</span><span \
class="n">isEmpty</span><span class="p">())</span> <span \
class="p">{</span></pre></td>  </tr>

 </tbody>

</table>

  <pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; \
white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">I&#39;m sorry. I&#39;m a little confused as to how this could \
ever happen unless Akonadi is messed up and does not reliably send item \
removed notifications?</pre>  </blockquote>





</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">The whole point of this codepath is to recover when we miss \
something. There is no guarantee that we manage to process all signals (the \
akonadi server just fires a dbus signal and hopes for the best), so \
it&#39;s entirely possible that we end up in this state (and it has in fact \
happened to me). And if you clear the cache (for whatever reason), you \
definitely end up in this state.</pre> <br />

<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; \
padding-left: 10px;">  <p style="margin-top: 0;">On May 22nd, 2014, 4 p.m. \
UTC, <b>Vishesh Handa</b> wrote:</p>  <blockquote style="margin-left: 1em; \
border-left: 2px solid #d0d0d0; padding-left: 10px;">  



<table width="100%" border="0" bgcolor="white" style="border: 1px solid \
#C0C0C0; border-collapse: collapse; margin: 2px padding: 2px;">  <thead>
  <tr>
   <th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid \
#C0C0C0; font-size: 9pt; padding: 4px 8px; text-align: left;">  <a \
href="https://git.reviewboard.kde.org/r/118231/diff/1/?file=273890#file273890line33" \
style="color: black; font-weight: bold; text-decoration: \
underline;">src/pim/agent/index.h</a>  <span style="font-weight: normal;">

     (Diff revision 1)

    </span>
   </th>
  </tr>
 </thead>



 
 

 <tbody>

  <tr>
    <th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" \
align="right"><font size="2"></font></th>  <td bgcolor="#c5ffc4" \
width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; \
"></pre></td>  <th bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; \
border-right: 1px solid #C0C0C0;" align="right"><font \
size="2">33</font></th>  <td bgcolor="#c5ffc4" width="50%"><pre \
style="font-size: 8pt; line-height: 140%; margin: 0; "><span \
class="n">class</span> <span class="n">Index</span> <span \
class="o">:</span> <span class="n">public</span> <span \
class="n">QObject</span></pre></td>  </tr>

 </tbody>

</table>

  <pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; \
white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">I&#39;m not too happy with this classes name. How about \
&quot;Indexer&quot; instead? But then it would clash with \
&quot;AbstractIndexer&quot;.</pre>  </blockquote>





</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">I&#39;m open to other suggestions, but IMO Index is quite apt. \
The Index class encapsulates the &quot;index&quot; we&#39;re working \
against (which is actually consisting of many small indexes, but that&#39;s \
hidden). Indexer is imo suitable for the code we have that processes an \
item/collection. The idea with the &quot;index&quot; is that you &quot;put \
an item in the index&quot;, and what internally happens or where exactly \
this information ends up is irrelevant. Or the other way around, you query \
the index (not the indexer). So, what&#39;s wrong with Index? ;-)</pre>
<br />

<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; \
padding-left: 10px;">  <p style="margin-top: 0;">On May 22nd, 2014, 4 p.m. \
UTC, <b>Vishesh Handa</b> wrote:</p>  <blockquote style="margin-left: 1em; \
border-left: 2px solid #d0d0d0; padding-left: 10px;">  



<table width="100%" border="0" bgcolor="white" style="border: 1px solid \
#C0C0C0; border-collapse: collapse; margin: 2px padding: 2px;">  <thead>
  <tr>
   <th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid \
#C0C0C0; font-size: 9pt; padding: 4px 8px; text-align: left;">  <a \
href="https://git.reviewboard.kde.org/r/118231/diff/1/?file=273891#file273891line128" \
style="color: black; font-weight: bold; text-decoration: \
underline;">src/pim/agent/index.cpp</a>  <span style="font-weight: \
normal;">

     (Diff revision 1)

    </span>
   </th>
  </tr>
 </thead>



 
 

 <tbody>

  <tr>
    <th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" \
align="right"><font size="2"></font></th>  <td bgcolor="#c5ffc4" \
width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; \
"></pre></td>  <th bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; \
border-right: 1px solid #C0C0C0;" align="right"><font \
size="2">128</font></th>  <td bgcolor="#c5ffc4" width="50%"><pre \
style="font-size: 8pt; line-height: 140%; margin: 0; "><span \
class="kt">void</span> <span class="n">Index</span><span \
class="o">::</span><span class="n">reindex</span><span \
class="p">(</span><span class="k">const</span> <span \
class="n">Akonadi</span><span class="o">::</span><span \
class="n">Item</span><span class="o">&amp;</span> <span \
class="n">item</span><span class="p">)</span></pre></td>  </tr>

 </tbody>

</table>

  <pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; \
white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">Is this required? From a Xapian point of view you&#39;re \
spending an extra amount of time first removing the data and then adding it \
back again.

If you just index the item, Xapian will internally do a diff on the terms \
that have changed, and then just update those.</pre>  </blockquote>





</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">The original code was like that AFAIK so I kept it that way, \
but I&#39;ll change it then.</pre> <br />

<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; \
padding-left: 10px;">  <p style="margin-top: 0;">On May 22nd, 2014, 4 p.m. \
UTC, <b>Vishesh Handa</b> wrote:</p>  <blockquote style="margin-left: 1em; \
border-left: 2px solid #d0d0d0; padding-left: 10px;">  



<table width="100%" border="0" bgcolor="white" style="border: 1px solid \
#C0C0C0; border-collapse: collapse; margin: 2px padding: 2px;">  <thead>
  <tr>
   <th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid \
#C0C0C0; font-size: 9pt; padding: 4px 8px; text-align: left;">  <a \
href="https://git.reviewboard.kde.org/r/118231/diff/1/?file=273892#file273892line66" \
style="color: black; font-weight: bold; text-decoration: \
underline;">src/pim/agent/scheduler.h</a>  <span style="font-weight: \
normal;">

     (Diff revision 1)

    </span>
   </th>
  </tr>
 </thead>



 
 

 <tbody>

  <tr>
    <th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" \
align="right"><font size="2"></font></th>  <td bgcolor="#c5ffc4" \
width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; \
"></pre></td>  <th bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; \
border-right: 1px solid #C0C0C0;" align="right"><font \
size="2">66</font></th>  <td bgcolor="#c5ffc4" width="50%"><pre \
style="font-size: 8pt; line-height: 140%; margin: 0; ">    <span \
class="n">QMap</span><span class="o">&lt;</span><span \
class="n">Akonadi</span><span class="o">::</span><span \
class="n">Collection</span><span class="o">::</span><span \
class="n">Id</span><span class="p">,</span> <span \
class="n">QSharedPointer</span><span class="o">&lt;</span><span \
class="n">QQueue</span><span class="o">&lt;</span><span \
class="n">Akonadi</span><span class="o">::</span><span \
class="n">Item</span><span class="o">::</span><span \
class="n">Id</span><span class="o">&gt;</span> <span class="o">&gt;</span> \
<span class="o">&gt;</span> <span class="n">m_queues</span><span \
class="p">;</span></pre></td>  </tr>

 </tbody>

</table>

  <pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; \
white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">This is confusing. Perhaps some more documentation?

Also, couldn&#39;t you just directly do a

QMap&lt;Akonadi::Collection::ID, QQueue&lt;Akonadi::Item::ID&gt;&gt;

I&#39;m not sure what the shared pointer is doing.</pre>
 </blockquote>





</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">I think I overlooked the [] operator that allows me to access \
the queue by reference, and I wasn&#39;t sure whether QMap copies the \
values internally, but I suppose it shouldn&#39;t. The shared pointer is \
just for automatic lifetime management (no leaks guaranteed).

I suppose I can change that to just use a QQueue as you suggested.</pre>
<br />

<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; \
padding-left: 10px;">  <p style="margin-top: 0;">On May 22nd, 2014, 4 p.m. \
UTC, <b>Vishesh Handa</b> wrote:</p>  <blockquote style="margin-left: 1em; \
border-left: 2px solid #d0d0d0; padding-left: 10px;">  



<table width="100%" border="0" bgcolor="white" style="border: 1px solid \
#C0C0C0; border-collapse: collapse; margin: 2px padding: 2px;">  <thead>
  <tr>
   <th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid \
#C0C0C0; font-size: 9pt; padding: 4px 8px; text-align: left;">  <a \
href="https://git.reviewboard.kde.org/r/118231/diff/1/?file=273893#file273893line36" \
style="color: black; font-weight: bold; text-decoration: \
underline;">src/pim/agent/scheduler.cpp</a>  <span style="font-weight: \
normal;">

     (Diff revision 1)

    </span>
   </th>
  </tr>
 </thead>



 
 

 <tbody>

  <tr>
    <th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" \
align="right"><font size="2"></font></th>  <td bgcolor="#c5ffc4" \
width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; \
"></pre></td>  <th bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; \
border-right: 1px solid #C0C0C0;" align="right"><font \
size="2">36</font></th>  <td bgcolor="#c5ffc4" width="50%"><pre \
style="font-size: 8pt; line-height: 140%; margin: 0; ">    <span \
class="n">m_processTimer</span><span class="p">.</span><span \
class="n">setInterval</span><span class="p">(</span><span \
class="mi">100</span><span class="p">);</span></pre></td>  </tr>

 </tbody>

</table>

  <pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; \
white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">Why 100?</pre>  </blockquote>





</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">Random value for event compression.</pre> <br />

<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; \
padding-left: 10px;">  <p style="margin-top: 0;">On May 22nd, 2014, 4 p.m. \
UTC, <b>Vishesh Handa</b> wrote:</p>  <blockquote style="margin-left: 1em; \
border-left: 2px solid #d0d0d0; padding-left: 10px;">  



<table width="100%" border="0" bgcolor="white" style="border: 1px solid \
#C0C0C0; border-collapse: collapse; margin: 2px padding: 2px;">  <thead>
  <tr>
   <th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid \
#C0C0C0; font-size: 9pt; padding: 4px 8px; text-align: left;">  <a \
href="https://git.reviewboard.kde.org/r/118231/diff/1/?file=273893#file273893line46" \
style="color: black; font-weight: bold; text-decoration: \
underline;">src/pim/agent/scheduler.cpp</a>  <span style="font-weight: \
normal;">

     (Diff revision 1)

    </span>
   </th>
  </tr>
 </thead>



 
 

 <tbody>

  <tr>
    <th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" \
align="right"><font size="2"></font></th>  <td bgcolor="#c5ffc4" \
width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; \
"></pre></td>  <th bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; \
border-right: 1px solid #C0C0C0;" align="right"><font \
size="2">46</font></th>  <td bgcolor="#c5ffc4" width="50%"><pre \
style="font-size: 8pt; line-height: 140%; margin: 0; ">        <span \
class="n">scheduleCollection</span><span class="p">(</span><span \
class="n">Akonadi</span><span class="o">::</span><span \
class="n">Collection</span><span class="p">(</span><span \
class="n">col</span><span class="p">),</span> <span \
class="nb">true</span><span class="p">);</span></pre></td>  </tr>

 </tbody>

</table>

  <pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; \
white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">This is quite dangerous. If an email is not indexed it results \
in the entire collection being sync and all a full collection fetch job \
going on.

I&#39;m very much against this. I get enough angry emails about how the \
baloo indexer is sucking all their cpu.</pre>  </blockquote>





</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">Either this or you only have half your emails indexed. Note \
that this codepath is only triggered if we fail to index items in a timely \
fashion, and since indexing is fairly fast that should seldomly be the \
case, right?</pre> <br />

<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; \
padding-left: 10px;">  <p style="margin-top: 0;">On May 22nd, 2014, 4 p.m. \
UTC, <b>Vishesh Handa</b> wrote:</p>  <blockquote style="margin-left: 1em; \
border-left: 2px solid #d0d0d0; padding-left: 10px;">  



<table width="100%" border="0" bgcolor="white" style="border: 1px solid \
#C0C0C0; border-collapse: collapse; margin: 2px padding: 2px;">  <thead>
  <tr>
   <th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid \
#C0C0C0; font-size: 9pt; padding: 4px 8px; text-align: left;">  <a \
href="https://git.reviewboard.kde.org/r/118231/diff/1/?file=273893#file273893line53" \
style="color: black; font-weight: bold; text-decoration: \
underline;">src/pim/agent/scheduler.cpp</a>  <span style="font-weight: \
normal;">

     (Diff revision 1)

    </span>
   </th>
  </tr>
 </thead>



 
 

 <tbody>

  <tr>
    <th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" \
align="right"><font size="2"></font></th>  <td bgcolor="#c5ffc4" \
width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; \
"></pre></td>  <th bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; \
border-right: 1px solid #C0C0C0;" align="right"><font \
size="2">53</font></th>  <td bgcolor="#c5ffc4" width="50%"><pre \
style="font-size: 8pt; line-height: 140%; margin: 0; ">    <span \
class="n">group</span><span class="p">.</span><span \
class="n">writeEntry</span><span class="p">(</span><span \
class="s">&quot;initialIndexing&quot;</span><span class="p">,</span> <span \
class="nb">true</span><span class="p">);</span></pre></td>  </tr>

 </tbody>

</table>

  <pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; \
white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">So, even if someone switched off their system before the \
initial indexing was done, we mark the initial indexing as completed?</pre> \
</blockquote>





</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">The not-yet completed collections are remembered as part of \
the &quot;dirty collections&quot; which should take care of this.</pre> <br \
/>

<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; \
padding-left: 10px;">  <p style="margin-top: 0;">On May 22nd, 2014, 4 p.m. \
UTC, <b>Vishesh Handa</b> wrote:</p>  <blockquote style="margin-left: 1em; \
border-left: 2px solid #d0d0d0; padding-left: 10px;">  



<table width="100%" border="0" bgcolor="white" style="border: 1px solid \
#C0C0C0; border-collapse: collapse; margin: 2px padding: 2px;">  <thead>
  <tr>
   <th colspan="4" bgcolor="#F0F0F0" style="border-bottom: 1px solid \
#C0C0C0; font-size: 9pt; padding: 4px 8px; text-align: left;">  <a \
href="https://git.reviewboard.kde.org/r/118231/diff/1/?file=273893#file273893line97" \
style="color: black; font-weight: bold; text-decoration: \
underline;">src/pim/agent/scheduler.cpp</a>  <span style="font-weight: \
normal;">

     (Diff revision 1)

    </span>
   </th>
  </tr>
 </thead>



 
 

 <tbody>

  <tr>
    <th bgcolor="#b1ebb0" style="border-right: 1px solid #C0C0C0;" \
align="right"><font size="2"></font></th>  <td bgcolor="#c5ffc4" \
width="50%"><pre style="font-size: 8pt; line-height: 140%; margin: 0; \
"></pre></td>  <th bgcolor="#b1ebb0" style="border-left: 1px solid #C0C0C0; \
border-right: 1px solid #C0C0C0;" align="right"><font \
size="2">97</font></th>  <td bgcolor="#c5ffc4" width="50%"><pre \
style="font-size: 8pt; line-height: 140%; margin: 0; "><span \
class="kt">void</span> <span class="n">Scheduler</span><span \
class="o">::</span><span class="n">addItem</span><span \
class="p">(</span><span class="k">const</span> <span \
class="n">Akonadi</span><span class="o">::</span><span \
class="n">Item</span> <span class="o">&amp;</span><span \
class="n">item</span><span class="p">)</span></pre></td>  </tr>

 </tbody>

</table>

  <pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; \
white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">So each time an item is added, you pass the parent collection \
to the CollectionIndexingJob and there you fetch the entire collection \
(+statistics) and do a query on Xapian to check how many items are already \
indexed?

Arguably, the xapian query would be quite fast, but still.</pre>
 </blockquote>





</blockquote>
<pre style="margin-left: 1em; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">That could be optimized as we only require up-to date \
statistics for a full sync, but I really doubt that fetching a collection \
will have any noticeable performance impact. </pre> <br />




<p>- Christian</p>


<br />
<p>On May 20th, 2014, 10:22 p.m. UTC, Christian Mollekopf wrote:</p>








<table bgcolor="#fefadf" width="100%" cellspacing="0" cellpadding="8" \
style="background-image: \
url('https://git.reviewboard.kde.org/static/rb/images/review_request_box_top_bg.ab6f3b1072c9.png'); \
background-position: left top; background-repeat: repeat-x; border: 1px \
black solid;">  <tr>
  <td>

<div>Review request for Baloo and Vishesh Handa.</div>
<div>By Christian Mollekopf.</div>


<p style="color: grey;"><i>Updated May 20, 2014, 10:22 p.m.</i></p>









<div style="margin-top: 1.5em;">
 <b style="color: #575012; font-size: 10pt;">Repository: </b>
baloo
</div>


<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Description \
</h1>  <table width="100%" bgcolor="#ffffff" cellspacing="0" \
cellpadding="10" style="border: 1px solid #b8b5a0">  <tr>
  <td>
   <pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
                break-word;">A scheduler for baloo: 
* delays the indexing until no new item has been added for at least 5 \
                seconds to avoid indexing during a collection sync.
* remembers if it failed to index something and triggers recovery path on \
                next start.
* supports manual triggering of recovery path if required.</pre>
  </td>
 </tr>
</table>


<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Testing \
</h1> <table width="100%" bgcolor="#ffffff" cellspacing="0" \
cellpadding="10" style="border: 1px solid #b8b5a0">  <tr>
  <td>
   <pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">I&#39;m running it for a while, and it reduced the stress that \
baloo imposed on my system and all my mails are indexed since I&#39;m using \
it (wasn&#39;t the case before).</pre>  </td>
 </tr>
</table>


<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Diffs</b> \
</h1> <ul style="margin-left: 3em; padding-left: 0;">

 <li>src/pim/agent/CMakeLists.txt <span style="color: \
grey">(e917915a3414738595caea5497859ef4810ec44c)</span></li>

 <li>src/pim/agent/agent.h <span style="color: \
grey">(1dbf0fc0a16d0615dbfa4878706359bb687facd0)</span></li>

 <li>src/pim/agent/agent.cpp <span style="color: \
grey">(8904d49d3579b58b634d2570fbcc8007e5ee41ed)</span></li>

 <li>src/pim/agent/collectionindexingjob.h <span style="color: \
grey">(PRE-CREATION)</span></li>

 <li>src/pim/agent/collectionindexingjob.cpp <span style="color: \
grey">(PRE-CREATION)</span></li>

 <li>src/pim/agent/index.h <span style="color: \
grey">(PRE-CREATION)</span></li>

 <li>src/pim/agent/index.cpp <span style="color: \
grey">(PRE-CREATION)</span></li>

 <li>src/pim/agent/scheduler.h <span style="color: \
grey">(PRE-CREATION)</span></li>

 <li>src/pim/agent/scheduler.cpp <span style="color: \
grey">(PRE-CREATION)</span></li>

</ul>

<p><a href="https://git.reviewboard.kde.org/r/118231/diff/" \
style="margin-left: 3em;">View Diff</a></p>







  </td>
 </tr>
</table>








  </div>
 </body>
</html>



>> Visit http://mail.kde.org/mailman/listinfo/kde-devel#unsub to unsubscribe <<


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic