[prev in list] [next in list] [prev in thread] [next in thread] 

List:       nepomuk
Subject:    Re: [Nepomuk] Review Request 111274: Add positional information to Nepomuk2::Query::Term
From:       "Sebastian Trueg" <sebastian () trueg ! de>
Date:       2013-09-12 7:08:31
Message-ID: 20130912070831.18928.19818 () vidsolbach ! de
[Download RAW message or body]

[Attachment #2 (multipart/alternative)]


> On July 29, 2013, 4:04 p.m., Vishesh Handa wrote:
> > The main thing that I dislike about this patch is the fact that it adds methods \
> > that are very specific to a certain use case in a public API. Do you think you \
> > could just make it work with a setUserData(void* ) or something similar? That way \
> > anyone who needs to store additional info in a Term can use that.
> 
> Denis Steckelmacher wrote:
> It is a good idea, but how do we handle the case of several bits of code that each \
> want to call setUserData ? For instance, a library can get a Nepomuk2::Query parsed \
> using the new query parser (that has called setUserData for every term), and it may \
> also want to store something. This problem can be solved by using setUserData(const \
> QString &key, void *), but this would be slow and heavy. 
> There is also the case of code that want to access the positional information. If \
> the code is in Nepomuk, it will have to include a private header describing the \
> user data format used by the parser (and this format will have to be \
> binary-compatible between versions if user can update nepomuk-widgets without \
> updating nepomuk-core, or vice versa). 
> Another possible solution is to keep the positional information in TermPrivate, and \
> to put a "friend class Nepomuk2::QueryBuilder" at the top of Term. This way, there \
> is nothing exposed in the public API, and QueryBuilder can access the positional \
> information through the d-pointer of Term. 
> Denis Steckelmacher wrote:
> Further reflection made me think about two possibles ways to enhance this patch :
> 
> * Add three methods : bool setUserData(TermUserData *), void clearUserData() and \
> TermUserData *userData(). The goal is to prevent an application to overwrite the \
> user data of a library by putting checks against that : setUserData will only \
> succeed if there were previously no user data associated with the term. \
> clearUserData removes the user data of the term (and deletes it, it is the reason \
> why a TermUserData class having a virtual destructor is used). userData simply \
>                 returns the user data currently associated with the term.
> * Add two methods : void setUserData(const QString &key, void *) and void \
> *userData(const QString &key). Here, a string key is used to disambiguate the user \
> data of different users of the Term class. This would require to add a \
> QHash<QString, void *> in TermPrivate. 
> Personally, I prefer the first solution, as it is more lightweight that the second \
> one (TermPrivate still contains only a pointer). I think it can be used because it \
> will be very rare that more than one user of the Term class will need to store user \
> data in it. If it is the case, then one of the users can find a way to be \
> compatible with the other (for instance, user A can replace its user data with the \
> one of user B just before calling a user B method on the term). 
> What do you think ?
> 
> Sebastian Trueg wrote:
> How about setUserData(const QString& name, const QVariant& data)?
> Seems more Qt'ish and allows for multiple additions of user data.
> 
> Denis Steckelmacher wrote:
> That's nice, but is it okay to add a QHash<QString, QVariant> in TermPrivate ? This \
> class currently is lightweight and small, and I don't know if there are \
> applications that use thousands of Terms and need them to be as small as possible.

AFAIK QHash is pretty lightweight also, it is not? It is implicitely shared, meaning \
that an empty hash is basically just a pointer?


- Sebastian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/111274/#review36754
-----------------------------------------------------------


On June 27, 2013, 6:45 p.m., Denis Steckelmacher wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://git.reviewboard.kde.org/r/111274/
> -----------------------------------------------------------
> 
> (Updated June 27, 2013, 6:45 p.m.)
> 
> 
> Review request for Nepomuk.
> 
> 
> Description
> -------
> 
> This patch adds two fields to Nepomuk2::Query::Term: "position" and "length". By \
> default, they are zero and everything works as before. This change is binary- and \
> source-compatible. 
> The in-development Nepomuk query parser (GSoC 2013 project) uses these fields to \
> map parsed terms to the original user query string. By traversing the parse tree (a \
> tree of And, Or and Comparison terms), it is therefore possible to match every term \
> with a piece of input, that can be highlighted accordingly. For instance, each \
> ComparisonTerm can have a different color, its subTerm being rendered using a bold \
> font (like in "sent to *Jimmy*"). 
> This patch also makes the QuerySerializer serialize the positional information, for \
> debug purposes. This information is not read-back by the deserializer, as the \
> modification would be fairly intrusive, and there is no need to deserialize \
> positional information if the original user query is lost. 
> 
> Diffs
> -----
> 
> libnepomukcore/query/queryserializer.cpp 7ac6131 
> libnepomukcore/query/term.h 114baa3 
> libnepomukcore/query/term.cpp cfb41f4 
> libnepomukcore/query/term_p.h 6d92b48 
> 
> Diff: http://git.reviewboard.kde.org/r/111274/diff/
> 
> 
> Testing
> -------
> 
> The query parser (https://github.com/steckdenis/nepomukqueryparser) is able to \
> store positional information into terms, and this information is valid. The query \
> "files related to mails sent by John, dated of last week" produces the following \
> term tree: 
> <and position="0" length="55">
> <type position="0" length="5" uri="FileDataObject"/>
> <comparison position="6" length="30" property="relatedTo" comparator="=" \
> inverted="false"> <and position="0" length="35">
> <type position="17" length="5" uri="Message"/>
> <comparison position="23" length="12" property="messageFrom" comparator=":" \
> inverted="false"> <literal position="31" length="4" datatype="string">
> John
> </literal>
> </comparison>
> </and>
> </comparison>
> <and position="46" length="9">
> <comparison position="46" length="9" property="fileLastModified" comparator="&gt;=" \
> inverted="false"> <literal position="46" length="9" datatype="dateTime">
> 2013-06-16T22:00:00.002Z
> </literal>
> </comparison>
> <comparison position="46" length="9" property="fileLastModified" comparator="&lt;=" \
> inverted="false"> <literal position="46" length="9" datatype="dateTime">
> 2013-06-23T20:00:00.002Z
> </literal>
> </comparison>
> </and>
> </and>
> 
> 
> Thanks,
> 
> Denis Steckelmacher
> 
> 


[Attachment #5 (text/html)]

<html>
 <body>
  <div style="font-family: Verdana, Arial, Helvetica, Sans-Serif;">
   <table bgcolor="#f9f3c9" width="100%" cellpadding="8" style="border: 1px #c9c399 \
solid;">  <tr>
     <td>
      This is an automatically generated e-mail. To reply, visit:
      <a href="http://git.reviewboard.kde.org/r/111274/">http://git.reviewboard.kde.org/r/111274/</a>
  </td>
    </tr>
   </table>
   <br />





<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: \
10px;">  <p style="margin-top: 0;">On July 29th, 2013, 4:04 p.m. UTC, <b>Vishesh \
Handa</b> wrote:</p>  <blockquote style="margin-left: 1em; border-left: 2px solid \
#d0d0d0; padding-left: 10px;">  <pre style="white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">The main thing that I dislike about this patch is the fact that it adds \
methods that are very specific to a certain use case in a public API. Do you think \
you could just make it work with a setUserData(void* ) or something similar? That way \
anyone who needs to store additional info in a Term can use that.</pre>  \
</blockquote>




 <p>On July 29th, 2013, 5:12 p.m. UTC, <b>Denis Steckelmacher</b> wrote:</p>
 <blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: \
10px;">  <pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">It is a good idea, but \
how do we handle the case of several bits of code that each want to call setUserData \
? For instance, a library can get a Nepomuk2::Query parsed using the new query parser \
(that has called setUserData for every term), and it may also want to store \
something. This problem can be solved by using setUserData(const QString &amp;key, \
void *), but this would be slow and heavy.

There is also the case of code that want to access the positional information. If the \
code is in Nepomuk, it will have to include a private header describing the user data \
format used by the parser (and this format will have to be binary-compatible between \
versions if user can update nepomuk-widgets without updating nepomuk-core, or vice \
versa).

Another possible solution is to keep the positional information in TermPrivate, and \
to put a &quot;friend class Nepomuk2::QueryBuilder&quot; at the top of Term. This \
way, there is nothing exposed in the public API, and QueryBuilder can access the \
positional information through the d-pointer of Term.</pre>  </blockquote>





 <p>On September 7th, 2013, 1:06 p.m. UTC, <b>Denis Steckelmacher</b> wrote:</p>
 <blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: \
10px;">  <pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Further reflection made \
me think about two possibles ways to enhance this patch :

* Add three methods : bool setUserData(TermUserData *), void clearUserData() and \
TermUserData *userData(). The goal is to prevent an application to overwrite the user \
data of a library by putting checks against that : setUserData will only succeed if \
there were previously no user data associated with the term. clearUserData removes \
the user data of the term (and deletes it, it is the reason why a TermUserData class \
having a virtual destructor is used). userData simply returns the user data currently \
                associated with the term.
* Add two methods : void setUserData(const QString &amp;key, void *) and void \
*userData(const QString &amp;key). Here, a string key is used to disambiguate the \
user data of different users of the Term class. This would require to add a \
QHash&lt;QString, void *&gt; in TermPrivate.

Personally, I prefer the first solution, as it is more lightweight that the second \
one (TermPrivate still contains only a pointer). I think it can be used because it \
will be very rare that more than one user of the Term class will need to store user \
data in it. If it is the case, then one of the users can find a way to be compatible \
with the other (for instance, user A can replace its user data with the one of user B \
just before calling a user B method on the term).

What do you think ?</pre>
 </blockquote>





 <p>On September 10th, 2013, 1:55 p.m. UTC, <b>Sebastian Trueg</b> wrote:</p>
 <blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: \
10px;">  <pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">How about \
setUserData(const QString&amp; name, const QVariant&amp; data)? Seems more Qt&#39;ish \
and allows for multiple additions of user data.</pre>  </blockquote>





 <p>On September 11th, 2013, 3:15 p.m. UTC, <b>Denis Steckelmacher</b> wrote:</p>
 <blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: \
10px;">  <pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">That&#39;s nice, but is \
it okay to add a QHash&lt;QString, QVariant&gt; in TermPrivate ? This class currently \
is lightweight and small, and I don&#39;t know if there are applications that use \
thousands of Terms and need them to be as small as possible.</pre>  </blockquote>








</blockquote>

<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">AFAIK QHash is pretty \
lightweight also, it is not? It is implicitely shared, meaning that an empty hash is \
basically just a pointer?</pre> <br />










<p>- Sebastian</p>


<br />
<p>On June 27th, 2013, 6:45 p.m. UTC, Denis Steckelmacher wrote:</p>








<table bgcolor="#fefadf" width="100%" cellspacing="0" cellpadding="8" \
style="background-image: \
url('http://git.reviewboard.kde.org/static/rb/images/review_request_box_top_bg.ab6f3b1072c9.png'); \
background-position: left top; background-repeat: repeat-x; border: 1px black \
solid;">  <tr>
  <td>

<div>Review request for Nepomuk.</div>
<div>By Denis Steckelmacher.</div>


<p style="color: grey;"><i>Updated June 27, 2013, 6:45 p.m.</i></p>






<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Description </h1>
 <table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" \
style="border: 1px solid #b8b5a0">  <tr>
  <td>
   <pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">This patch adds two fields to Nepomuk2::Query::Term: \
&quot;position&quot; and &quot;length&quot;. By default, they are zero and everything \
works as before. This change is binary- and source-compatible.

The in-development Nepomuk query parser (GSoC 2013 project) uses these fields to map \
parsed terms to the original user query string. By traversing the parse tree (a tree \
of And, Or and Comparison terms), it is therefore possible to match every term with a \
piece of input, that can be highlighted accordingly. For instance, each \
ComparisonTerm can have a different color, its subTerm being rendered using a bold \
font (like in &quot;sent to *Jimmy*&quot;).

This patch also makes the QuerySerializer serialize the positional information, for \
debug purposes. This information is not read-back by the deserializer, as the \
modification would be fairly intrusive, and there is no need to deserialize \
positional information if the original user query is lost.</pre>  </td>
 </tr>
</table>


<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Testing </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: \
1px solid #b8b5a0">  <tr>
  <td>
   <pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">The query parser (https://github.com/steckdenis/nepomukqueryparser) is \
able to store positional information into terms, and this information is valid. The \
query &quot;files related to mails sent by John, dated of last week&quot; produces \
the following term tree:

&lt;and position=&quot;0&quot; length=&quot;55&quot;&gt;
    &lt;type position=&quot;0&quot; length=&quot;5&quot; \
uri=&quot;FileDataObject&quot;/&gt;  &lt;comparison position=&quot;6&quot; \
length=&quot;30&quot; property=&quot;relatedTo&quot; comparator=&quot;=&quot; \
inverted=&quot;false&quot;&gt;  &lt;and position=&quot;0&quot; \
                length=&quot;35&quot;&gt;
            &lt;type position=&quot;17&quot; length=&quot;5&quot; \
                uri=&quot;Message&quot;/&gt;
            &lt;comparison position=&quot;23&quot; length=&quot;12&quot; \
property=&quot;messageFrom&quot; comparator=&quot;:&quot; \
                inverted=&quot;false&quot;&gt;
                &lt;literal position=&quot;31&quot; length=&quot;4&quot; \
datatype=&quot;string&quot;&gt;  John
                &lt;/literal&gt;
            &lt;/comparison&gt;
        &lt;/and&gt;
    &lt;/comparison&gt;
    &lt;and position=&quot;46&quot; length=&quot;9&quot;&gt;
        &lt;comparison position=&quot;46&quot; length=&quot;9&quot; \
property=&quot;fileLastModified&quot; comparator=&quot;&amp;gt;=&quot; \
                inverted=&quot;false&quot;&gt;
            &lt;literal position=&quot;46&quot; length=&quot;9&quot; \
datatype=&quot;dateTime&quot;&gt;  2013-06-16T22:00:00.002Z
            &lt;/literal&gt;
        &lt;/comparison&gt;
        &lt;comparison position=&quot;46&quot; length=&quot;9&quot; \
property=&quot;fileLastModified&quot; comparator=&quot;&amp;lt;=&quot; \
                inverted=&quot;false&quot;&gt;
            &lt;literal position=&quot;46&quot; length=&quot;9&quot; \
datatype=&quot;dateTime&quot;&gt;  2013-06-23T20:00:00.002Z
            &lt;/literal&gt;
        &lt;/comparison&gt;
    &lt;/and&gt;
&lt;/and&gt;</pre>
  </td>
 </tr>
</table>




<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Diffs</b> </h1>
<ul style="margin-left: 3em; padding-left: 0;">

 <li>libnepomukcore/query/queryserializer.cpp <span style="color: \
grey">(7ac6131)</span></li>

 <li>libnepomukcore/query/term.h <span style="color: grey">(114baa3)</span></li>

 <li>libnepomukcore/query/term.cpp <span style="color: grey">(cfb41f4)</span></li>

 <li>libnepomukcore/query/term_p.h <span style="color: grey">(6d92b48)</span></li>

</ul>

<p><a href="http://git.reviewboard.kde.org/r/111274/diff/" style="margin-left: \
3em;">View Diff</a></p>







  </td>
 </tr>
</table>








  </div>
 </body>
</html>



_______________________________________________
Nepomuk mailing list
Nepomuk@kde.org
https://mail.kde.org/mailman/listinfo/nepomuk


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic