[prev in list] [next in list] [prev in thread] [next in thread]
List: nepomuk
Subject: Re: [Nepomuk] Review Request 111274: Add positional information to Nepomuk2::Query::Term
From: "Sebastian Trueg" <sebastian () trueg ! de>
Date: 2013-09-12 7:08:31
Message-ID: 20130912070831.18928.19818 () vidsolbach ! de
[Download RAW message or body]
[Attachment #2 (multipart/alternative)]
> On July 29, 2013, 4:04 p.m., Vishesh Handa wrote:
> > The main thing that I dislike about this patch is the fact that it adds methods \
> > that are very specific to a certain use case in a public API. Do you think you \
> > could just make it work with a setUserData(void* ) or something similar? That way \
> > anyone who needs to store additional info in a Term can use that.
>
> Denis Steckelmacher wrote:
> It is a good idea, but how do we handle the case of several bits of code that each \
> want to call setUserData ? For instance, a library can get a Nepomuk2::Query parsed \
> using the new query parser (that has called setUserData for every term), and it may \
> also want to store something. This problem can be solved by using setUserData(const \
> QString &key, void *), but this would be slow and heavy.
> There is also the case of code that want to access the positional information. If \
> the code is in Nepomuk, it will have to include a private header describing the \
> user data format used by the parser (and this format will have to be \
> binary-compatible between versions if user can update nepomuk-widgets without \
> updating nepomuk-core, or vice versa).
> Another possible solution is to keep the positional information in TermPrivate, and \
> to put a "friend class Nepomuk2::QueryBuilder" at the top of Term. This way, there \
> is nothing exposed in the public API, and QueryBuilder can access the positional \
> information through the d-pointer of Term.
> Denis Steckelmacher wrote:
> Further reflection made me think about two possibles ways to enhance this patch :
>
> * Add three methods : bool setUserData(TermUserData *), void clearUserData() and \
> TermUserData *userData(). The goal is to prevent an application to overwrite the \
> user data of a library by putting checks against that : setUserData will only \
> succeed if there were previously no user data associated with the term. \
> clearUserData removes the user data of the term (and deletes it, it is the reason \
> why a TermUserData class having a virtual destructor is used). userData simply \
> returns the user data currently associated with the term.
> * Add two methods : void setUserData(const QString &key, void *) and void \
> *userData(const QString &key). Here, a string key is used to disambiguate the user \
> data of different users of the Term class. This would require to add a \
> QHash<QString, void *> in TermPrivate.
> Personally, I prefer the first solution, as it is more lightweight that the second \
> one (TermPrivate still contains only a pointer). I think it can be used because it \
> will be very rare that more than one user of the Term class will need to store user \
> data in it. If it is the case, then one of the users can find a way to be \
> compatible with the other (for instance, user A can replace its user data with the \
> one of user B just before calling a user B method on the term).
> What do you think ?
>
> Sebastian Trueg wrote:
> How about setUserData(const QString& name, const QVariant& data)?
> Seems more Qt'ish and allows for multiple additions of user data.
>
> Denis Steckelmacher wrote:
> That's nice, but is it okay to add a QHash<QString, QVariant> in TermPrivate ? This \
> class currently is lightweight and small, and I don't know if there are \
> applications that use thousands of Terms and need them to be as small as possible.
AFAIK QHash is pretty lightweight also, it is not? It is implicitely shared, meaning \
that an empty hash is basically just a pointer?
- Sebastian
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/111274/#review36754
-----------------------------------------------------------
On June 27, 2013, 6:45 p.m., Denis Steckelmacher wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://git.reviewboard.kde.org/r/111274/
> -----------------------------------------------------------
>
> (Updated June 27, 2013, 6:45 p.m.)
>
>
> Review request for Nepomuk.
>
>
> Description
> -------
>
> This patch adds two fields to Nepomuk2::Query::Term: "position" and "length". By \
> default, they are zero and everything works as before. This change is binary- and \
> source-compatible.
> The in-development Nepomuk query parser (GSoC 2013 project) uses these fields to \
> map parsed terms to the original user query string. By traversing the parse tree (a \
> tree of And, Or and Comparison terms), it is therefore possible to match every term \
> with a piece of input, that can be highlighted accordingly. For instance, each \
> ComparisonTerm can have a different color, its subTerm being rendered using a bold \
> font (like in "sent to *Jimmy*").
> This patch also makes the QuerySerializer serialize the positional information, for \
> debug purposes. This information is not read-back by the deserializer, as the \
> modification would be fairly intrusive, and there is no need to deserialize \
> positional information if the original user query is lost.
>
> Diffs
> -----
>
> libnepomukcore/query/queryserializer.cpp 7ac6131
> libnepomukcore/query/term.h 114baa3
> libnepomukcore/query/term.cpp cfb41f4
> libnepomukcore/query/term_p.h 6d92b48
>
> Diff: http://git.reviewboard.kde.org/r/111274/diff/
>
>
> Testing
> -------
>
> The query parser (https://github.com/steckdenis/nepomukqueryparser) is able to \
> store positional information into terms, and this information is valid. The query \
> "files related to mails sent by John, dated of last week" produces the following \
> term tree:
> <and position="0" length="55">
> <type position="0" length="5" uri="FileDataObject"/>
> <comparison position="6" length="30" property="relatedTo" comparator="=" \
> inverted="false"> <and position="0" length="35">
> <type position="17" length="5" uri="Message"/>
> <comparison position="23" length="12" property="messageFrom" comparator=":" \
> inverted="false"> <literal position="31" length="4" datatype="string">
> John
> </literal>
> </comparison>
> </and>
> </comparison>
> <and position="46" length="9">
> <comparison position="46" length="9" property="fileLastModified" comparator=">=" \
> inverted="false"> <literal position="46" length="9" datatype="dateTime">
> 2013-06-16T22:00:00.002Z
> </literal>
> </comparison>
> <comparison position="46" length="9" property="fileLastModified" comparator="<=" \
> inverted="false"> <literal position="46" length="9" datatype="dateTime">
> 2013-06-23T20:00:00.002Z
> </literal>
> </comparison>
> </and>
> </and>
>
>
> Thanks,
>
> Denis Steckelmacher
>
>
[Attachment #5 (text/html)]
<html>
<body>
<div style="font-family: Verdana, Arial, Helvetica, Sans-Serif;">
<table bgcolor="#f9f3c9" width="100%" cellpadding="8" style="border: 1px #c9c399 \
solid;"> <tr>
<td>
This is an automatically generated e-mail. To reply, visit:
<a href="http://git.reviewboard.kde.org/r/111274/">http://git.reviewboard.kde.org/r/111274/</a>
</td>
</tr>
</table>
<br />
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: \
10px;"> <p style="margin-top: 0;">On July 29th, 2013, 4:04 p.m. UTC, <b>Vishesh \
Handa</b> wrote:</p> <blockquote style="margin-left: 1em; border-left: 2px solid \
#d0d0d0; padding-left: 10px;"> <pre style="white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">The main thing that I dislike about this patch is the fact that it adds \
methods that are very specific to a certain use case in a public API. Do you think \
you could just make it work with a setUserData(void* ) or something similar? That way \
anyone who needs to store additional info in a Term can use that.</pre> \
</blockquote>
<p>On July 29th, 2013, 5:12 p.m. UTC, <b>Denis Steckelmacher</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: \
10px;"> <pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">It is a good idea, but \
how do we handle the case of several bits of code that each want to call setUserData \
? For instance, a library can get a Nepomuk2::Query parsed using the new query parser \
(that has called setUserData for every term), and it may also want to store \
something. This problem can be solved by using setUserData(const QString &key, \
void *), but this would be slow and heavy.
There is also the case of code that want to access the positional information. If the \
code is in Nepomuk, it will have to include a private header describing the user data \
format used by the parser (and this format will have to be binary-compatible between \
versions if user can update nepomuk-widgets without updating nepomuk-core, or vice \
versa).
Another possible solution is to keep the positional information in TermPrivate, and \
to put a "friend class Nepomuk2::QueryBuilder" at the top of Term. This \
way, there is nothing exposed in the public API, and QueryBuilder can access the \
positional information through the d-pointer of Term.</pre> </blockquote>
<p>On September 7th, 2013, 1:06 p.m. UTC, <b>Denis Steckelmacher</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: \
10px;"> <pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">Further reflection made \
me think about two possibles ways to enhance this patch :
* Add three methods : bool setUserData(TermUserData *), void clearUserData() and \
TermUserData *userData(). The goal is to prevent an application to overwrite the user \
data of a library by putting checks against that : setUserData will only succeed if \
there were previously no user data associated with the term. clearUserData removes \
the user data of the term (and deletes it, it is the reason why a TermUserData class \
having a virtual destructor is used). userData simply returns the user data currently \
associated with the term.
* Add two methods : void setUserData(const QString &key, void *) and void \
*userData(const QString &key). Here, a string key is used to disambiguate the \
user data of different users of the Term class. This would require to add a \
QHash<QString, void *> in TermPrivate.
Personally, I prefer the first solution, as it is more lightweight that the second \
one (TermPrivate still contains only a pointer). I think it can be used because it \
will be very rare that more than one user of the Term class will need to store user \
data in it. If it is the case, then one of the users can find a way to be compatible \
with the other (for instance, user A can replace its user data with the one of user B \
just before calling a user B method on the term).
What do you think ?</pre>
</blockquote>
<p>On September 10th, 2013, 1:55 p.m. UTC, <b>Sebastian Trueg</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: \
10px;"> <pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">How about \
setUserData(const QString& name, const QVariant& data)? Seems more Qt'ish \
and allows for multiple additions of user data.</pre> </blockquote>
<p>On September 11th, 2013, 3:15 p.m. UTC, <b>Denis Steckelmacher</b> wrote:</p>
<blockquote style="margin-left: 1em; border-left: 2px solid #d0d0d0; padding-left: \
10px;"> <pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">That's nice, but is \
it okay to add a QHash<QString, QVariant> in TermPrivate ? This class currently \
is lightweight and small, and I don't know if there are applications that use \
thousands of Terms and need them to be as small as possible.</pre> </blockquote>
</blockquote>
<pre style="white-space: pre-wrap; white-space: -moz-pre-wrap; white-space: \
-pre-wrap; white-space: -o-pre-wrap; word-wrap: break-word;">AFAIK QHash is pretty \
lightweight also, it is not? It is implicitely shared, meaning that an empty hash is \
basically just a pointer?</pre> <br />
<p>- Sebastian</p>
<br />
<p>On June 27th, 2013, 6:45 p.m. UTC, Denis Steckelmacher wrote:</p>
<table bgcolor="#fefadf" width="100%" cellspacing="0" cellpadding="8" \
style="background-image: \
url('http://git.reviewboard.kde.org/static/rb/images/review_request_box_top_bg.ab6f3b1072c9.png'); \
background-position: left top; background-repeat: repeat-x; border: 1px black \
solid;"> <tr>
<td>
<div>Review request for Nepomuk.</div>
<div>By Denis Steckelmacher.</div>
<p style="color: grey;"><i>Updated June 27, 2013, 6:45 p.m.</i></p>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Description </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" \
style="border: 1px solid #b8b5a0"> <tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">This patch adds two fields to Nepomuk2::Query::Term: \
"position" and "length". By default, they are zero and everything \
works as before. This change is binary- and source-compatible.
The in-development Nepomuk query parser (GSoC 2013 project) uses these fields to map \
parsed terms to the original user query string. By traversing the parse tree (a tree \
of And, Or and Comparison terms), it is therefore possible to match every term with a \
piece of input, that can be highlighted accordingly. For instance, each \
ComparisonTerm can have a different color, its subTerm being rendered using a bold \
font (like in "sent to *Jimmy*").
This patch also makes the QuerySerializer serialize the positional information, for \
debug purposes. This information is not read-back by the deserializer, as the \
modification would be fairly intrusive, and there is no need to deserialize \
positional information if the original user query is lost.</pre> </td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Testing </h1>
<table width="100%" bgcolor="#ffffff" cellspacing="0" cellpadding="10" style="border: \
1px solid #b8b5a0"> <tr>
<td>
<pre style="margin: 0; padding: 0; white-space: pre-wrap; white-space: \
-moz-pre-wrap; white-space: -pre-wrap; white-space: -o-pre-wrap; word-wrap: \
break-word;">The query parser (https://github.com/steckdenis/nepomukqueryparser) is \
able to store positional information into terms, and this information is valid. The \
query "files related to mails sent by John, dated of last week" produces \
the following term tree:
<and position="0" length="55">
<type position="0" length="5" \
uri="FileDataObject"/> <comparison position="6" \
length="30" property="relatedTo" comparator="=" \
inverted="false"> <and position="0" \
length="35">
<type position="17" length="5" \
uri="Message"/>
<comparison position="23" length="12" \
property="messageFrom" comparator=":" \
inverted="false">
<literal position="31" length="4" \
datatype="string"> John
</literal>
</comparison>
</and>
</comparison>
<and position="46" length="9">
<comparison position="46" length="9" \
property="fileLastModified" comparator="&gt;=" \
inverted="false">
<literal position="46" length="9" \
datatype="dateTime"> 2013-06-16T22:00:00.002Z
</literal>
</comparison>
<comparison position="46" length="9" \
property="fileLastModified" comparator="&lt;=" \
inverted="false">
<literal position="46" length="9" \
datatype="dateTime"> 2013-06-23T20:00:00.002Z
</literal>
</comparison>
</and>
</and></pre>
</td>
</tr>
</table>
<h1 style="color: #575012; font-size: 10pt; margin-top: 1.5em;">Diffs</b> </h1>
<ul style="margin-left: 3em; padding-left: 0;">
<li>libnepomukcore/query/queryserializer.cpp <span style="color: \
grey">(7ac6131)</span></li>
<li>libnepomukcore/query/term.h <span style="color: grey">(114baa3)</span></li>
<li>libnepomukcore/query/term.cpp <span style="color: grey">(cfb41f4)</span></li>
<li>libnepomukcore/query/term_p.h <span style="color: grey">(6d92b48)</span></li>
</ul>
<p><a href="http://git.reviewboard.kde.org/r/111274/diff/" style="margin-left: \
3em;">View Diff</a></p>
</td>
</tr>
</table>
</div>
</body>
</html>
_______________________________________________
Nepomuk mailing list
Nepomuk@kde.org
https://mail.kde.org/mailman/listinfo/nepomuk
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic