
List:       nepomuk
Subject:    [Nepomuk] Search Problems cause of annoying ontologies
From:       Vishesh Handa <me () vhanda ! in>
Date:       2012-12-18 11:57:32
Message-ID: CAOPTMKAPauoUFBToCw+YT0uabbN21cJD-8KLFCjbe4=pT1auQQ () mail ! gmail ! com

Hey everyone

In Akonadi, they have a very common problem where they need to do a full
text search across a number of properties and find the associated contact.

The properties are -
* nco:fullname
* nco:nameGiven
* nco:nameFamily
* nco:emailAddress

The problem is obviously that a nco:PersonContact unfortunately cannot have
a nco:emailAddress. The EmailAddress must be a resource which then has the
property nco:emailAddress which contains the email.

Theoretically this makes a lot of sense, because an nco:EmailAddress is a
nco:ContactMedium. So one could write a query to iterate over all the
possible ways to contact a person, and one would get the email address.

Practically, this sucks, because the query requires an extra join + union
and gets slowed down significantly.

select distinct ?r where {
{
   ?r ?p ?o .
   FILTER( ?p in (nco:nameGiven, nco:fullname, nco:nameFamily) ) .
   ?o bif:contains "whatever" .
}
UNION
{
   ?r nco:hasEmailAddress ?e .
   ?e nco:emailAddress ?email .
   ?email bif:contains "whatever" .
}
}

This is a general problem all across Nepomuk: the ontologies (like a db
schema) are fully normalized, and hence require one extra traversal to
reach the related object and get its property. In Virtuoso this amounts to
an extra join.

Another example is searching for a song given its name, its album, and its
artist's name. The query is horrible and takes over 18 seconds on my system
(yeah, we are horrible at our main job - searching). Unfortunately, in this
case we have a proper reason for splitting the data. In the Akonadi case
there isn't much of a reason.
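
For illustration, a rough sketch of what such a song search might look
like. The class and property names (nmm:MusicPiece, nmm:musicAlbum,
nmm:performer, nie:title) and the search strings are assumptions made for
this example, not the exact query that was timed above, and the prefixes
are assumed to be predefined as in the earlier query. Each hop from the
song to its album or artist costs one more join before the text match can
be applied.

```sparql
select distinct ?song where {
   # The song's own title is a direct property: no extra hop.
   ?song a nmm:MusicPiece .
   ?song nie:title ?title .
   ?title bif:contains "songname" .

   # The album is a separate resource: one hop, then match its title.
   ?song nmm:musicAlbum ?album .
   ?album nie:title ?albumTitle .
   ?albumTitle bif:contains "albumname" .

   # The artist is a separate contact resource: another hop.
   ?song nmm:performer ?artist .
   ?artist nco:fullname ?artistName .
   ?artistName bif:contains "artistname" .
}
```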

My suggestion to fix the Akonadi problem is either relaxing the domain
restriction on nco:emailAddress or double typing the nco:PersonContact as
an nco:EmailAddress. Both options are very ugly.
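
As a sketch of what either option would buy: if nco:emailAddress could sit
directly on the contact (which the current ontology does not allow, so
this is an assumption about a changed schema), the search could collapse
into a single pattern with no union and no extra join. Prefixes are
assumed to be predefined, as in the query above.

```sparql
select distinct ?r where {
   # All four searchable properties now hang directly off the contact.
   ?r ?p ?o .
   FILTER( ?p in (nco:nameGiven, nco:fullname, nco:nameFamily,
                  nco:emailAddress) ) .
   ?o bif:contains "whatever" .
}
```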

Does anyone else have a good solution?

-- 
Vishesh Handa

_______________________________________________
Nepomuk mailing list
Nepomuk@kde.org
https://mail.kde.org/mailman/listinfo/nepomuk

