List: monetdb-users
Subject: Re: [MonetDB-users] tijah phrase search problem
From: Roy Walter <garliestonhouse () yahoo ! co ! uk>
Date: 2009-08-27 17:06:30
Message-ID: 4A96BD16.1090409 () yahoo ! co ! uk
Hi Henning
I didn't compile MonetDB. On my Linux box I used the Ubuntu package and
on my 64-bit Windows box I installed binaries.
Just to add something more to the mix, I reverted to a standard
[non-tijah] contains() query on my Windows box, i.e.,
collection("drugtest")//p[contains(., "drug misuse")]
I ran this query on the 532 document collection from within mclient and
the only output to the console was "write error"! If, however, I run the
same query via JDBC I get an appropriate resultset. So here too the
embedded query runs without any problems. The same query on the Linux
box runs and prints OK.
There seems to be a matrix of problems here. My main concern at the
moment is to get phrase searching working correctly from within tijah
queries, preferably under Windows as I bought a new 64-bit box for the
purpose. Although contains() queries perform well on my 64-bit Windows
box, tijah phrase queries have distinct advantages as you know.
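
To make the distinction concrete, here is a minimal Python sketch of the two behaviours seen in this thread: matching every term independently (AND semantics) versus matching the terms as an adjacent, ordered phrase. It is only an illustration of the expected semantics, not how tijah or contains() is implemented; the function names and the sample paragraphs are made up for the example.

```python
import re

def tokens(text):
    """Lower-case word tokens; punctuation is ignored."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches_all_terms(text, query):
    """True if every query term occurs somewhere in the text (AND semantics)."""
    words = set(tokens(text))
    return all(term in words for term in tokens(query))

def matches_phrase(text, query):
    """True if the query terms occur adjacent to each other and in order."""
    words, terms = tokens(text), tokens(query)
    n = len(terms)
    return n > 0 and any(words[i:i + n] == terms
                         for i in range(len(words) - n + 1))

# Hypothetical sample paragraphs, invented for this illustration.
paragraphs = [
    "Access to drug treatment has improved.",
    "The drug was withdrawn after treatment failed.",
    "Drug misuse remains a concern.",
]

and_hits = [p for p in paragraphs if matches_all_terms(p, "drug treatment")]
phrase_hits = [p for p in paragraphs if matches_phrase(p, "drug treatment")]
```

Here and_hits holds the first two paragraphs while phrase_hits holds only the first. A correct phrase search should return the second kind of result, whereas the large composite appears to return mostly the first kind.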
-- Roy
Henning Rode wrote:
> hej Roy,
>
> thanks for investigating the problem so thoroughly. i have encountered
> similar errors in the past months. so, before we continue digging for
> the cause of the trouble, i have one question. how did you configure
> MonetDB when you compiled it? in my case this error pops up most often
> when i use these settings:
> --enable-bits=64 --enable-oid32
> otherwise i hardly get problems like that.
>
> the strange thing with this bug is that the embedded tijah-query
> typically runs without any problem, and you only get the error when
> the xquery execution comes to the final printing of results. is that
> what you observe here as well?
>
> -henning
>
>
> Roy Walter wrote:
>> As a further experiment I shredded the individual documents used to
>> make the large composite into the db on the Linux box. The shredding
>> completed without error, adding 532 documents to the database. After
>> creating the tijah index I ran the following query:
>>
>> tijah:queryall("//p[about(., 'drug misuse')]")
>>
>> and it returned a number of correct results. (The same sequence,
>> i.e., shredding->indexing->querying, on a Windows box produced no
>> results.)
>>
>> Not all results were printed to the console, however, as the query
>> produced the following error:
>>
>> !ERROR: XML Generation: tmpr_1231 BAT does not have a 120 head.
>> ...
>> ERROR = !ERROR:
>> !ERROR: xquery_print_result_main: operation failed.
>>
>> It's possible that the error is memory related as my Jaunty
>> installation is running under VirtualBox.
>>
>> -- Roy
>>
>> Roy Walter wrote:
>>> It's a strange one. I have been experimenting a little.
>>>
>>> I was working with a single large [composite] document (176MB) that
>>> showed the problem. So I took a subset of files containing the
>>> search terms and created a smaller composite (8MB).
>>>
>>> The small composite works correctly, as far as I can tell. So
>>> instead of a large composite I shredded all the individual files
>>> used to create the large composite to see if it makes any
>>> difference. It doesn't. The problem persists.
>>>
>>> When working with the small composite I noticed too that the query
>>> produces more [correct] results than when working with the large
>>> composite. I noticed too that the large composite returns incorrect
>>> results.
>>>
>>> For example, when searching for the phrase 'drug treatment',
>>> querying the large composite document returned hits containing
>>> 'drug' AND 'treatment' and only one hit containing the sought
>>> phrase. Searching the small composite returned 20 correct results
>>> for the sought phrase. (To clarify: the large and small composites
>>> contain the same documents.)
>>>
>>> I don't know if it's related, but I installed MonetDB4/XQuery on an
>>> Ubuntu box and I cannot shred the large composite into the database
>>> owing to a parsing error. This error, clearly, is not occurring
>>> under Windows. I will try shredding the small composite on the Linux
>>> box.
>>>
>>> (Incidentally, if files are added to the database in bulk using UNC
>>> pathnames, e.g., <doc path="\\nas\public\export\"
>>> name="myfile.xml"/> the files are shredded into the db but
>>> tijah:create-ft-index() thereafter fails with a shred error because
>>> of the pathname. I guess it's expecting Unix style pathnames.
>>> Although I question why the tijah indexer cares about the pathname
>>> since the documents are in the db. I will add this to the bug list.)
>>>
>>> -- Roy
>>>
>>> Henning Rode wrote:
>>>
>>>> sounds clearly like a bug. could you send me a short example
>>>> document that i can index and experiment with to find the bug?
>>>>
>>>> -henning
>>>>
>>>> Roy Walter wrote:
>>>>
>>>>> The following query:
>>>>>
>>>>> tijah:queryall("//p[about(., 'drug treatment')]")
>>>>>
>>>>> returns a number of results from my sample document. Some of these
>>>>> results contain the phrase "drug misuse". The following query:
>>>>>
>>>>> tijah:queryall("//p[about(., 'drug misuse')]")
>>>>>
>>>>> returns zero results from the sample document, which is clearly
>>>>> incorrect since some results returned by the first query should be
>>>>> returned by the second.
>>>>>
>>>>> I have deleted and reloaded the sample document and I have
>>>>> recreated the tijah index and the result is consistently
>>>>> incorrect. Is this a bug?
>>>>>
>>>>> -- Roy
>>>>>
>>>>> _______________________________________________
>>>>> MonetDB-users mailing list
>>>>> MonetDB-users@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/monetdb-users
>>>>>
>>>