List:       monetdb-users
Subject:    Re: [MonetDB-users] tijah phrase search problem
From:       Roy Walter <garliestonhouse () yahoo ! co ! uk>
Date:       2009-08-27 17:06:30
Message-ID: 4A96BD16.1090409 () yahoo ! co ! uk

Hi Henning

I didn't compile MonetDB. On my Linux box I used the Ubuntu package and 
on my 64-bit Windows box I installed binaries.

Just to add something more to the mix, I reverted to a standard [non 
tijah] contains() query on my Windows box, i.e.,

    collection("drugtest")//p[contains(., "drug misuse")]

I ran this query on the 532 document collection from within mclient and 
the only output to the console was "write error"! If, however, I run the 
same query via JDBC I get an appropriate resultset. So here too the 
embedded query runs without any problems. The same query on the Linux 
box runs and prints OK.

There seems to be a matrix of problems here. My main concern at the 
moment is to get phrase searching working correctly from within tijah 
queries, preferably under Windows as I bought a new 64-bit box for the 
purpose. Although contains() queries perform well on my 64-bit Windows 
box, tijah phrase queries have distinct advantages as you know.
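
For clarity, the difference between a true phrase hit and a hit that merely contains both terms can be sketched as follows. This is an illustrative Python sketch only, not tijah or MonetDB code and not a description of how about() is evaluated; the function names and the two sample paragraphs are made up for the example.

```python
import re


def tokens(text):
    """Return lowercase word tokens, with punctuation dropped."""
    return re.findall(r"[a-z0-9]+", text.lower())


def has_all_terms(text, query):
    """True if every query term occurs somewhere in the text (AND match)."""
    words = set(tokens(text))
    return all(term in words for term in tokens(query))


def has_phrase(text, query):
    """True if the query terms occur adjacently and in order (phrase match)."""
    words, phrase = tokens(text), tokens(query)
    n = len(phrase)
    return n > 0 and any(
        words[i:i + n] == phrase for i in range(len(words) - n + 1)
    )


# Hypothetical sample paragraphs, not taken from the real collection.
paragraphs = [
    "Access to drug treatment has improved.",
    "The drug was withdrawn after treatment ended.",
]

for p in paragraphs:
    # Prints the AND-match result, then the phrase-match result.
    print(has_all_terms(p, "drug treatment"), has_phrase(p, "drug treatment"))
```

Both paragraphs pass the AND test, but only the first contains the phrase itself, which is the kind of hit a phrase search is expected to return.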

-- Roy

Henning Rode wrote:
> hej Roy,
>
> thanks for investigating the problem so thoroughly. i have encountered 
> similar errors in the past months. so, before we continue digging for 
> the cause of the trouble, i have one question. how did you configure 
> MonetDB when you compiled it? in my case this error pops up most often 
> when i use these settings:
> --enable-bits=64 --enable-oid32
> otherwise i hardly get problems like that.
>
> the strange thing with this bug is that the embedded tijah-query 
> typically runs without any problem, but you only get the error when 
> the xquery execution comes to the final printing of results. is that 
> what you observe here as well?
>
> -henning
>
>
> Roy Walter wrote:
>> As a further experiment I shredded the individual documents used to 
>> make the large composite into the db on the Linux box. The shredding 
>> completed without error, adding 532 documents to the database. After 
>> creating the tijah index I ran the following query:
>>
>>     tijah:queryall("//p[about(., 'drug misuse')]")
>>
>> and it returned a number of correct results. (The same sequence, 
>> i.e., shredding->indexing->querying, on a Windows box produced no 
>> results.)
>>
>> Not all results were printed to the console, however, as the query 
>> produced the following error:
>>  
>>     !ERROR: XML Generation: tmpr_1231 BAT does not have a 120 head.
>>     ...
>>     ERROR = !ERROR:
>>     !ERROR: xquery_print_result_main: operation failed.
>>
>> It's possible that the error is memory related as my Jaunty 
>> installation is running under VirtualBox.
>>
>> -- Roy
>>
>> Roy Walter wrote:
>>> It's a strange one. I have been experimenting a little.
>>>
>>> I was working with a single large [composite] document (176MB) that 
>>> showed the problem. So I took a subset of files containing the 
>>> search terms and created a smaller composite (8MB).
>>>
>>> The small composite works correctly, as far as I can tell. So 
>>> instead of a large composite I shredded all the individual files 
>>> used to create the large composite to see if it makes any 
>>> difference. It doesn't. The problem persists.
>>>
>>> When working with the small composite I noticed too that the query 
>>> produces more [correct] results than when working with the large 
>>> composite. I noticed too that the large composite returns incorrect 
>>> results.
>>>
>>> For example, when searching for the phrase 'drug treatment', 
>>> querying the large composite document returned hits containing 
>>> 'drug' AND 'treatment' and only one hit containing the sought 
>>> phrase. Searching the small composite returned 20 correct results 
>>> for the sought phrase. (To clarify: the large and small composites 
>>> contain the same documents.)
>>>
>>> I don't know if it's related, but I installed MonetDB4/XQuery on an 
>>> Ubuntu box and I cannot shred the large composite into the database 
>>> owing to a parsing error. This error, clearly, is not occurring 
>>> under Windows. I will try shredding the small composite on the Linux 
>>> box.
>>>
>>> (Incidentally, if files are added to the database in bulk using UNC 
>>> pathnames, e.g., <doc path="\\nas\public\export\" 
>>> name="myfile.xml"/> the files are shredded into the db but 
>>> tijah:create-ft-index() thereafter fails with a shred error because 
>>> of the pathname. I guess it's expecting Unix style pathnames. 
>>> Although I question why the tijah indexer cares about the pathname 
>>> since the documents are in the db. I will add this to the bug list.)
>>>
>>> -- Roy
>>>
>>> Henning Rode wrote:
>>>  
>>>> sounds clearly like a bug. could you send me a short example 
>>>> document that i can index and experiment with to find the bug?
>>>>
>>>> -henning
>>>>
>>>> Roy Walter wrote:
>>>>    
>>>>> The following query:
>>>>>
>>>>>     tijah:queryall("//p[about(., 'drug treatment')]")
>>>>>
>>>>> returns a number of results from my sample document. Some of these 
>>>>> results contain the phrase "drug misuse". The following query:
>>>>>
>>>>>     tijah:queryall("//p[about(., 'drug misuse')]")
>>>>>
>>>>> returns zero results from the sample document, which is clearly 
>>>>> incorrect since some results returned by the first query should be 
>>>>> returned by the second.
>>>>>
>>>>> I have deleted and reloaded the sample document and I have 
>>>>> recreated the tijah index and the result is consistently 
>>>>> incorrect. Is this a bug?
>>>>>
>>>>> -- Roy
>>>>>
>>>>> ------------------------------------------------------------------------------ 
>>>>>
>>>>> Let Crystal Reports handle the reporting - Free Crystal Reports 
>>>>> 2008 30-Day trial. Simplify your report design, integration and 
>>>>> deployment - and focus on what you do best, core application 
>>>>> coding. Discover what's new with Crystal Reports now.  
>>>>> http://p.sf.net/sfu/bobj-july
>>>>> _______________________________________________
>>>>> MonetDB-users mailing list
>>>>> MonetDB-users@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/monetdb-users
>>>>>         
>>>
