[prev in list] [next in list] [prev in thread] [next in thread]
List: lucene-user
Subject: Re: QueryParser refactoring
From: Erik Hatcher <erik () ehatchersolutions ! com>
Date: 2005-03-07 21:36:19
Message-ID: 9ee8730d005921ac5d712bef8d2af0b1 () ehatchersolutions ! com
[Download RAW message or body]
Morus - thanks for your quick checks. More below....
On Mar 7, 2005, at 3:32 PM, Morus Walter wrote:
> I had a quick look at the new QP. I didn't look at the code yet, but I
> redid my patch at the weekend for the current code, and I found it
> quite
> ugly ;-) I didn't finish it completely, so I didn't upload it to
> bugzilla yet.
>
> Your changes look great in general, though I find some issues:
>
> 1) 'stop OR stop AND stop' where stop is a stopword gives a parse
> error:
> Encountered "<EOF>" at line 1, column 0.
> Was expecting one of:
> <NOT> ...
> ...
I think you must have tried this in a transient state when I forgot to
check in some JavaCC generated files. Try again. This one now returns
an empty BooleanQuery.
> 2) Single term queries using +/- flags are parse to a query without
> flag
> +a -> a
Hmmm.... this is a debatable one. It's returning a TermQuery in this
case for "a". Is that appropriate? Or should it return a BooleanQuery
with a single TermQuery as required?
I think having it optimized to a TermQuery makes the most sense.
Though, putting it in a BooleanQuery does make this next one simpler...
> -a -> a
> While this doesn't make a difference for +a it's a bit strange for -a,
> OTOH -a isn't a usable query anyway.
Oops... yeah, you're right. If its a single clause right now it
doesn't wrap in a BooleanQuery and thus does not take into account the
modifier +/-/NOT. But as you say, this is a bogus query anyway. I
guess the right thing to do is wrap both the +a query as above and the
-a query into a BooleanQuery with the modifier set appropriately.
> 3) a OR NOT b parses to 'a -b' which is the same as 'a AND NOT b'
> IMHO `a OR NOT b' should be `a OR (NOT b)' though lucene cannot
> search
> that. Maybe it should raise an error...
Actually it parses like this:
a OR NOT b -> a -b
a AND NOT b -> +a -b
So they are slightly different, though the effect will be the same.
> a OR NOT b AND c (parsed to a -(+b +c)) should IMHO be parsed to `a
> (-b +c)'
Ah, ok.... so NOT gets much higher precedence than I'm currently giving
it. That might take me a while to achieve, but I'll give it a shot.
Erik
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic