[prev in list] [next in list] [prev in thread] [next in thread]
List: lucene-user
Subject: Re: QueryParser refactoring
From: Morus Walter <morus.walter () tanto ! de>
Date: 2005-03-08 14:14:37
Message-ID: 16941.45901.672724.728010 () tanto-xipolis ! de
[Download RAW message or body]
Erik Hatcher writes:
>
> On Mar 8, 2005, at 4:38 AM, Morus Walter wrote:
> >> I created a modified Query->String converter for my current day time
> >> project (as I use a String representation for the most recently used
> >> drop-down that is stored as a client-side cookie) that explicitly puts
> >> in "OR" between SHOULD BooleanClauses.
> >>
> > You cannot do that in the general case. BooleanQuery allows for queries
> > that cannot be expressed only using AND/OR/NOT.
> > E.g. `a b +c'. That's not `(a OR b) AND c' nor any other boolean
> > expression.
> > It's also not expressable in QP syntax for default operator AND.
>
> I just added this bit of code to my QueryConverterTest:
>
> BooleanQuery bq = new BooleanQuery();
> bq.add(new TermQuery(new Term("field", "a")),
> BooleanClause.Occur.SHOULD);
> bq.add(new TermQuery(new Term("field", "b")),
> BooleanClause.Occur.SHOULD);
> bq.add(new TermQuery(new Term("field", "c")),
> BooleanClause.Occur.MUST);
>
> assertEquals("a OR b +c", QueryConverter.convert(bq, "field"));
>
> System.out.println(QueryParser.parse(QueryConverter.convert(bq,
> "field"), "field", new SimpleAnalyzer()));
>
> The output is:
> a OR b +c
>
> And the test passes and the query is expressible in QP syntax for
> AND.... unless I'm missing something obvious here.
>
Ok, you're right, if you mix boolean queries and flags you can do that.
I don't like that because it's hard to explain.
And if you understand that as a OR b AND c, what it is looking like,
you're lost.
> >> Silently drop as in you removed them entirely from the resultant
> >> Query?
> >>
> > Right. `a AND (NOT b)' parses to `a'
>
> Is this what we want to happen for a general purpose next generation
> Lucene QueryParser though? I'm not sure. Perhaps this should be a
> ParseException instead?
>
> >> That'd be easy enough to add - but is that what we want to happen?
> >> Community, thoughts?
> >>
> > Throwing an exception is presumably the other alternative.
> > Could that check be done in an overwritable method?
>
> Good point. Sure.
>
That's ok then. Throw a ParseException and whoever doesn't like that,
can overwrite the method and either keep the query (knowing that it will
be ignored in search anyway) or drop it.
> >>> In an application, I handled this by dropping the query and notifying
> >>> the
> >>> user, that some part of the query could not be handled and was
> >>> ignored.
> >>
> >> How did your application notice that part of the query was dropped?
> >>
> > QueryParser told it ;-)
> > I used a modified query parser that was provided with two StringBuffers
> > and QP filled one with droped stop words (something to report to the
> > user
> > as well) and the other with droped subqueries.
>
> Perhaps QueryParser should have some additional hooks allowing you to
> either subclass and tap into things or pass in some sort of custom
> listener hook?
>
> I'm interested in how you trapped the stop words that were removed -
> did you use a custom analyzer that gave you this information? Or some
> other technique?
>
Right. That isn't obvious. Actually I had to look into my code (written
about a year ago).
I filtered stop words in the query parser itself. That is, the analyzer
didn't use any stop words but the QP had a stop word map and checked
the analyzer output.
It's more complicated than simply using the same analyzer as indexing did,
but I think it's relevant to users, to know what happend to their query.
> > In that case QP would keep the information in instance variables and
> > provide
> > getter methods.
> > Having a list of recognized tokens could be helpful as well (e.g. to
> > create
> > spelling suggestions).
>
> We're getting vastly more sophisticated than just correcting the order
> of precedence. I can continue to fiddle with this over time, but I
> won't be able to dedicate a solid effort to all of these. Feel free to
> take contribute patches that enhance it. Let me know if my
> PrecedenceQueryParser needs tweaks to be a usable base or if starting
> over with QueryParser is better.
>
Ok, I'll see how much time I can find for this.
Morus
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic