List:       solr-user
Subject:    Re: Couple issues with edismax in 3.5
From:       Way Cool <way1.waycool () gmail ! com>
Date:       2012-02-29 17:28:05
Message-ID: CANbytCjswye152dbR+G4Z64XwdzN4ZfAX_g01b5arRBBxyZZ1g () mail ! gmail ! com


Thanks Ahmet for your reply.

I don't think mm will help here, because it already defaults to 100%, as
the following code shows:

    if (parsedUserQuery != null && doMinMatched) {
      String minShouldMatch = solrParams.get(DMP.MM, "100%");
      if (parsedUserQuery instanceof BooleanQuery) {
        U.setMinShouldMatch((BooleanQuery) parsedUserQuery, minShouldMatch);
      }
    }
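
To make the arithmetic concrete, here is a rough sketch of how a percentage
mm value could map to a required number of optional clauses. It is a
simplification, not Solr's actual code: the class MmSketch and the method
minShouldMatch are hypothetical names, and only non-negative values are
handled. If I read the parsed queries quoted below correctly, "4 X 6" gives
three top-level clauses (so 100% requires all three), while "4X6" gives a
single top-level clause, and the per-field terms inside it are not counted.

```java
// Simplified sketch of percentage-based min-should-match arithmetic.
// Hypothetical names; this is not Solr's implementation.
final class MmSketch {

    /** Converts a spec such as "100%" or "2" into a required clause count. */
    static int minShouldMatch(int optionalClauses, String spec) {
        String s = spec.trim();
        int required;
        if (s.endsWith("%")) {
            int percent = Integer.parseInt(s.substring(0, s.length() - 1));
            // Integer division rounds the result down.
            required = (optionalClauses * percent) / 100;
        } else {
            required = Integer.parseInt(s);
        }
        // Clamp to the range [0, optionalClauses].
        return Math.max(0, Math.min(required, optionalClauses));
    }

    public static void main(String[] args) {
        // Three top-level clauses, as in the "4 X 6" query.
        System.out.println(minShouldMatch(3, "100%"));
        // One top-level clause, as in the "4X6" query.
        System.out.println(minShouldMatch(1, "100%"));
    }
}
```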

Regarding multi-word synonyms, what is the best way to handle them now?
Should I turn the query into a phrase with double quotes, or add a hyphen
between the words? I don't like index-time expansion because it adds a lot
of noise.

It's good to know that Analysis.jsp does not perform actual query parsing. I
was hoping edismax could do something similar to the analysis tool, because
it shows everything I need for multi-word synonyms.
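
A toy sketch of the difference, assuming a single synonym group taken from
my example. The class SynonymSplitSketch and its methods are hypothetical
and use no Lucene or Solr APIs; they only mimic "analyze the whole string"
(roughly what the analysis page does) versus "split on whitespace first,
then analyze each piece" (what the query parser appears to do).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Toy model only; hypothetical names, no Lucene or Solr classes involved.
final class SynonymSplitSketch {

    // Synonym group from the example: AAA BBB, AAABBB, AAA-BBB, CCC DDD
    static final Set<String> GROUP = new HashSet<>(
            Arrays.asList("aaa bbb", "aaabbb", "aaa-bbb", "ccc ddd"));

    /** Expands the text to the whole group if it is a known synonym. */
    static Set<String> expand(String text) {
        String key = text.trim().toLowerCase(Locale.ROOT);
        if (GROUP.contains(key)) {
            return new TreeSet<>(GROUP);
        }
        return new TreeSet<>(Collections.singleton(key));
    }

    /** Splits on whitespace first, then expands each piece on its own. */
    static Set<String> splitThenExpand(String query) {
        Set<String> out = new TreeSet<>();
        for (String clause : query.trim().split("\\s+")) {
            out.addAll(expand(clause));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(expand("CCC DDD"));          // whole string analyzed
        System.out.println(splitThenExpand("CCC DDD")); // split before analysis
        System.out.println(splitThenExpand("AAA-BBB")); // no whitespace to split
    }
}
```

In this model "AAA-BBB" still expands because it contains no whitespace,
while "CCC DDD" is broken into two pieces that never match the synonym entry.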

Thanks.

On Wed, Feb 29, 2012 at 1:23 AM, Ahmet Arslan <iorixxx@yahoo.com> wrote:

> > 1. Search for 4X6 generated the following parsed query:
> > +DisjunctionMaxQuery((((id:4 id:x id:6)^1.2) | ((name:4
> > name:x
> > name:6)^1.025) )
> > while the search for "4 X 6" (with space in between)
> > generated the query
> > below: (I like this one)
> > +((DisjunctionMaxQuery((id:4^1.2 | name:4^1.025)
> > +((DisjunctionMaxQuery((id:x^1.2 | name:x^1.025)
> > +((DisjunctionMaxQuery((id:6^1.2 | name:6^1.025)
> >
> > Is that really intentional? The first query is pretty weird
> > because it will
> > return all of the docs with one of 4, x, 6.
>
> Minimum Should Match (mm) parameter is used to control how many search
> terms should match. For example, you can set it to &mm=100%.
>
> Also, you can tweak relevancy by setting the phrase fields (pf) parameter.
>
> > Any easy way we can force "4X6" search to be the same as "4
> > X 6"?
> >
> > 2. Issue with multi words synonym because edismax separates
> > keywords to
> > multiple words via the line below:
> > clauses = splitIntoClauses(userQuery, false);
> > and seems like edismax doesn't quite respect fieldType at
> > query time, for
> > example, handling stopWords differently than what's
> > specified in schema.
> >
> > For example: I have the following synonym:
> > AAA BBB, AAABBB, AAA-BBB, CCC DDD
> >
> > When I search for "AAA-BBB", it works, however search for
> > "CCC DDD" was not
> > returning results containing AAABBB. What is interesting is
> > that
> > admin/analysis.jsp is returning great results.
>
> The query string is tokenized (on whitespace) before it reaches the
> analyzer. https://issues.apache.org/jira/browse/LUCENE-2605
> That's why using multi-word synonyms at query time is not advised.
>
> Analysis.jsp does not perform actual query parsing.
>
