List:       solr-user
Subject:    Re: Filter factory to reduce word from plural forms to singular forms correctly?
From:       Binoy Dalal <binoydalal93 () gmail ! com>
Date:       2016-02-29 11:31:45
Message-ID: CALGoTy2vFX-eVZnYkXow2+zG65NsLezBe4bTujKWOKpg0MojcQ () mail ! gmail ! com

A stemmer does not just reduce plurals to their singular forms. It takes a
rule-based approach to reducing a word to its root.

You should try a lemmatizer if you want a more dictionary-based approach,
although lemmatizers also mostly reduce a word to its root. If you want
only a plural-to-singular conversion, you might have to write some custom
code to do so.
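
As a rough illustration of what such custom code might look like, here is a
minimal sketch in Python (chosen for brevity; inside Solr the same rules
would most likely live in a custom Java token filter). The function name,
the suffix rules, and the exception table are all illustrative assumptions,
not anything shipped with Solr, and the rules cover only common cases.

```python
# Hypothetical exception table for plurals the suffix rules get wrong.
# Extend it with whatever terms matter in your own index.
EXCEPTIONS = {
    "series": "series",
    "movies": "movie",
}


def singularize(word: str) -> str:
    """Reduce a lowercase English plural to its singular form.

    Only plural suffixes are stripped; no further stemming is applied,
    so "iphones" becomes "iphone" rather than "iphon".
    """
    # Explicit overrides take priority over the suffix rules.
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]
    # Leave very short words and typical non-plural endings untouched
    # (e.g. "glass", "bus", "analysis").
    if len(word) <= 3 or word.endswith(("ss", "us", "is")):
        return word
    # "batteries" -> "battery"
    if word.endswith("ies"):
        return word[:-3] + "y"
    # Sibilant endings form the plural with "es": "boxes" -> "box",
    # "glasses" -> "glass", "watches" -> "watch", "dishes" -> "dish".
    if word.endswith(("sses", "xes", "ches", "shes")):
        return word[:-2]
    # Plain "s" plural: "iphones" -> "iphone".
    if word.endswith("s"):
        return word[:-1]
    return word
```

With these rules, "boxes" and "glasses" come out as "box" and "glass", and
"iphones" comes out as "iphone". Words such as "caches" or "lens" would still
be handled incorrectly unless they are added to the exception table, so
treat this as a starting point only.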

On Mon, 29 Feb 2016, 15:12 Derek Poh, <dpoh@globalsources.com> wrote:

> Hi
>
> I am using EnglishMinimalStemFilterFactory to reduce words in plural
> forms to singular forms.
> The filter factory is not reducing the plural form of 'es' to the singular
> form correctly. It is reducing correctly for plural form of 's'.
> "boxes" is reduced to "boxe" instead of "box"
> "glasses" to "glasse" instead of "glass" etc.
>
> I tried with PorterStemFilterFactory, and it is able to reduce the plural
> 'es' form to singular form correctly. However, it reduced "iphones" to
> "iphon" instead.
>
> Is there another filter factory that can reduce plural to singular correctly?
>
> The field type definition of the field.
>      <fieldType class="solr.TextField" name="gs_keyword_exact"
> positionIncrementGap="100">
>          <analyzer type="index">
>              <tokenizer class="solr.KeywordTokenizerFactory" />
>              <filter class="solr.LowerCaseFilterFactory" />
>              <filter class="solr.EnglishMinimalStemFilterFactory" />
>          </analyzer>
>          <analyzer type="query">
>              <tokenizer class="solr.KeywordTokenizerFactory" />
>              <filter class="solr.LowerCaseFilterFactory" />
>              <filter class="solr.EnglishMinimalStemFilterFactory" />
>          </analyzer>
>      </fieldType>

-- 
Regards,
Binoy Dalal