List: solr-user
Subject: Re: OPENNLP problems
From: Lance Norskog <goksron () gmail ! com>
Date: 2013-05-30 22:47:32
Message-ID: 51A7D704.4020107 () gmail ! com
I will look at these problems. Thanks for trying it out!
Lance Norskog
On 05/28/2013 10:08 PM, Patrick Mi wrote:
> Hi there,
>
> I checked out branch_4x and applied the latest patch,
> LUCENE-2899-current.patch, but ran into two problems.
>
> First problem: I followed the wiki page instructions and set up a field
> with this type, aiming to keep only nouns and verbs and to facet on the field.
> ==
> <fieldType name="text_opennlp_nvf" class="solr.TextField"
>            positionIncrementGap="100">
>   <analyzer>
>     <tokenizer class="solr.OpenNLPTokenizerFactory"
>                tokenizerModel="opennlp/en-token.bin"/>
>     <filter class="solr.OpenNLPFilterFactory"
>             posTaggerModel="opennlp/en-pos-maxent.bin"/>
>     <filter class="solr.FilterPayloadsFilterFactory"
>             payloadList="NN,NNS,NNP,NNPS,VB,VBD,VBG,VBN,VBP,VBZ,FW"/>
>     <filter class="solr.StripPayloadsFilterFactory"/>
>   </analyzer>
> </fieldType>
> ==
>
> I struggled to get that going until I added the extra parameter
> keepPayloads="true", as below.
> <filter class="solr.FilterPayloadsFilterFactory" keepPayloads="true"
> payloadList="NN,NNS,NNP,NNPS,VB,VBD,VBG,VBN,VBP,VBZ,FW"/>
>
> Question: am I doing the right thing? Is this a mistake on the wiki?
>
> Second problem:
>
> I posted the document XML files to Solr one by one, and the result was what I
> expected.
>
> <add>
> <doc>
> <field name="id">1</field>
> <field name="text_opennlp_nvf">check in the hotel</field></doc>
> </add>
>
> However, if I put multiple documents into the same XML file and post it in
> one go, only the first document gets processed (only 'check' and 'hotel'
> show up in the facet result).
>
> <add>
> <doc>
> <field name="id">1</field>
> <field name="text_opennlp_nvf">check in the hotel</field>
> </doc>
> <doc>
> <field name="id">2</field>
> <field name="text_opennlp_nvf">removes the payloads</field>
> </doc>
> <doc>
> <field name="id">3</field>
> <field name="text_opennlp_nvf">retains only nouns and verbs </field>
> </doc>
> </add>
>
> The same problem occurs when I update the data using CSV upload.
>
> Is that a bug, or something I did wrong?
>
> Thanks in advance!
>
> Regards,
> Patrick
>
>
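
Until the batch problem quoted above is understood, one possible workaround
(a sketch, not a fix, and assuming that single-document posts keep working as
reported) is to split a multi-document <add> into one <add> per <doc> before
posting. The helper name split_add is hypothetical; it uses only Python's
standard xml.etree.ElementTree, and how each payload is then posted to Solr is
left out.

```python
import xml.etree.ElementTree as ET


def split_add(xml_text):
    """Return one <add> payload (as a string) per <doc> in a multi-document <add>."""
    root = ET.fromstring(xml_text)
    payloads = []
    for doc in root.findall("doc"):
        # Wrap each <doc> in its own <add> so it can be posted on its own.
        add = ET.Element("add")
        add.append(doc)
        payloads.append(ET.tostring(add, encoding="unicode"))
    return payloads


# The three documents from the report above.
batch = """<add>
<doc>
<field name="id">1</field>
<field name="text_opennlp_nvf">check in the hotel</field>
</doc>
<doc>
<field name="id">2</field>
<field name="text_opennlp_nvf">removes the payloads</field>
</doc>
<doc>
<field name="id">3</field>
<field name="text_opennlp_nvf">retains only nouns and verbs </field>
</doc>
</add>"""

payloads = split_add(batch)
for payload in payloads:
    print(payload)
```

Each printed payload can then be posted the same way the single-document file
was. This only sidesteps the symptom; it does not explain why the later
documents in a batch come out empty.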