
List:       lucene-dev
Subject:    [jira] Commented: (LUCENE-1166) A tokenfilter to decompose compound
From:       "Thomas Peuss (JIRA)" <jira () apache ! org>
Date:       2008-04-30 6:50:55
Message-ID: 1055271365.1209538255693.JavaMail.jira () brutus


    [ https://issues.apache.org/jira/browse/LUCENE-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12593238#action_12593238 ]

Thomas Peuss commented on LUCENE-1166:
--------------------------------------

bq. So, why would I ever want to use a "Dumb" compound filter? Any suggestions for a \
better name? No need for a patch, I can just make the change.

A better name would be _DictionaryCompoundWordTokenFilter_. I called it "Dumb" \
because it uses a brute-force approach. But _DictionaryCompoundWordTokenFilter_ \
characterizes it better.
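
To illustrate what "brute-force" means here, the following is a minimal sketch and not the code from the attached patch. It assumes a lowercase dictionary held in a `Set<String>` and simply tests every substring of the token against it. The class name `BruteForceDecompounder`, the method `decompose`, and the `minSubwordSize` / `maxSubwordSize` parameters are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class BruteForceDecompounder {

    /**
     * Hypothetical helper (not the patch's API): returns every dictionary
     * word that occurs as a substring of the token, ordered by start offset.
     * The dictionary is assumed to contain lowercase entries only.
     */
    public static List<String> decompose(String token, Set<String> dictionary,
                                         int minSubwordSize, int maxSubwordSize) {
        List<String> parts = new ArrayList<>();
        String lower = token.toLowerCase(Locale.ROOT);
        // Try every start offset and every allowed length: O(n * maxSubwordSize) lookups.
        for (int start = 0; start <= lower.length() - minSubwordSize; start++) {
            for (int len = minSubwordSize;
                 len <= maxSubwordSize && start + len <= lower.length();
                 len++) {
                if (dictionary.contains(lower.substring(start, start + len))) {
                    // Keep the original casing of the matched part.
                    parts.add(token.substring(start, start + len));
                }
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        Set<String> dictionary = Set.of("donau", "dampf", "schiff");
        System.out.println(decompose("Donaudampfschiff", dictionary, 2, 15));
    }
}
```

With the three-word dictionary above, `Donaudampfschiff` yields `[Donau, dampf, schiff]`. The approach needs no hyphenation grammar, but it checks every candidate substring, which is why a dictionary-based name describes it better than "Dumb".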

> A tokenfilter to decompose compound words
> -----------------------------------------
> 
> Key: LUCENE-1166
> URL: https://issues.apache.org/jira/browse/LUCENE-1166
> Project: Lucene - Java
> Issue Type: New Feature
> Components: Analysis
> Reporter: Thomas Peuss
> Assignee: Grant Ingersoll
> Priority: Minor
> Attachments: CompoundTokenFilter.patch, CompoundTokenFilter.patch, \
> CompoundTokenFilter.patch, CompoundTokenFilter.patch, CompoundTokenFilter.patch, \
> CompoundTokenFilter.patch, CompoundTokenFilter.patch, CompoundTokenFilter.patch, \
> de.xml, hyphenation.dtd 
> 
> A tokenfilter to decompose compound words found in many Germanic languages (like \
> German, Swedish, ...) into single tokens. An example: Donaudampfschiff would be \
> decomposed to Donau, dampf, schiff so that you can find the word even when you only \
> enter "Schiff". I use the hyphenation code from the Apache XML project FOP \
> (http://xmlgraphics.apache.org/fop/) to do the first step of decomposition. \
> Currently I use the FOP jars directly. I only use a handful of classes from the FOP \
> project. My question now:
> Would it be OK to copy these classes over to the Lucene project (renaming the \
> packages of course) or should I stick with the dependency on the FOP jars? The FOP \
> code uses the ASF V2 license as well. What do you think?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



