
List:       lucene-dev
Subject:    [jira] Created: (LUCENE-2731) HyphenationCompoundWordTokenFilter
From:       "Uwe Schindler (JIRA)" <jira () apache ! org>
Date:       2010-10-31 10:01:52
Message-ID: 10001301.161291288519312393.JavaMail.jira () thor

HyphenationCompoundWordTokenFilter fails to load DTD in Crimson parser (JDK 1.4)
--------------------------------------------------------------------------------

                 Key: LUCENE-2731
                 URL: https://issues.apache.org/jira/browse/LUCENE-2731
             Project: Lucene - Java
          Issue Type: Bug
          Components: contrib/analyzers
            Reporter: Uwe Schindler
            Assignee: Uwe Schindler
             Fix For: 2.9.4


HyphenationCompoundWordTokenFilter loads the DTD for its XML parser from memory by
supplying an EntityResolver. In Java 1.4 (affects Lucene 2.9, but also later versions
if Apache Xerces is not used as the XML parser) this does not work, because Crimson
does not even ask the entity resolver if no base URI is known. Because the hyphenation
file is loaded from a Reader/InputStream, no base URI is known. Crimson needs at least
a non-null systemId to proceed.

This patch (Lucene 2.9 only) works around this by supplying a fake systemId to the
InputSource.
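
For illustration only, here is a minimal sketch of the idea using the standard JAXP
and SAX classes. It is not the attached patch: the class name, the placeholder
systemId value, the one-line DTD, and the sample document are all hypothetical and
chosen only to make the example self-contained.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class FakeSystemIdSketch {
    // Hypothetical placeholder value (an assumption, not taken from the patch).
    // The point is only that the systemId is non-null.
    static final String FAKE_SYSTEM_ID = "urn:example:in-memory-hyphenation";

    // Hypothetical stand-in for a DTD that is kept in memory.
    static final String IN_MEMORY_DTD = "<!ELEMENT hyphenation-info (#PCDATA)>";

    // Hypothetical sample document that references an external DTD.
    static final String SAMPLE_XML =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE hyphenation-info SYSTEM \"hyphenation.dtd\">\n"
        + "<hyphenation-info>abc</hyphenation-info>";

    // Counts how often the parser asked the resolver for an entity.
    static int resolverCalls = 0;

    static Document parse(Reader reader) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // Serve the DTD from memory instead of letting the parser fetch it.
        builder.setEntityResolver(new EntityResolver() {
            public InputSource resolveEntity(String publicId, String systemId)
                    throws SAXException, IOException {
                resolverCalls++;
                InputSource dtd = new InputSource(new StringReader(IN_MEMORY_DTD));
                dtd.setSystemId(systemId);
                return dtd;
            }
        });

        // A Reader carries no base URI, so give the source a non-null systemId.
        InputSource source = new InputSource(reader);
        source.setSystemId(FAKE_SYSTEM_ID);
        return builder.parse(source);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(new StringReader(SAMPLE_XML));
        System.out.println("root element: " + doc.getDocumentElement().getNodeName());
        System.out.println("resolver calls: " + resolverCalls);
    }
}
```

On parsers that already consult the resolver without a base URI (such as Xerces),
setting the systemId should be harmless; the sketch only shows where the call goes.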

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org