List: lucene-dev
Subject: [jira] Issue Comment Edited: (LUCENE-2943) ICU collator
From: "Uwe Schindler (JIRA)" <jira () apache ! org>
Date: 2011-02-28 21:08:37
Message-ID: 1292166866.2862.1298927317541.JavaMail.tomcat () hel ! zones ! apache ! org
[ https://issues.apache.org/jira/browse/LUCENE-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000511#comment-13000511 ]
Uwe Schindler edited comment on LUCENE-2943 at 2/28/11 9:07 PM:
----------------------------------------------------------------
I changed my mind a little bit:
The Collator should be cloned in the Analyzer, not in the Filter. The same
applies to the AttributeImpl: the cloning should not be done in the ctor. The
problem is not that the TokenStream or the Attribute instance might reuse the
Collator in different threads. The problem is that the factory class (the
Analyzer) does reuse the Collator in different threads when it produces multiple
TokenStreams, or when the AttributeFactory (AF) produces multiple attributes.
The difference is slight but it matters, because the following code is always
safe, and cloning inside the filter would be wrong there:
    new CollationFilter(Collator.newInstance(lang))
The reason for the whole thing: TokenStream and Attribute instances themselves
are single-threaded only, but the factory and the Analyzer are not.
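
The split of responsibilities described above can be sketched as follows. This is
only an illustration under stated assumptions, not the code from the attached
patch: SketchAnalyzer and SketchFilter are hypothetical stand-ins for the real
Analyzer and Filter classes, and java.text.Collator is used in place of the ICU
Collator so that the sketch runs without extra libraries. The synchronized
factory method is a conservative assumption (it keeps two threads from reading
the prototype at the same time) and is not something the comment asks for.

```java
import java.text.Collator;

/** Hypothetical per-stream filter: single-threaded, so it uses the collator as given. */
final class SketchFilter {
    private final Collator collator;

    SketchFilter(Collator collator) {
        // No clone here: only the caller knows whether this instance is shared.
        this.collator = collator;
    }

    Collator collator() {
        return collator;
    }

    byte[] key(String term) {
        return collator.getCollationKey(term).toByteArray();
    }
}

/** Hypothetical factory (the Analyzer role): may be called from many threads. */
final class SketchAnalyzer {
    private final Collator prototype;

    SketchAnalyzer(Collator collator) {
        // Defensive copy, so later changes to the caller's instance do not leak in.
        this.prototype = (Collator) collator.clone();
    }

    /** Every filter gets its own clone, so no Collator is shared between streams. */
    synchronized SketchFilter newFilter() {
        return new SketchFilter((Collator) prototype.clone());
    }
}
```

With this split, code that passes a fresh Collator straight to the filter pays
for no extra clone, while filters produced by the shared factory never see the
same Collator instance twice.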
was (Author: thetaphi):
I changed my mind a little bit:
The cloning of the Filter should be done in the Analyzer not in the Filter. The
same applies to the AttributeImpl, the cloning should be done in the ctor. The
problem is not that the TokenStream or the Attribute instance may reuse the
attribute in different threads, the problem is that the factory class (the
Analyzer) does reuse the Collator in different threads when it produces multiple
tokenstreams or the AF multiple attributes.
This is a slight difference, because the following code is always safe:
new CollationFilter(Collator.newInstance(lang)), cloning would be wrong.
The reason for the whole thing: TokenStream and Attribute instances themselves
are single-threaded only, but not the factory or the analyzer.
> ICU collator thread-safety issues
> ---------------------------------
>
> Key: LUCENE-2943
> URL: https://issues.apache.org/jira/browse/LUCENE-2943
> Project: Lucene - Java
> Issue Type: Bug
> Components: Analysis
> Reporter: Robert Muir
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2943.patch
>
>
> The ICU Collators (unlike the JDK ones) aren't thread-safe; see
> http://userguide.icu-project.org/collation/architecture
> This is a little non-obvious, since it's not mentioned in the javadocs and it's
> not clear whether the docs apply only to the C code, but I looked at the source
> and there is all kinds of internal state. So in my opinion, we should clone the
> ICU collators (which are passed in from the outside) when creating a new
> TokenStream/AttributeImpl to prevent problems. This shouldn't be a big deal
> since everything uses reusableTokenStream anyway.
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org