List: xml-cocoon-users
Subject: Re: character encoding of a HttpServletRequest
From: Dominic Mitchell <dom () happygiraffe ! net>
Date: 2010-01-11 11:45:07
Message-ID: 45c308811001110345j6d4430f4l899a779379a946d2 () mail ! gmail ! com
On Mon, Jan 11, 2010 at 10:34 AM, Jos Snellings <Jos.Snellings@pandora.be> wrote:
> That is right!
> It is just a confusing situation :-(
> The filter works fine. The init() method of a generator does not give a
> chance to call setCharacterEncoding, as the parsing already happened.
> The good thing is that the code is already in spring, so, no new
> external dependencies. Maybe later on I add a
> "tryToGuessEncodingFilter".
>
>
Trying to guess encodings isn't a good idea, in general. About the only one
that can be reliably detected is UTF-8. In past projects, I've done
something like this:
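String result;
try {
    // new String(bytes, "UTF-8") silently replaces bad bytes; a strict decoder reports them.
    result = Charset.forName("UTF-8").newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT)
            .decode(ByteBuffer.wrap(someBytes)).toString();
} catch (CharacterCodingException e) {
    // Not valid UTF-8: fall back to Windows-1252.
    result = new String(someBytes, Charset.forName("windows-1252"));
}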
In my experience, Windows-1252 was a better guess than ISO-8859-1, as users
tend to paste in text from Word documents with curly quotes.
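
For anyone who wants to try this, here is a self-contained sketch of the fallback. Note that new String(bytes, "UTF-8") does not throw on malformed input (it substitutes replacement characters), so the sketch uses a strict CharsetDecoder instead. The class name EncodingFallback and the method name decode are made up for illustration, and it assumes the JVM provides the windows-1252 charset, which is common but not guaranteed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncodingFallback {
    // Assumption: this JVM ships windows-1252 (it is not one of the guaranteed charsets).
    private static final Charset FALLBACK = Charset.forName("windows-1252");

    /** Decodes as strict UTF-8; if the bytes are not valid UTF-8, decodes as Windows-1252. */
    public static String decode(byte[] bytes) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            // Invalid UTF-8 sequence somewhere in the input: use the fallback charset.
            return new String(bytes, FALLBACK);
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        // 0x93 and 0x94 are curly quotes in Windows-1252 and invalid as lone bytes in UTF-8.
        byte[] cp1252 = {(byte) 0x93, 'h', 'i', (byte) 0x94};
        System.out.println(decode(utf8));
        System.out.println(decode(cp1252));
    }
}
```

In a servlet filter the same check would have to run on the raw request bytes before anything calls getParameter(), since the container decodes parameters on first access.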
-Dom