List: xerces-j-dev
Subject: Re: Possible bug when parsing an XML document when JVM is using Turkish locale
From: "Ali Seaton" <ali.seaton () gmail ! com>
Date: 2008-06-24 19:55:20
Message-ID: 209139e50806241255t60092776q7f8ac5911e467207 () mail ! gmail ! com
Hi Michael,
Thanks for the quick reply. I had downloaded the code for version 2.0.2, as
that is what was embedded in the application I was debugging. When I
searched on Google I didn't find any reference to this problem, so I never
thought to check the latest version for the fix - doh!
Thanks all the same
Alistair
2008/6/24 Michael Glavassevich <mrglavas@ca.ibm.com>:
> Hi Alistair,
>
> You must be using an ancient version of Xerces. This particular problem was
> fixed way back in 2002.
>
> Try using the latest release (2.9.1) available here:
> http://xerces.apache.org/xerces2-j/download.cgi.
>
> Thanks.
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrglavas@ca.ibm.com
> E-mail: mrglavas@apache.org
>
> "Ali Seaton" <ali.seaton@gmail.com> wrote on 06/24/2008 12:33:55 PM:
>
>
> > Hi,
> >
> > I'm new to the list; I just thought this issue should be raised. I
> > was receiving an error from Tomcat when it was parsing an XML document.
> >
> > The XML document was using the standard Latin encoding ISO-8859-1, declared
> > in this way:
> >
> > <?xml version="1.0" encoding="iso-8859-1"?>
> >
> > Error received:
> >
> > org.xml.sax.SAXParseException: Invalid encoding name "iso-8859-1".
> >
> > I knew iso-8859-1 was not an invalid encoding, so I checked the code
> > to see what was going on. Inside the method
> > org.apache.xerces.impl.XMLEntityManager.createReader, an upper-case
> > representation of the encoding is created with toUpperCase(). In a
> > Turkish locale a lowercase 'i' becomes the Turkish dotted capital 'İ'
> > (U+0130), so the subsequent check of the encoding against the
> > predefined lists of valid names fails.
> >
> > My suggestion would be that toUpperCase should be called with
> > the overload that allows an English locale to be specified,
> > hence producing the correct 'I'.
> >
> > Thanks
> >
> > Alistair
>
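For anyone hitting the same error, the behaviour described in the quoted message can be reproduced outside Xerces. The following is only a minimal sketch, not the actual Xerces code: the class name TurkishUpperCaseDemo and the helper upper are made up for illustration, and the literal "ISO-8859-1" stands in for an entry in the parser's list of known encoding names. It uses the String.toUpperCase(Locale) overload that the suggestion refers to.

```java
import java.util.Locale;

// Hypothetical demo class; none of these names come from Xerces.
public class TurkishUpperCaseDemo {

    // Upper-cases s using the case-mapping rules of the given locale.
    static String upper(String s, Locale locale) {
        return s.toUpperCase(locale);
    }

    public static void main(String[] args) {
        String declared = "iso-8859-1";
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // Under Turkish rules 'i' maps to U+0130 (dotted capital I),
        // so the result no longer matches the ASCII name.
        String turkishUpper = upper(declared, turkish);
        System.out.println(turkishUpper.equals("ISO-8859-1")); // false
        System.out.println((int) turkishUpper.charAt(0) == 0x130); // true

        // With an explicit English locale the mapping is the expected one.
        String englishUpper = upper(declared, Locale.ENGLISH);
        System.out.println(englishUpper.equals("ISO-8859-1")); // true
    }
}
```

Passing the locale explicitly makes the result independent of the JVM's default locale, which is why the failure only appears when the default is Turkish and the no-argument toUpperCase() is used.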