List:       xerces-j-dev
Subject:    Re: Possible bug when parsing an XML document when JVM is using Turkish locale
From:       "Ali Seaton" <ali.seaton () gmail ! com>
Date:       2008-06-24 19:55:20
Message-ID: 209139e50806241255t60092776q7f8ac5911e467207 () mail ! gmail ! com

Hi Michael,

Thanks for the quick reply. I had downloaded the code for version 2.0.2, as
that is what was embedded within the application I was debugging. When I
searched on Google I didn't find any reference to this problem, so I never
thought to check the latest version for the fix - doh!

Thanks all the same

Alistair

2008/6/24 Michael Glavassevich <mrglavas@ca.ibm.com>:

> Hi Alistair,
>
> You must be using an ancient version of Xerces. This particular problem was
> fixed way back in 2002.
>
> Try using the latest release (2.9.1) available here:
> http://xerces.apache.org/xerces2-j/download.cgi.
>
> Thanks.
>
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrglavas@ca.ibm.com
> E-mail: mrglavas@apache.org
>
> "Ali Seaton" <ali.seaton@gmail.com> wrote on 06/24/2008 12:33:55 PM:
>
>
> > Hi,
> >
> > I'm new to the list; I just thought this issue should be raised. I
> > was receiving an error from Tomcat when it was parsing an XML document.
> >
> > The XML document was using the standard Latin encoding ISO-8859-1,
> > declared in this way:
> >
> > <?xml version="1.0" encoding="iso-8859-1"?>
> >
> > Error received:
> >
> > org.xml.sax.SAXParseException: Invalid encoding name "iso-8859-1".
> >
> > I knew iso-8859-1 was a valid encoding, so I checked the code
> > to see what was going on. Inside the method org.apache.xerces.impl.
> > XMLEntityManager.createReader an upper-case representation of the
> > encoding is created with a toUpperCase(). In a Turkish locale a
> > lowercase 'i' becomes the dotted capital 'İ' (U+0130), not the plain
> > 'I', hence the subsequent check of the encoding against the predefined
> > lists of valid names fails.
> >
> > My suggestion would be that toUpperCase should be called with
> > the overload that allows the specification of an English locale,
> > hence creating the correct 'I'.
> >
> > Thanks
> >
> > Alistair
>
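
For anyone who lands on this thread with the same symptom, the behaviour described above can be reproduced outside Xerces with a small sketch. This is not the Xerces source and does not claim to show the actual fix; the class and method names (EncodingCaseDemo, upperIn, upperFixed) are made up for illustration. It only shows that String.toUpperCase(Locale) gives a different result for "iso-8859-1" under a Turkish locale than under Locale.ENGLISH, which is the overload suggested in the original report.

```java
import java.util.Locale;

// Hypothetical demo class, not part of Xerces.
public class EncodingCaseDemo {

    // Locale-sensitive upper-casing: the result depends on the locale passed in.
    static String upperIn(String name, Locale locale) {
        return name.toUpperCase(locale);
    }

    // Upper-casing pinned to Locale.ENGLISH, so the JVM default locale is irrelevant.
    static String upperFixed(String name) {
        return name.toUpperCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        String declared = "iso-8859-1";

        // Under Turkish rules 'i' maps to U+0130, so the comparison fails.
        String broken = upperIn(declared, turkish);
        System.out.println("Turkish rules match ISO-8859-1: " + broken.equals("ISO-8859-1"));

        // With an explicit English locale the plain ASCII 'I' is produced.
        String fixed = upperFixed(declared);
        System.out.println("English rules match ISO-8859-1: " + fixed.equals("ISO-8859-1"));
    }
}
```

Passing the locale explicitly is used here instead of calling Locale.setDefault, so the sketch does not change global JVM state.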



