
List:       xerces-j-dev
Subject:    Re: How to resolve UnknownHostException with Xerces-J API?
From:       Michael Glavassevich <mrglavas () ca ! ibm ! com>
Date:       2009-10-30 14:15:43
Message-ID: OF57354005.64AF3581-ON8525765F.004DA380-8525765F.004E595A () ca ! ibm ! com

Yes. Using a resolver is generally a good idea, in particular one which
hooks up to an XML catalog [1]. The W3C gets hammered with requests for
DTDs [2] (and other entities) on its site every day, so on top of all the
other reasons why it's good to use catalogs, it will prevent your IP from
getting blacklisted for hitting their site one too many times for the same
resource.
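
As a rough illustration of the resolver idea (not the catalog-based setup described in [1]), here is a minimal sketch of a SAX EntityResolver that maps a public identifier to a local copy of the entity. The class name LocalEntityResolver, the register() helper and the local file URL used below are made up for this example; only the org.xml.sax types are real API.

```java
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical helper: serves registered entities from local copies so the
// parser never has to contact www.w3.org for them.
public class LocalEntityResolver implements EntityResolver {

    // public identifier -> system identifier (URL) of the local copy
    private final Map<String, String> localCopies = new HashMap<>();

    // Register a local copy for a given public identifier.
    public void register(String publicId, String localSystemId) {
        localCopies.put(publicId, localSystemId);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String local = localCopies.get(publicId);
        if (local == null) {
            // Returning null tells the parser to fall back to its default
            // behaviour, i.e. to open the original system identifier.
            return null;
        }
        InputSource source = new InputSource(local);
        source.setPublicId(publicId);
        return source;
    }
}
```

It would be installed with parser.setEntityResolver(resolver) before calling parse(), after registering something like ("-//W3C//ENTITIES Box and Line Drawing//EN//XML", "file:///opt/entities/isobox.ent"), where the file path is only a placeholder for wherever a downloaded copy lives. A hand-written map like this does not scale well; an XML catalog as described in [1] covers the same ground declaratively.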

Thanks.

[1] http://xerces.apache.org/xerces2-j/faq-xcatalogs.html
[2] http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org

Benson Margulies <bimargulies@gmail.com> wrote on 10/29/2009 10:00:40 PM:

> You may need to set a resolver and read these things in your own code.

> On Thu, Oct 29, 2009 at 8:24 PM, DeWayne <dewayne.c.dantzler@boeing.com> wrote:
>
> I'm trying to parse an XML file with entities like the one below. I'm
> sitting behind a firewall as well and I'm using Xerces-J 2.9.1 to implement
> the SAXParser. I need the parser to resolve the entity. When I run the app, I
> get the Java exception java.net.UnknownHostException: www.w3.org. Code and
> command line args given below.
>
> entity requiring navigation of the internet to resolve it
> ===========================
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE mpd SYSTEM "mpboe03.dtd" [
> <!ENTITY % isobox PUBLIC "-//W3C//ENTITIES Box and Line Drawing//EN//XML"
> "http://www.w3.org/2003/entities/2007/isobox.ent" >
>  %isobox;
>
> code snippet to setup the parser
> ========================
>
> try {
>        parser = XMLReaderFactory.createXMLReader("org.apache.xerces.parsers.SAXParser");
>        parser.setErrorHandler(this);
>
>        // Turn on validation
>        parser.setFeature("http://xml.org/sax/features/validation", true);
>      ...}
>
> //invoke the parser along with the schema/dtd to use and the xml file to parse
> ================
> try {
>        parser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaSource",
>                           schema.getAbsolutePath());
>        parser.parse(xmlfile2Parse.getAbsolutePath());
> }
>
> //command line options to navigate the firewall - specify the proxy and port
> ======================================
> java -Dhttp.proxyHost=\"www-use-proxys.web.foo.com\"
> -Dhttp.proxyPort=\"98760\" -classpath .:xercesImpl.jar myApp
>
> --
> View this message in context:
> http://www.nabble.com/How-to-resolve-UnknownHostException-with-Xerces-J-API--tp26123150p26123150.html
> Sent from the Xerces - J - Users mailing list archive at Nabble.com.


