List: xerces-j-dev
Subject: Re: Way it ignore entity reference resolving?
From: Michael Glavassevich <mrglavas () ca ! ibm ! com>
Date: 2012-03-30 22:47:13
Message-ID: OF03C58045.04B31454-ON852579D1.007BF77E-852579D1.007D4410 () ca ! ibm ! com
HTML != XML. Try an HTML parser like NekoHTML [1].
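
To illustrate why an XML parser rejects this input: XML predefines only five named entities (amp, lt, gt, apos, quot), so an HTML name such as raquo is undeclared unless a DTD declares it. The following is a minimal sketch, assuming whatever JAXP parser the JDK supplies by default; the class and method names are hypothetical and do not come from this thread.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of Xerces or of the poster's code.
class UndeclaredEntityDemo {

    /** Returns true if the markup parses as well-formed XML, false if the parser rejects it. */
    static boolean parsesAsXml(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of letting the parser print them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Includes SAXParseException for well-formedness errors such as an undeclared entity.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Named HTML entity with no declaration: rejected by an XML parser.
        System.out.println(parsesAsXml("<!DOCTYPE html><html><body>&raquo;</body></html>"));
        // Numeric character reference for the same character: accepted.
        System.out.println(parsesAsXml("<html><body>&#187;</body></html>"));
    }
}
```

Swapping in a numeric reference only helps when the rest of the page happens to be well-formed XML, which real-world HTML usually is not.
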
Please note that you're not using Apache Xerces at all.
com.sun.org.apache.* is Oracle's fork of the codebase. We have no
influence over it.
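
One way to see which implementation JAXP actually hands back is to print the factory's class name. This is only a sketch; the class name WhichParser and the helper describe are made up for illustration, and the prefixes are the package names of the JDK-internal copy and of Apache Xerces respectively.

```java
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical helper; not part of any library.
class WhichParser {

    /** Classifies a JAXP factory class name by its package prefix. */
    static String describe(String factoryClassName) {
        if (factoryClassName.startsWith("com.sun.org.apache.xerces.internal.")) {
            return "JDK-internal fork";
        }
        if (factoryClassName.startsWith("org.apache.xerces.")) {
            return "Apache Xerces";
        }
        return "other implementation";
    }

    public static void main(String[] args) {
        // The class JAXP selected at runtime, given the current classpath and system properties.
        String name = DocumentBuilderFactory.newInstance().getClass().getName();
        System.out.println(name + " -> " + describe(name));
    }
}
```

If this reports the JDK-internal fork, the Apache Xerces jars are either not on the classpath or not being picked up by the JAXP lookup.
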
Thanks.
[1] http://nekohtml.sourceforge.net/
Michael Glavassevich
XML Technologies and WAS Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org
laredotornado <laredotornado@gmail.com> wrote on 30/03/2012 05:16:13 PM:
> Hi,
>
> I'm using Java 6 and the latest version of Xerces. I'm trying to parse an
> HTML document that begins like this ...
>
> <!DOCTYPE html>
>
> and later references the entity "&raquo;". Parsing dies with the exception
> ...
>
> org.xml.sax.SAXParseException: The entity "raquo" was referenced, but not declared.
>     at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:249)
>     at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
>     at com.myco.myproject.util.XmlUtilities.getStringAsDocument(XmlUtilities.java:147)
>     at com.myco.myproject.util.NetUtilities.getUrlAsDocument(NetUtilities.java:65)
>     at com.myco.myproject.parsers.impl.AbstractMetromixParser.parsePage(AbstractMetromixParser.java:107)
>     at com.myco.myproject.parsers.impl.AbstractMetromixParser.getEvents(AbstractMetromixParser.java:76)
>     at com.myco.myproject.domain.EventFeed.refresh(EventFeed.java:81)
>     at com.myco.myproject.domain.EventFeed.getEvents(EventFeed.java:72)
>     at com.myco.myproject.parsers.impl.MetromixParserTest.testParser(MetromixParserTest.java:21)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
>     at org.springframework.test.context.junit4.statements.RunBeforeTestMethodCallbacks.evaluate(RunBeforeTestMethodCallbacks.java:74)
>     at org.springframework.test.context.junit4.statements.RunAfterTestMethodCallbacks.evaluate(RunAfterTestMethodCallbacks.java:83)
>     at org.springframework.test.context.junit4.statements.SpringRepeat.evaluate(SpringRepeat.java:72)
>     at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.runChild(SpringJUnit4ClassRunner.java:231)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
>     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
>     at org.springframework.test.context.junit4.statements.RunBeforeTestClassCallbacks.evaluate(RunBeforeTestClassCallbacks.java:61)
>     at org.springframework.test.context.junit4.statements.RunAfterTestClassCallbacks.evaluate(RunAfterTestClassCallbacks.java:71)
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
>     at org.springframework.test.context.junit4.SpringJUnit4ClassRunner.run(SpringJUnit4ClassRunner.java:174)
>     at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
>     at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
>
> Is there any way to tell the parser to ignore these types of entities it
> cannot resolve? If not, what resolver do I have to plug in?
>
> Thanks, - Dave
> --
> View this message in context:
> http://old.nabble.com/Way-it-ignore-entity-reference-resolving--tp33544935p33544935.html
> Sent from the Xerces - J - Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: j-users-unsubscribe@xerces.apache.org
> For additional commands, e-mail: j-users-help@xerces.apache.org