
List:       xerces-j-dev
Subject:    RE: Entity References in attributes
From:       "Paul van der Maas" <paul () stunning-stuff ! com>
Date:       2008-06-05 7:38:06
Message-ID: 001201c8c6df$1adcbe20$50963a60$ () com

Hi Michael,

 

Thanks for your reply. I'm sorry about replying directly to you, instead of
to the mailing list. I think I did it right this time. This is the first
time I've signed up for a list, so I didn't really know what I was doing to
be honest.

 

I ended up writing a quick and dirty pre-parser that just goes through the
XSLT and finds the entities that I want to replace before the file is parsed
using Xerces.

It's kind of an ugly solution in my eyes because the XML is not well-formed
if you put it through a parser without putting it through the pre-parser,
but then again, the world is not a perfect place.
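
For readers following along, a pre-parser of this kind could be sketched
roughly as below. This is an illustration, not the code Paul wrote: the class
name PropertyEntityPreParser, the regular expression for the
&property:<bundle>:<key>; syntax (modeled on the example quoted later in this
thread), and the lookup callback are all assumptions. Looked-up values are
XML-escaped so that the substituted text stays well-formed in both element
content and attribute values.

```java
import java.util.function.BiFunction;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical pre-parser: expands &property:<bundle>:<key>; before XML parsing. */
final class PropertyEntityPreParser {

    // Assumed reference syntax, modeled on the thread's example &property:main:home;
    private static final Pattern REF =
            Pattern.compile("&property:([A-Za-z0-9_.-]+):([A-Za-z0-9_.-]+);");

    /**
     * Replaces every matching reference in source with the escaped value
     * returned by lookup(bundle, key). References whose lookup returns null
     * are left untouched so the XML parser can still report them.
     */
    static String expand(String source, BiFunction<String, String, String> lookup) {
        Matcher m = REF.matcher(source);
        // StringBuffer keeps this compatible with the Java 8 Matcher API.
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = lookup.apply(m.group(1), m.group(2));
            String replacement = (value == null) ? m.group() : escape(value);
            // quoteReplacement stops '$' and '\' in values being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Escapes characters that are unsafe in element content or attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }
}
```

In practice the lookup could delegate to java.util.ResourceBundle, for example
(bundle, key) -> ResourceBundle.getBundle(bundle).getString(key). Note that
getString throws MissingResourceException for an unknown key rather than
returning null, so a real caller would need to decide how to handle that.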

 

Thanks for providing me with that explanation [2]; I was unable to dig
anything up myself. I can understand the design decision now.

 

Thanks,

 

Paul

 

From: Michael Glavassevich [mailto:mrglavas@ca.ibm.com] 
Sent: Wednesday, June 04, 2008 10:58 PM
To: j-users@xerces.apache.org
Cc: paul@stunning-stuff.com
Subject: RE: Entity References in attributes

 

Hi Paul,

(Replying before this gets lost in my overflowing inbox; should really post
follow-ups to the mailing list for the benefit of others and for you since
other folks might have better answers)

To answer your questions,

> - Do you know of any DOM parsers that would work for me?

I'm not aware of any. I'm not even sure if there's any other widely
available Java implementation which is actively maintained.

> - Can you think of a better way of achieving what I'm 
> trying to achieve?

Can't think of anything which wouldn't involve digging deep into Xerces
internals (assuming it's even possible given the current design).

> - Why are external entities not allowed in attribute values
> and is there a way around this?

There's no way around it. It's disallowed [1] by the XML 1.0 specification.
All XML parsers are required to reject external entity references in
attribute values. Tim Bray (one of the original editors of the XML 1.0 spec)
gives an explanation here [2].
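
The restriction can be observed with a small test along the following lines.
This is a sketch that assumes the JAXP parser bundled with the JDK; the class
and method names are made up for illustration, and the entity declaration
reuses the property:main#home system identifier from Paul's earlier message.
No EntityResolver is installed because the reference in the attribute value is
expected to be rejected as a fatal error.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical demo: an external entity referenced from an attribute value. */
final class ExternalEntityInAttributeDemo {

    // "home" is declared as an external (SYSTEM) entity and used in an attribute.
    static final String EXTERNAL =
            "<!DOCTYPE a [<!ENTITY home SYSTEM \"property:main#home\">]>"
            + "<a attr=\"&home;\"/>";

    // Same document, but "home" is an internal entity, which is permitted.
    static final String INTERNAL =
            "<!DOCTYPE a [<!ENTITY home \"Home\">]>"
            + "<a attr=\"&home;\"/>";

    /** Returns the parser's fatal-error message, or null if the document parsed. */
    static String parseAndReport(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return e.getMessage();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("external: " + parseAndReport(EXTERNAL));
        System.out.println("internal: " + parseAndReport(INTERNAL));
    }
}
```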

Thanks.

[1] http://www.w3.org/TR/2006/REC-xml-20060816/#NoExternalRefs
[2] http://www.xml.com/axml/notes/NoExtInAttr.html

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org

"Paul van der Maas" <paul@stunning-stuff.com> wrote on 05/28/2008 10:55:11 PM:

> Thanks Michael,
>  
> I really appreciate your answer. It would have taken me ages to 
> figure that out. I hate to ask you this, but do you know of any DOM
> parsers that do preserve entity references in attribute values? It's
> important for my application.
> I have entities in the XML that are undefined and I pre-parse the 
> document to replace those entities with their values which I look up
> using the entity names. Because of this the parser will also have to
> support the feature that allows you to continue after a fatal error.
> If I can't find a parser that allows me to handle all entities, I 
> guess I'll have to write a simple text-parser that simply looks for 
> entities. The only reason I want to avoid doing this is because I'm 
> afraid to introduce a bug-prone component into my software.
>  
> Perhaps you could shed some light on this. The entities I am trying to 
> replace with this pre-parser are actually localized strings that 
> come from property resource bundles.
> It's my way of localizing XSL templates. An example of a property 
> would be &property:main:home; The name is a URI where the property 
> part is the scheme indicating we want to get the value from a 
> property resource bundle. The main part tells us the property home 
> (which is the last part of the uri) should be retrieved from the 
> main.properties (or any of its localized versions) file.
>  
> I actually had this working perfectly and completely hack free 
> before by using an EntityResolver that would resolve these entities.
> For instance I would have the entity &home; defined in the internal 
> subset as an external entity like this: <!ENTITY home SYSTEM 
> "property:main#home"> and my EntityResolver would return the right 
> property value from the right property resource bundle. This worked 
> beautifully until I tried using one of these entities in an attribute... :-(
> Turns out external entities are not allowed in attribute values. Why
> that is, I haven't been able to figure out. It seems to me like a 
> senseless restriction. Do you know why??
>  
> Just to summarize:
> -          Do you know of any DOM parsers that would work for me?
> -          Can you think of a better way of achieving what I'm 
> trying to achieve?
> -          Why are external entities not allowed in attribute values
> and is there a way around this?
>  
> Thanks again,
>  
> Paul van der Maas
>  
> From: Michael Glavassevich [mailto:mrglavas@ca.ibm.com] 
> Sent: Wednesday, May 28, 2008 6:27 AM
> To: j-users@xerces.apache.org
> Cc: paul@stunning-stuff.com
> Subject: Re: Entity References in attributes
>  
> Hi Paul,
> 
> Xerces has no support for preserving entity references in attribute 
> values. See previous discussion on this topic here [1] and here [2].
> 
> Thanks.
> 
> [1] http://marc.info/?t=117027061700003&r=1&w=2
> [2] http://issues.apache.org/jira/browse/XERCESJ-1225
> 
> Michael Glavassevich
> XML Parser Development
> IBM Toronto Lab
> E-mail: mrglavas@ca.ibm.com
> E-mail: mrglavas@apache.org
> 
> "Paul van der Maas" <paul@stunning-stuff.com> wrote on 05/28/2008 03:59:02 AM:
> 
> > Hi,
> >  
> > I'm parsing an XSL template into a DOM tree using the DocumentBuilder
> > and after that I visit every Node in the tree using a NodeIterator.
> > I also visit every attribute Node and their values. The reason I go 
> > over the tree is because I need to find all entity references and do
> > something every time I encounter one.
> > I can find all the entity references that are in the document 
> > itself, but it seems as though the DocumentBuilder doesn't generate 
> > any nodes for entity references contained in attributes.
> > For instance, if I have the following XML:
> >  
> >  
> > <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
> > <!DOCTYPE a
> > [
> >   <!ENTITY foo "foo">
> >   <!ENTITY bar "bar">
> > ]>
> > <a attr="before&foo;after">
> >   &bar;
> > </a>
> >  
> > Parsing this document and then using a NodeIterator to visit the 
> > nodes gives me these nodes:
> >  
> > Node a (element)
> > Node attr (attribute)
> > Node beforeafter (text)
> > Node bar (entity reference)
> >  
> > As you see, the entity reference &foo; in between the text before 
> > and after in the attribute attr is nowhere to be found. I can't 
> > figure out where it went.
> > I couldn't find any information about this on Google, that's why I'm
> > turning here before I report this as a bug.
> > Anyone know what is going on? Let me know if you need more 
> > information. Any help is very much appreciated!
> >  
> > Thanks,
> >  
> > Paul van der Maas
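
The traversal described in the original post can be approximated with the
sketch below. It is not Paul's code: the class name and the output format are
invented, it assumes the JDK's bundled JAXP implementation (whose Document
object is assumed to support the DOM Traversal interfaces), and it calls
setExpandEntityReferences(false) so that entity references in element content
are kept as nodes. A NodeIterator does not visit attributes, so they are
walked by hand. Whether anything resembling an entity reference appears under
the attribute depends on the parser; the thread reports that Xerces does not
produce one.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical reproduction of the node walk from the original post. */
final class EntityReferenceWalk {

    // The sample document from the original post.
    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
            + "<!DOCTYPE a\n[\n"
            + "  <!ENTITY foo \"foo\">\n"
            + "  <!ENTITY bar \"bar\">\n"
            + "]>\n"
            + "<a attr=\"before&foo;after\">\n"
            + "  &bar;\n"
            + "</a>\n";

    /** Parses xml and lists every node visited, including attributes and their children. */
    static List<String> walk(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setExpandEntityReferences(false); // keep entity reference nodes
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Assumption: the DOM implementation supports the Traversal module.
            NodeIterator it = ((DocumentTraversal) doc).createNodeIterator(
                    doc.getDocumentElement(), NodeFilter.SHOW_ALL, null, false);
            List<String> seen = new ArrayList<>();
            for (Node n = it.nextNode(); n != null; n = it.nextNode()) {
                record(seen, n);
                NamedNodeMap attrs = n.getAttributes(); // null unless n is an element
                if (attrs == null) continue;
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node attr = attrs.item(i);
                    record(seen, attr);
                    for (Node c = attr.getFirstChild(); c != null; c = c.getNextSibling()) {
                        record(seen, c);
                    }
                }
            }
            return seen;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Adds a "name (kind)" entry for n, skipping whitespace-only text nodes. */
    private static void record(List<String> seen, Node n) {
        switch (n.getNodeType()) {
            case Node.ELEMENT_NODE: seen.add(n.getNodeName() + " (element)"); break;
            case Node.ATTRIBUTE_NODE: seen.add(n.getNodeName() + " (attribute)"); break;
            case Node.ENTITY_REFERENCE_NODE:
                seen.add(n.getNodeName() + " (entity reference)"); break;
            case Node.TEXT_NODE: {
                String text = n.getNodeValue().trim();
                if (!text.isEmpty()) seen.add(text + " (text)");
                break;
            }
            default: seen.add(n.getNodeName() + " (other)");
        }
    }

    public static void main(String[] args) {
        walk(XML).forEach(System.out::println);
    }
}
```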




