
List:       xmlbeans-dev
Subject:    Re: Entitized characters in attribute values
From:       David Waite <mass () akuma ! org>
Date:       2009-07-13 7:33:53
Message-ID: 9D045F39-C578-4309-908D-5AB4EB0B242F () akuma ! org

The textual content of an element is governed by the CharData
production ( http://www.w3.org/TR/xml/#NT-CharData ):
[14]   CharData   ::=   [^<&]* - ([^<&]* ']]>' [^<&]*)
Only '<' and '&' are disallowed (as they indicate the start of other
XML constructs), so single quotes, double quotes and greater-than ('>')
do not require any encoding. The one exception is the sequence ']]>',
which the production excludes: a '>' that would complete ']]>' in
element content has to be written as '&gt;'.
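
As a rough illustration of the CharData rule (this is not XMLBeans'
serializer, just a minimal Python sketch; the name escape_text is made
up for the example), escaping only '&', '<' and the ']]>' sequence is
enough for the text to survive a parse round trip:

```python
import xml.etree.ElementTree as ET


def escape_text(s: str) -> str:
    """Escape a string for use as element content (hypothetical helper)."""
    # '&' must be handled first, otherwise the '&' of '&lt;' would be
    # escaped a second time.
    s = s.replace("&", "&amp;").replace("<", "&lt;")
    # CharData also excludes the literal sequence ']]>', so the '>' that
    # completes it is the only '>' that needs encoding.
    return s.replace("]]>", "]]&gt;")


raw = 'a < b && c > d, "quoted", it\'s, ]]>'
doc = "<e>" + escape_text(raw) + "</e>"

# The parser hands back exactly the original string, even though the
# quotes and the lone '>' were written unescaped.
assert ET.fromstring(doc).text == raw
```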

Similarly for AttValue ( http://www.w3.org/TR/xml/#NT-AttValue ):
[10]   AttValue   ::=   '"' ([^<&"] | Reference)* '"'
                      | "'" ([^<&'] | Reference)* "'"
Only '<', '&' and the particular quoting character used to surround  
the attribute value are disallowed; the other quote character and  
greater than ('>') do not require any encoding.
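
The same idea for attribute values, again only as a sketch in Python
under the assumption that the caller picks the quoting character (the
helper name escape_attr is hypothetical, not an XMLBeans API): '<', '&'
and the chosen quote are encoded, everything else is left alone.

```python
import xml.etree.ElementTree as ET


def escape_attr(value: str, quote: str = '"') -> str:
    """Return value as a quoted XML attribute value (hypothetical helper)."""
    if quote not in ('"', "'"):
        raise ValueError("quote must be a single or double quote")
    # '&' first, then '<'; '>' is deliberately left untouched.
    value = value.replace("&", "&amp;").replace("<", "&lt;")
    # Only the quote character actually used as delimiter needs encoding.
    entity = "&quot;" if quote == '"' else "&apos;"
    return quote + value.replace(quote, entity) + quote


raw = 'x > 1 & say "hi", it\'s <ok>'
for q in ('"', "'"):
    doc = "<e a=" + escape_attr(raw, q) + "/>"
    # Both quoting styles parse back to the same attribute value.
    assert ET.fromstring(doc).get("a") == raw
```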

Of course these characters could be encoded (just as any character
could be written as a numeric character reference); there just
isn't a need to, and doing so increases the document size and the
processing required.
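
To make that concrete, here is a small Python check (the sample
documents are invented for the example) showing that a parser sees the
same values whether '>' is written literally, as '&gt;', or as the
numeric character reference '&#62;'; only the serialized size differs:

```python
import xml.etree.ElementTree as ET

# Same information, written with and without encoding '>'.
plain = '<e a="1 > 0">x > y</e>'
encoded = '<e a="1 &gt; 0">x &#62; y</e>'

p = ET.fromstring(plain)
e = ET.fromstring(encoded)

# After parsing, the two documents are indistinguishable.
assert p.get("a") == e.get("a") == "1 > 0"
assert p.text == e.text == "x > y"

# The encoded form is simply longer on the wire.
assert len(encoded) > len(plain)
```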

-DW

On Jul 13, 2009, at 1:09 AM, sendy wrote:

> 
> Hi everybody,
> 
> It seems that special characters < and & are entitized as &lt; and &amp;
> respectively, but not >.
> Is there any reason to that? According to xml specs, or at least to my
> understanding of it :-) , shouldn't > be entitized even when in
> attribute value?
> 
> Thanks,
> Sendy





