List: xml-dev
Subject: Re: [xml-dev] What to escape when serializing XML
From: Henri Sivonen <hsivonen () iki ! fi>
Date: 2007-01-03 9:00:16
Message-ID: 422862D4-4220-4AB2-8D2B-B74315E20F1A () iki ! fi
On Jan 2, 2007, at 17:11, Pete Cordell wrote:
> In terms of end-of-line encoding, the approach seems to be to
> output what is convenient (CR, LF, or CRLF) and have the receiving
> application sort out the situation.
More to the point, the LF character in element content can be
serialized as CR, LF or CRLF. Of course, LF is the most natural
serialization.
To avoid data loss, LF, CR and tab need to be escaped as numeric
character references (&#xA;, &#xD; and &#x9;) in attribute values.
Otherwise the parser normalizes each of them to a space. This matters,
for example, when round-tripping multiline values in XHTML
<input type='hidden'/>.
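
As a rough illustration of the attribute case, here is a minimal Python sketch. The helper name escape_attr is made up for this example, and it assumes the attribute value is always written inside double quotes. It parses the same value once unescaped and once escaped, using the standard library's expat-based ElementTree parser.

```python
import xml.etree.ElementTree as ET

def escape_attr(value):
    """Escape a string for a double-quoted XML attribute value (hypothetical helper)."""
    return (value.replace("&", "&amp;")    # must come first so later references survive
                 .replace("<", "&lt;")
                 .replace('"', "&quot;")
                 .replace("\t", "&#x9;")   # tab, LF and CR would otherwise be
                 .replace("\n", "&#xA;")   # normalized to spaces by the parser
                 .replace("\r", "&#xD;"))

value = "line one\nline two\tend"

# Literal whitespace characters in the attribute: the parser turns them into spaces.
naive = ET.fromstring('<input value="%s"/>' % value)

# Numeric character references: the parser hands back the original characters.
escaped = ET.fromstring('<input value="%s"/>' % escape_attr(value))

print(repr(naive.get("value")))
print(repr(escaped.get("value")))
```

The ampersand is replaced first so that the references introduced by the later replacements are not escaped a second time.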
> Conceptually, the receiving XML processor should normalize the
> end-of-line markers to 0x0A and then the application converts that
> to whichever of CR, LF, or CRLF is appropriate.
For this reason, in order to avoid data loss, a CR in element content
needs to be escaped as an NCR (numeric character reference, &#xD;) so
that it survives a serialize-and-parse round trip.
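
A small Python sketch of the element-content case follows, again only as an illustration. The helper name escape_text is hypothetical, and the parser used is the standard library's expat-based ElementTree. It shows a literal CR being folded into LF on parsing, and the same text surviving when CR is written as &#xD;.

```python
import xml.etree.ElementTree as ET

def escape_text(text):
    """Escape a string for XML element content (hypothetical helper)."""
    return (text.replace("&", "&amp;")    # must come first
                .replace("<", "&lt;")
                .replace(">", "&gt;")     # keeps "]]>" from appearing in content
                .replace("\r", "&#xD;"))  # a literal CR would be normalized to LF

text = "a\r\nb\rc"

# Literal CR and CRLF: end-of-line normalization turns both into a single LF.
naive = ET.fromstring("<e>%s</e>" % text)

# CR written as a character reference is not subject to that normalization.
escaped = ET.fromstring("<e>%s</e>" % escape_text(text))

print(repr(naive.text))
print(repr(escaped.text))
```

LF itself is left alone here, since in element content it is what the parser reports anyway.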
--
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/