From xml-dev Wed Jan 03 10:17:39 2007
From: Frans Englich
Date: Wed, 03 Jan 2007 10:17:39 +0000
To: xml-dev
Subject: Re: [xml-dev] What to escape when serializing XML
Message-Id: <200701031117.39199.frans.englich () telia ! com>
X-MARC-Message: https://marc.info/?l=xml-dev&m=146143170212440

On Wednesday 03 January 2007 10:00, Henri Sivonen wrote:
> On Jan 2, 2007, at 17:11, Pete Cordell wrote:
> > In terms of end-of-line encoding, the approach seems to be to
> > output what is convenient (CR, LF, or CRLF) and have the receiving
> > application sort out the situation.

So let me summarize. The following needs to be escaped when serializing
XML 1.0 content, without taking XML 1.1 compatibility into account, but
with the goal of being able to round-trip the serialized content:

* Required characters like '<' and '&', etc.
* Characters that cannot be represented in the given encoding
* Whitespace other than 0x20 in attributes, since parsers perform
  Attribute Value Normalization
* End-of-line characters, since the parser normalizes those as well
  (2.11 End-of-Line Handling)

Is that all? XSLT 2.0 and XQuery 1.0 Serialization hints that there is
more. It says "Specifically, CR, NEL and LINE SE ...". Note the use of
the word "specifically". And what is the reason that it requires
"#x7F through #x9F in text nodes and attribute nodes MUST be output as
character references"?

It seems the XML 1.0 specification takes the perspective of an XML
consumer, not a producer.


Cheers,

		Frans
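
For illustration, the list above could be sketched in Python roughly as
follows. This is a minimal sketch under stated assumptions, not a
complete serializer: escape_xml and roundtrip are hypothetical names
chosen for the example, and the default ASCII output encoding is an
arbitrary choice. In text content it escapes only CR, on the assumption
that a literal LF already survives end-of-line normalization unchanged.
It writes #x7F through #x9F (a range that includes NEL) as character
references, following the Serialization wording quoted above.
Characters that XML 1.0 does not allow at all are not handled.

```python
import xml.etree.ElementTree as ET


def escape_xml(value: str, encoding: str = "ascii",
               in_attribute: bool = False) -> str:
    """Escape value for XML 1.0 so that parsing returns it unchanged."""
    out = []
    for ch in value:
        cp = ord(ch)
        if ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        elif ch == ">":
            # Strictly needed only in "]]>"; always escaping is harmless.
            out.append("&gt;")
        elif ch == '"' and in_attribute:
            # Assumes the attribute value is delimited by double quotes.
            out.append("&quot;")
        elif ch == "\r" or (in_attribute and ch in "\t\n"):
            # A literal CR is normalized to LF; literal tab, LF and CR in
            # attribute values are normalized to a space.
            out.append(f"&#x{cp:X};")
        elif 0x7F <= cp <= 0x9F:
            # Range taken from the Serialization text quoted in the mail.
            out.append(f"&#x{cp:X};")
        else:
            try:
                ch.encode(encoding)
                out.append(ch)
            except UnicodeEncodeError:
                # Not representable in the target encoding.
                out.append(f"&#x{cp:X};")
    return "".join(out)


def roundtrip(value: str):
    """Serialize value as text and as an attribute, then parse both back."""
    doc = (f'<r a="{escape_xml(value, in_attribute=True)}">'
           f'{escape_xml(value)}</r>')
    root = ET.fromstring(doc)
    return root.text, root.get("a")
```

Calling roundtrip() on a string that mixes markup characters, CR LF, a
tab, NEL and a non-ASCII letter is a quick way to check whether a given
parser hands back exactly what was serialized.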