
List:       xalan-j-users
Subject:    RE: insert carriage return
From:       "Koes, Derrick" <Derrick.Koes () smith-nephew ! com>
Date:       2004-04-21 19:11:16
Message-ID: 90BB627CA578DC479862B41DBD542991098D3DB8 () andsrv31 ! smith-nephew ! com


I've convinced myself that '\n' is the right answer.  Thanks to everyone who
offered assistance.

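Interestingly, '\u000A' is a Java error ("unclosed character literal"),
because the compiler translates Unicode escapes into raw characters before
it tokenizes the source, so the literal ends up containing a real line
break.  Hence my mistake of using '\u0010', which is hex 10 (decimal 16),
not a line feed.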
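
For anyone who wants to check the character values for themselves, here is a
minimal sketch.  The class and method names (LineFeedDemo, codeOf) are made
up for illustration.  It compares '\n', the same character written as a cast
from a hex int, and the '\u0010' that caused my mistake.

```java
// Sketch: comparing the line-feed spellings discussed above.
// Caution: the Unicode escape for hex 000A must not appear anywhere in Java
// source (not even in a comment), because escapes are translated before
// the compiler tokenizes the file.
class LineFeedDemo {

    // Returns the numeric code of a char.
    static int codeOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        char lineFeed = '\n';          // decimal 10, hex 0A
        char viaCast = (char) 0x0A;    // same character, written as a hex int
        char notLineFeed = '\u0010';   // hex 10 = decimal 16, a control character

        System.out.println(codeOf(lineFeed));
        System.out.println(codeOf(viaCast));
        System.out.println(codeOf(notLineFeed));
    }
}
```

The first two values should match, and the third should differ from both.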

I learned two things today.

1.  If you use an HTML textarea for input data, post the data to the
server (through JSP), and repopulate the page via an XSLT transform, strip
the data of any '\r' characters.
2.  If you serialize your XML and store it in a CLOB column in a SQL
database, do not use the readLine method of BufferedReader to read the CLOB
character stream (to retrieve it from the CLOB for the purposes of parsing
it).  readLine discards the line terminators, so the line breaks in the
data are lost.
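
The two points above can be sketched roughly as follows.  This is only an
illustration, not the code from my application: the class and method names
(ClobTextDemo, stripCarriageReturns, readAll) are hypothetical, and a
StringReader stands in for the stream you would get from
Clob.getCharacterStream().

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ClobTextDemo {

    // Point 1: drop every carriage return so only '\n' line breaks remain.
    static String stripCarriageReturns(String s) {
        return s.replace("\r", "");
    }

    // Point 2: copy the character stream in chunks instead of calling
    // BufferedReader.readLine, which discards line terminators.
    static String readAll(Reader reader) {
        StringBuilder out = new StringBuilder();
        char[] buffer = new char[4096];
        try {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                out.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumption: the StringReader plays the role of the CLOB stream.
        String stored = "<note>line1\nline2</note>";
        String read = readAll(new StringReader(stored));
        System.out.println(read.equals(stored));
        System.out.println(stripCarriageReturns("a\r\nb").equals("a\nb"));
    }
}
```

Reading in chunks keeps every character of the stored document, so the
parser sees the same line breaks that were written.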

Corollary:  Bug 23147 in the Xalan bugzilla database is not a bug.  I
believe that carriage return handling changed at some point after Xalan
2.4.1 in order to better match the specification.  If you built an
application with an earlier version of Xalan, and your data uses carriage
returns, be very careful when upgrading and see points 1 and 2 above.


-----Original Message-----
From: Joseph Kesselman [mailto:keshlam@us.ibm.com] 
Sent: Wednesday, April 21, 2004 9:20 AM
To: xalan-j-users@xml.apache.org
Subject: RE: insert carriage return





> Hex 0x10 isn't a line feed (\n); *decimal* 10 (0xA) is.

That's correct.

> All whitespace (including \n) gets normalized to spaces.

*Not* correct, when you're working with the APIs. If you feed an XML-legal
character into the DOM or SAX, it should be converted to a numeric character
reference automatically by the serializer when you write it out as XML --
just as the parser converts the numeric character reference into that
character when you read it back in (modulo attribute-value-normalization
rules).
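
A rough way to check the round trip described above is to put a line feed
into a DOM text node, serialize it with the JAXP identity transformer, and
parse the result back.  This is only a sketch: the class and method names
(RoundTripDemo, roundTrip) are invented for illustration, and the exact
serialized form of the line feed depends on the serializer in use, so only
the value read back is compared.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class RoundTripDemo {

    // Builds <test>text</test>, serializes it, parses it back,
    // and returns the text content that the parser reports.
    static String roundTrip(String text) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.newDocument();
            Element test = doc.createElement("test");
            doc.appendChild(test);
            test.appendChild(doc.createTextNode(text));

            // A Transformer created without a stylesheet copies input to output.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            StringWriter xml = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(xml));

            Document parsed = builder.parse(
                    new InputSource(new StringReader(xml.toString())));
            return parsed.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "line1\nline2";
        System.out.println(original.equals(roundTrip(original)));
    }
}
```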

>>test.appendChild(doc.createTextNode("\u0010"));
>Replace "\u0010" with "&#xA;".

Also not correct. If you do so, the serializer will escape the & character;
you'll wind up with the text string "&amp;#xA;" or equivalent.
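
The escaping can be seen with a short sketch like the one below.  The class
and method names (EscapeDemo, serializeText) are hypothetical, and it uses
the JAXP identity transformer as the serializer, which is an assumption
about the setup rather than something stated in this thread.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class EscapeDemo {

    // Serializes <test>text</test> and returns the XML as a string.
    static String serializeText(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element test = doc.createElement("test");
            doc.appendChild(test);
            // The DOM stores these five characters literally; it does not
            // interpret them as a character reference.
            test.appendChild(doc.createTextNode(text));

            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter xml = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(xml));
            return xml.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The ampersand in the text node is written out as an entity.
        System.out.println(serializeText("&#xA;"));
    }
}
```") : "plain text is unchanged";
</test>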

You really should be able to use \n, or \u000A, and have the Right Thing
happen. If it doesn't, something is broken -- either the serializer isn't
writing the data out correctly, or whoever's downstream isn't reading it
back in correctly.

But you do have to distinguish decimal from hex!