
List:       xmlbeans-dev
Subject:    [jira] Resolved: (XMLBEANS-135) bad handling of embeded CDATA
From:       "Radu Preotiuc-Pietro (JIRA)" <xmlbeans-dev () xml ! apache ! org>
Date:       2005-04-25 23:36:35
Message-ID: 625691327.1114472195807.JavaMail.jira () ajax ! apache ! org

     [ http://issues.apache.org/jira/browse/XMLBEANS-135?page=all ]
     
Radu Preotiuc-Pietro resolved XMLBEANS-135:
-------------------------------------------

     Resolution: Fixed
    Fix Version: Version 2 Beta 2
                 Version 2
                     (was: TBD)

Implemented the simple fix I was describing on the dev@xmlbeans.apache.org mailing
list. It's definitely better than what we had and I actually think it covers the
issue.

> bad handling of embeded CDATA
> -----------------------------
> 
> Key: XMLBEANS-135
> URL: http://issues.apache.org/jira/browse/XMLBEANS-135
> Project: XMLBeans
> Type: Bug
> Versions: Version 1.0.3, Version 2 Beta 1, Version 1.0.4
> Environment: Encountered on Windows with JDK 1.4.2.
> Reporter: Martin Hamel
> Fix For: Version 2 Beta 2, Version 2

> 
> I have a case of bad XML. It is an envelope document that includes another
> document. The parser expects the enclosed document to be in CDATA. The problem
> is that the second document now includes a third document, which is also
> expected to be in CDATA.
> I create document A with an XMLBean. I put it as a text element of document B
> after transforming document A to a string with xmlText(). I then do the same
> with document B by putting it in document C. Everything works well and
> automatically, and it creates CDATA every time it needs to.
> //fragment
> XmlOptions options = new XmlOptions();
> options.setSavePrettyPrint();
> Field field = getAssessmentFields().addNewField();
> field.setFieldName("AssessmentContent");
> field.setFieldValue(answersDocument.xmlText(options));
> ..
> The problem is that on the second escaping, the ">" of the CDATA end (]]>) is
> escaped to "&gt;". The SAX parser that reads all this (Xalan) just can't handle
> it. Also, the specification says that a CDATA section must not contain another
> CDATA section.
> Here is the modification I made for embedded CDATA. Do you think it would be
> worth including?
> Here is the entitizeContent method in Saver.java:
> Pattern cdataPattern = Pattern.compile("CDATA");
> private void entitizeContent ( )
> {
>     if (_lastEmitCch == 0)
>         return;
>     int i = _lastEmitIn;
>     final int n = _buf.length;
>     boolean hasOutOfRange = false;
> 
>     int count = 0;
>     for ( int cch = _lastEmitCch ; cch > 0 ; cch-- )
>     {
>         char ch = _buf[ i ];
>         if (ch == '<' || ch == '&')
>             count++;
>         else if (isBadChar( ch ))
>             hasOutOfRange = true;
>         if (++i == n)
>             i = 0;
>     }
>     if (count == 0 && !hasOutOfRange)
>         return;
>     i = _lastEmitIn;
>     //
>     // Heuristic for knowing when to save out stuff as a CDATA.
>     //
> 
>     // Well check if we have a cdata in the buffer.
>     // If we do, we won't nest another one.
>     CharBuffer charBuffer = CharBuffer.wrap(_buf);
>     boolean hasCDATA = cdataPattern.matcher(charBuffer).find();
>     if (_lastEmitCch > 32 && count > 5 &&
>         count * 100 / _lastEmitCch > 1 && !hasCDATA)
>     {
>         boolean lastWasBracket = _buf[ i ] == ']';
>         i = replace( i, "<![CDATA[" + _buf[ i ] );
>         boolean secondToLastWasBracket = lastWasBracket;
>         lastWasBracket = _buf[ i ] == ']';
>         if (++i == _buf.length)
>             i = 0;
>         for ( int cch = _lastEmitCch ; cch > 0 ; cch-- )
>         {
>             char ch = _buf[ i ];
>             if (ch == '>' && secondToLastWasBracket && lastWasBracket)
>                 i = replace( i, "&gt;" );
>             else if (isBadChar( ch ))
>                 i = replace( i, "?" );
>             else
>                 i++;
>             secondToLastWasBracket = lastWasBracket;
>             lastWasBracket = ch == ']';
>             if (i == _buf.length)
>                 i = 0;
>         }
>         emit( "]]>" );
>     }
>     else
>     {
>         for ( int cch = _lastEmitCch ; cch > 0 ; cch-- )
>         {
>             char ch = _buf[ i ];
>             if (ch == '<')
>                 i = replace( i, "&lt;" );
>             else if (hasCDATA && ch == '>')
>                 i = replace(i, "&gt;");
>             else if (ch == '&')
>                 i = replace( i, "&amp;" );
>             else if (isBadChar( ch ))
>                 i = replace( i, "?" );
>             else
>                 i++;
>             if (i == _buf.length)
>                 i = 0;
>         }
>     }
> }
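
For readers who want to try the idea outside of Saver.java, here is a minimal,
self-contained sketch of the same decision on a plain String. It is not the
XMLBeans implementation: the class and method names (CdataEscapeSketch,
saveText, wrapInCdata, entitize) are hypothetical, it works on a String rather
than the saver's circular buffer, it looks for the literal "<![CDATA[" marker
instead of the word "CDATA", and it ignores the isBadChar handling. It also
swaps in a different technique for a "]]>" found inside CDATA text: rather than
writing "&gt;" inside the section, it splits the text across two CDATA
sections, since entity references are not expanded inside CDATA.

```java
final class CdataEscapeSketch {

    /** Entity-escapes '<' and '&', plus any '>' that would complete "]]>". */
    static String entitize(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (ch == '<') {
                out.append("&lt;");
            } else if (ch == '&') {
                out.append("&amp;");
            } else if (ch == '>' && i >= 2
                    && text.charAt(i - 1) == ']' && text.charAt(i - 2) == ']') {
                out.append("&gt;");
            } else {
                out.append(ch);
            }
        }
        return out.toString();
    }

    /** Wraps text in CDATA, splitting every "]]>" across two sections. */
    static String wrapInCdata(String text) {
        return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    /**
     * Chooses between CDATA and entity escaping using the thresholds from the
     * quoted patch, and never wraps text that already contains a CDATA marker.
     */
    static String saveText(String text) {
        int special = 0;
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (ch == '<' || ch == '&') {
                special++;
            }
        }
        boolean alreadyHasCdata = text.contains("<![CDATA[");
        boolean useCdata = text.length() > 32
                && special > 5
                && special * 100 / text.length() > 1
                && !alreadyHasCdata;
        return useCdata ? wrapInCdata(text) : entitize(text);
    }
}
```

With this sketch, a markup-heavy string that contains no CDATA marker is
wrapped in a single CDATA section, while a string that already carries a CDATA
section is entity-escaped instead, so a reader sees exactly one level of CDATA
at each nesting depth.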

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: dev-help@xmlbeans.apache.org

