[prev in list] [next in list] [prev in thread] [next in thread] 

List:       xml-cocoon-cvs
Subject:    Re: svn commit: r1878905 - /cocoon/branches/BRANCH_2_1_X/src/java/org/apache/cocoon/generation/Strea
From:       Antonio Gallardo <antonio () apache ! org>
Date:       2021-11-02 4:33:36
Message-ID: 19b69c7a-1cb3-9f1b-153e-ccec00923eb0 () apache ! org
[Download RAW message or body]

Hi all,

It has been long time since I wrote an email in a Cocoon mail list. 
First I want to say hello to all my cocoon fellows.

I am writing, because reviewing this commit, I see we are moving apart 
from the cocoon pools that allows us to control how many instances of a 
given components are being created. Please see my comments between lines 
below.

Best Regards,

Antonio Gallardo.

El 16/6/20 a las 17:04, cdamioli@apache.org escribió:
> Author: cdamioli
> Date: Tue Jun 16 23:04:30 2020
> New Revision: 1878905
> 
> URL: http://svn.apache.org/viewvc?rev=1878905&view=rev
> Log:
> Make StreamGenerator more robust
> 
> Modified:
> cocoon/branches/BRANCH_2_1_X/src/java/org/apache/cocoon/generation/StreamGenerator.java
>  
> Modified: cocoon/branches/BRANCH_2_1_X/src/java/org/apache/cocoon/generation/StreamGenerator.java
>                 
> URL: http://svn.apache.org/viewvc/cocoon/branches/BRANCH_2_1_X/src/java/org/apache/c \
> ocoon/generation/StreamGenerator.java?rev=1878905&r1=1878904&r2=1878905&view=diff \
>                 ==============================================================================
>                 
> --- cocoon/branches/BRANCH_2_1_X/src/java/org/apache/cocoon/generation/StreamGenerator.java \
>                 (original)
> +++ cocoon/branches/BRANCH_2_1_X/src/java/org/apache/cocoon/generation/StreamGenerator.java \
> Tue Jun 16 23:04:30 2020 @@ -16,6 +16,16 @@
> */
> package org.apache.cocoon.generation;
> 
> +import java.io.IOException;
> +import java.io.InputStreamReader;
> +import java.io.Reader;
> +import java.io.StringReader;
> +
> +import javax.servlet.http.HttpServletRequest;
> +import javax.xml.parsers.SAXParser;
> +import javax.xml.parsers.SAXParserFactory;
> +
> +import org.apache.avalon.framework.activity.Initializable;
> import org.apache.cocoon.ProcessingException;
> import org.apache.cocoon.ResourceNotFoundException;
> import org.apache.cocoon.environment.ObjectModelHelper;
> @@ -23,16 +33,11 @@ import org.apache.cocoon.environment.Req
> import org.apache.cocoon.environment.http.HttpEnvironment;
> import org.apache.cocoon.servlet.multipart.Part;
> import org.apache.cocoon.util.PostInputStream;
> -import org.apache.excalibur.xml.sax.SAXParser;
> +import org.xml.sax.ErrorHandler;
> import org.xml.sax.InputSource;
> import org.xml.sax.SAXException;
> -
> -import javax.servlet.http.HttpServletRequest;
> -
> -import java.io.IOException;
> -import java.io.InputStreamReader;
> -import java.io.Reader;
> -import java.io.StringReader;
> +import org.xml.sax.SAXParseException;
> +import org.xml.sax.XMLReader;
> 
> /**
> * @cocoon.sitemap.component.documentation
> @@ -64,7 +69,7 @@ import java.io.StringReader;
> * @author <a href="mailto:Kinga_Dziembowski@hp.com">Kinga Dziembowski</a>
> * @version CVS $Id$
> */
> -public class StreamGenerator extends ServiceableGenerator
> +public class StreamGenerator extends ServiceableGenerator implements Initializable
> {
> 
> /** The parameter holding the name associated with the xml data  **/
> @@ -73,6 +78,9 @@ public class StreamGenerator extends Ser
> /** The input source */
> private InputSource inputSource;
> 
> +    // don't use Excalibur's SAXParser to prevent XXE injection
> +    private SAXParserFactory factory;
> +
> /**
> * Recycle this component.
> * All instance variables are set to <code>null</code>.
> @@ -81,13 +89,20 @@ public class StreamGenerator extends Ser
> super.recycle();
> this.inputSource = null;
> }
> +
> +    public void initialize() throws Exception {
> +        factory = SAXParserFactory.newInstance();
> +        factory.setNamespaceAware(true);
> +        factory.setXIncludeAware(false);
> +        factory.setFeature("http://xml.org/sax/features/external-general-entities", \
> false); +        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", \
> false); +    }
> 
> /**
> * Generate XML data out of request InputStream.
> */
> public void generate()
> throws IOException, SAXException, ProcessingException {
> -        SAXParser parser = null;
> int len = 0;
> String contentType = null;
> 
> @@ -149,9 +164,16 @@ public class StreamGenerator extends Ser
> String charset =  getCharacterEncoding(request, contentType) ;
> if( charset != null) {
> this.inputSource.setEncoding(charset);
> -            }
> -            parser = (SAXParser)this.manager.lookup(SAXParser.ROLE);
> -            parser.parse(this.inputSource, super.xmlConsumer);


The above lines removes the part that was controlled by Cocoon. The 
manager allowed to defined how many instances of the Parser can be 
created. Very useful to handle how much memory the webapp will use.

Please see the place in cocoon.xconf when we define the parsers:
http://svn.apache.org/viewvc/cocoon/branches/BRANCH_2_1_X/src/webapp/WEB-INF/cocoon.xconf?view=markup#l363


Also check the place when we add to the manager the parser that is set 
in cocoon.xconf:
http://svn.apache.org/viewvc/cocoon/branches/BRANCH_2_1_X/src/java/org/apache/cocoon/Cocoon.java?view=markup#l298


I think, there should be a way to configure the new parser in 
cocoon.xconf instead of hard coding it in the stream generator. Please 
note the same code using the line:

parser = (SAXParser)this.manager.lookup(SAXParser.ROLE);

is used in many place in the coocon code.

> +            }
> +
> +            SAXParser parser = factory.newSAXParser();
> +            XMLReader xmlReader = parser.getXMLReader();
> +            xmlReader.setContentHandler(super.xmlConsumer);
> +            xmlReader.setProperty( \
> "http://xml.org/sax/properties/lexical-handler", super.xmlConsumer ); +            \
> xmlReader.setFeature( "http://xml.org/sax/features/namespaces", true ); +
> +            xmlReader.parse(this.inputSource);
> +

I am afraid that with the above code, we loose this control and perhaps 
we are creating a vector to shutdown the server by sending too many 
instances that create a lot of parsers.


> } catch (IOException e) {
> getLogger().error("StreamGenerator.generate()", e);
> throw new ResourceNotFoundException("StreamGenerator could not find resource", e);
> @@ -161,9 +183,7 @@ public class StreamGenerator extends Ser
> } catch (Exception e) {
> getLogger().error("Could not get parser", e);
> throw new ProcessingException("Exception in StreamGenerator.generate()", e);
> -        } finally {
> -            this.manager.release( parser);
> -        }
> +        }
> }
> 
> /**
> @@ -208,7 +228,7 @@ public class StreamGenerator extends Ser
> return extractCharset( contentType, idx );
> }
> }
> -
> +
> 
> protected String extractCharset(String contentType, int idx) {
> String charencoding = null;
> 


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic