
List:       xalan-dev
Subject:    [jira] [Commented] (XALANJ-2540) Very inefficient default behaviour
From:       "Murray Williams (JIRA)" <xalan-dev () xml ! apache ! org>
Date:       2011-06-16 16:07:47
Message-ID: 1425571410.11393.1308240467495.JavaMail.tomcat () hel ! zones ! apache ! org


    [ https://issues.apache.org/jira/browse/XALANJ-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050520#comment-13050520 ]

Murray Williams commented on XALANJ-2540:
-----------------------------------------

I get a similar improvement with my app when setting
-Dcom.sun.org.apache.xml.internal.dtm.DTMManager=com.sun.org.apache.xml.internal.dtm.ref.DTMManagerDefault


- Thanks!
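
Where the JVM command line cannot be changed, the same property could in principle be set programmatically before the first XPath evaluation. The following is a minimal sketch under that assumption: the class name DtmManagerProperty and the helper pinDtmManager() are hypothetical, and whether a given JDK still consults this property is not confirmed by this message.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DtmManagerProperty {
    // Property name and value taken from the comment above (JDK-internal Xalan copy).
    static final String KEY = "com.sun.org.apache.xml.internal.dtm.DTMManager";
    static final String VALUE = "com.sun.org.apache.xml.internal.dtm.ref.DTMManagerDefault";

    // Hypothetical helper: sets the property only if the user has not already chosen a value.
    static void pinDtmManager() {
        if (System.getProperty(KEY) == null) {
            System.setProperty(KEY, VALUE);
        }
    }

    public static void main(String[] args) throws Exception {
        // Must run before the first XPath evaluation to have any effect.
        pinDtmManager();

        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader("<a><b>hello</b></a>")));
        String result = (String) XPathFactory.newInstance().newXPath()
                .evaluate("/a/b", doc, XPathConstants.STRING);
        System.out.println(result);
    }
}
```

When Xalan is used from its own jar rather than from the JDK-internal copy, the org.apache.xml.dtm.DTMManager variant of the property quoted below would be the one to set.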

> Very inefficient default behaviour for looking up DTMManager
> ------------------------------------------------------------
> 
> Key: XALANJ-2540
> URL: https://issues.apache.org/jira/browse/XALANJ-2540
> Project: XalanJ2
> Issue Type: Improvement
> Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
> Components: DTM, XPath
> Affects Versions: 2.7.1, 2.7
> Reporter: Lukas Eder
> 
> I have analysed an issue that has been bothering me for some time. When executing
> XPath evaluations, a very significant amount of time is spent in the
> initialisation of the XPathContext. I have asked this question on Stack Overflow
> and answered it myself:
> http://stackoverflow.com/questions/6340802/java-xpath-apache-jaxp-implementation-performance
> I think the default behaviour of
> org.apache.xml.dtm.ObjectFactory.lookUpFactoryClassName() is quite sub-optimal and
> should be improved by caching the result statically. It is unlikely that this
> configuration is going to change once classes have been loaded. Hence, the fallback
> lookup of META-INF/services/org.apache.xml.dtm.DTMManager should only be done once.
> For reference, here's the question and answer again in JIRA:
> ----
> I have come to an astonishing conclusion that this:
> Element e = (Element) document.getElementsByTagName("SomeElementName").item(0);
> String result = ((Element) e).getTextContent();
> Seems to be an incredible 100x faster than this:
> // Accounts for 30%, can be cached
> XPathFactory factory = XPathFactory.newInstance();
> // Negligible
> XPath xpath = factory.newXPath();
> // Accounts for 70% (caching a compiled expression doesn't change much...)
> String result = (String) xpath.evaluate(
> "//SomeElementName", document, XPathConstants.STRING);
> I'm using the JVM's default implementation of JAXP:
> org.apache.xpath.jaxp.XPathFactoryImpl
> org.apache.xpath.jaxp.XPathImpl
> I'm really confused, because it's easy to see how JAXP could optimise the above
> XPath query to actually execute a simple getElementsByTagName() instead. But it
> doesn't seem to do that. This problem is limited to around 5-6 frequently used
> XPath calls, that are abstracted and hidden by an API. Those queries involve simple
> paths (e.g. /a/b/c, no variables, conditions) against an always available DOM
> Document only. So, if an optimisation can be done, it will be quite easy to
> achieve.
> ----
> I have debugged and profiled my test-case and Xalan/JAXP in general. I managed to
> identify the major problem in
> org.apache.xml.dtm.ObjectFactory.lookUpFactoryClassName(). Every one of the 10k
> test XPath evaluations led to the classloader trying to look up the DTMManager
> instance in some sort of default configuration. This configuration is not loaded
> into memory but accessed every time. Furthermore, this access seems to be
> protected by a lock on the ObjectFactory.class itself. When the access fails
> (which it does by default), the configuration is loaded from the xalan.jar file's
> META-INF/services/org.apache.xml.dtm.DTMManager configuration file. Every time!
> Fortunately, this behaviour can be overridden by specifying a JVM parameter like
> this:
> -Dorg.apache.xml.dtm.DTMManager=org.apache.xml.dtm.ref.DTMManagerDefault
> or
> -Dcom.sun.org.apache.xml.internal.dtm.DTMManager=com.sun.org.apache.xml.internal.dtm.ref.DTMManagerDefault
> So here's a performance improvement overview for 10k consecutive XPath evaluations
> of //SomeNodeName against a 90k XML file (measured with System.nanoTime()):
> measured library        : Xalan 2.7.0 | Xalan 2.7.1 | Saxon-HE 9.3 | jaxen 1.1.3
> --------------------------------------------------------------------------------
> without optimisation    :     10400ms |      4717ms |              |     25500ms
> reusing XPathFactory    :      5995ms |      2829ms |              |
> reusing XPath           :      5900ms |      2890ms |              |
> reusing XPathExpression :      5800ms |      2915ms |      16000ms |     25000ms
> adding the JVM param    :      1163ms |       761ms |        n/a   |
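
The "only be done once" proposal in the description could look roughly like the sketch below. It is not Xalan's actual code: the class CachedFactoryLookup, the stand-in slowLookup() and its fallback value are hypothetical and only illustrate memoising the resolved class name per factory id, so that the expensive classpath lookup runs once rather than on every XPath evaluation.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

public class CachedFactoryLookup {
    // Resolved class names, keyed by factory id; filled at most once per key.
    private static final ConcurrentMap<String, String> CACHE = new ConcurrentHashMap<>();

    // Counts how often the expensive path is taken (for demonstration only).
    static final AtomicInteger slowLookups = new AtomicInteger();

    // Hypothetical stand-in for the expensive lookup (system property first,
    // then a fallback). The fallback value is an assumption for this sketch.
    static String slowLookup(String factoryId) {
        slowLookups.incrementAndGet();
        String fromProperty = System.getProperty(factoryId);
        return fromProperty != null
                ? fromProperty
                : "org.apache.xml.dtm.ref.DTMManagerDefault";
    }

    // Memoised variant: computeIfAbsent runs slowLookup at most once per key.
    static String lookUpFactoryClassName(String factoryId) {
        return CACHE.computeIfAbsent(factoryId, CachedFactoryLookup::slowLookup);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10_000; i++) {
            lookUpFactoryClassName("org.apache.xml.dtm.DTMManager");
        }
        System.out.println("slow lookups: " + slowLookups.get()); // prints "slow lookups: 1"
    }
}
```

The trade-off is the one the description already names: a cached result ignores configuration changes made after the first lookup, which the reporter considers unlikely once classes have been loaded.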

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: xalan-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xalan-dev-help@xml.apache.org

