List: xalan-dev
Subject: [jira] [Commented] (XALANJ-2540) Very inefficient default behaviour for looking up DTMManager
From: "Matthew Broadhead (JIRA)" <jira () apache ! org>
Date: 2018-05-23 9:21:00
Message-ID: JIRA.12510313.1308059788000.13320.1527067260491 () Atlassian ! JIRA
[ https://issues.apache.org/jira/browse/XALANJ-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486962#comment-16486962 ]
Matthew Broadhead commented on XALANJ-2540:
-------------------------------------------
[~garydgregory] On your recommendation I have cloned 2.7.1, as that is the highest
maintenance release I can see. I have grepped for 2.7.3 but cannot see it
mentioned in any of the files.
There is no pom.xml or similar, so it looks like the build is manual?
[~msahyoun] Thanks, I have looked in ObjectFactory and see where it is doing
lookUpFactoryClassName(). Do you think it is possible to cache the result in a
singleton for future requests? Or might this cause clashes?
I could submit a patch for the singleton suggestion, but I am not sure how to build
and deploy the project.
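To make the caching idea concrete, here is a minimal sketch of what I mean. It is not Xalan's actual ObjectFactory code: the class CachedFactoryLookup, its method names, and the injected slow lookup function are all hypothetical stand-ins for the real lookUpFactoryClassName() path.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch only; not taken from the Xalan sources.
final class CachedFactoryLookup {
    // Resolved class names, keyed by factory id (e.g. "org.apache.xml.dtm.DTMManager").
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Stand-in for the expensive lookup (system property, properties file, META-INF scan).
    private final Function<String, String> slowLookup;
    // Counts how often the expensive path actually runs; exposed for illustration.
    final AtomicInteger slowLookupCalls = new AtomicInteger();

    CachedFactoryLookup(Function<String, String> slowLookup) {
        this.slowLookup = slowLookup;
    }

    // Returns the cached class name, running the expensive lookup at most once per id.
    String classNameFor(String factoryId) {
        return cache.computeIfAbsent(factoryId, id -> {
            slowLookupCalls.incrementAndGet();
            return slowLookup.apply(id);
        });
    }
}
```

On the clash question: a single process-wide cache like this ignores the context class loader, so in a container where different applications configure different DTMManager implementations, the cache would probably need to be keyed by class loader as well. That part is an assumption on my side and not something I have verified against the code.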
> Very inefficient default behaviour for looking up DTMManager
> ------------------------------------------------------------
>
> Key: XALANJ-2540
> URL: https://issues.apache.org/jira/browse/XALANJ-2540
> Project: XalanJ2
> Issue Type: Improvement
> Security Level: No security risk; visible to anyone (Ordinary problems in Xalan
> projects. Anybody can view the issue.)
> Components: DTM, XPath
> Affects Versions: 2.7.1, 2.7
> Reporter: Lukas Eder
> Priority: Major
>
> I have analysed an issue that has been bothering me for some time. When executing
> XPath evaluations, it looks like a very significant amount of time is spent in the
> initialisation of the XPathContext. I have asked this question on Stack Overflow
> and answered it myself:
> http://stackoverflow.com/questions/6340802/java-xpath-apache-jaxp-implementation-performance
> I think the default behaviour of
> org.apache.xml.dtm.ObjectFactory.lookUpFactoryClassName() is quite sub-optimal and
> should be improved, statically. I imagine it is unlikely that this configuration
> is going to change once classes have been loaded. Hence, the fallback lookup of
> META-INF/services/org.apache.xml.dtm.DTMManager should only be done once. For
> reference, here's the question and answer again in JIRA:
> ----
> I have come to an astonishing conclusion that this:
> Element e = (Element) document.getElementsByTagName("SomeElementName").item(0);
> String result = ((Element) e).getTextContent();
> Seems to be an incredible 100x faster than this:
> // Accounts for 30%, can be cached
> XPathFactory factory = XPathFactory.newInstance();
> // Negligible
> XPath xpath = factory.newXPath();
> // Accounts for 70% (caching a compiled expression doesn't change much...)
> String result = (String) xpath.evaluate(
> "//SomeElementName", document, XPathConstants.STRING);
> I'm using the JVM's default implementation of JAXP:
> org.apache.xpath.jaxp.XPathFactoryImpl
> org.apache.xpath.jaxp.XPathImpl
> I'm really confused, because it's easy to see how JAXP could optimise the above
> XPath query to actually execute a simple getElementsByTagName() instead. But it
> doesn't seem to do that. This problem is limited to around 5-6 frequently used
> XPath calls that are abstracted and hidden by an API. Those queries involve simple
> paths (e.g. /a/b/c, no variables, no conditions) against an always available DOM
> Document only. So, if an optimisation can be done, it will be quite easy to
> achieve.
> ----
> I have debugged and profiled my test case and Xalan/JAXP in general. I managed to
> identify the major problem in
> org.apache.xml.dtm.ObjectFactory.lookUpFactoryClassName(). It can be seen that every
> one of the 10k test XPath evaluations led to the classloader trying to look up the
> DTMManager instance in some sort of default configuration. This configuration is
> not loaded into memory but accessed every time. Furthermore, this access seems to
> be protected by a lock on ObjectFactory.class itself. When the access fails (by
> default), the configuration is loaded from the xalan.jar file's
> META-INF/services/org.apache.xml.dtm.DTMManager configuration file. Every time!
> Fortunately, this behaviour can be overridden by specifying a JVM parameter like
> this:
> -Dorg.apache.xml.dtm.DTMManager=org.apache.xml.dtm.ref.DTMManagerDefault
> or
> -Dcom.sun.org.apache.xml.internal.dtm.DTMManager=com.sun.org.apache.xml.internal.dtm.ref.DTMManagerDefault
> So here's a performance improvement overview for 10k consecutive XPath evaluations
> of //SomeNodeName against a 90k XML file (measured with System.nanoTime()):
> measured library        : Xalan 2.7.0 | Xalan 2.7.1 | Saxon-HE 9.3 | jaxen 1.1.3
> --------------------------------------------------------------------------------
> without optimisation    :     10400ms |      4717ms |              |     25500ms
> reusing XPathFactory    :      5995ms |      2829ms |              |
> reusing XPath           :      5900ms |      2890ms |              |
> reusing XPathExpression :      5800ms |      2915ms |      16000ms |     25000ms
> adding the JVM param    :      1163ms |       761ms |          n/a |
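
For anyone who cannot change the JVM command line, the two cheapest rows of the table above (reusing the compiled expression and adding the JVM parameter) can be approximated in code. This is a sketch under assumptions: the class XPathReuseSketch and its methods are hypothetical, the property name and value are copied from the issue text, and whether the property has any effect depends on which XPath implementation and version is actually on the classpath.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper; not part of Xalan or JAXP.
final class XPathReuseSketch {
    // Compiled once and reused. XPathExpression is not thread-safe,
    // so this shared instance must only be used from one thread at a time.
    private static final XPathExpression EXPR;

    static {
        // Programmatic equivalent of the -D parameter quoted in the issue.
        // Assumption: it must be set before the first XPath evaluation to matter.
        System.setProperty("org.apache.xml.dtm.DTMManager",
                "org.apache.xml.dtm.ref.DTMManagerDefault");
        try {
            EXPR = XPathFactory.newInstance().newXPath().compile("//SomeElementName");
        } catch (XPathExpressionException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Returns the string value of the first matching element, or "" if none matches.
    static String firstText(Document doc) throws XPathExpressionException {
        return (String) EXPR.evaluate(doc, XPathConstants.STRING);
    }

    // Small convenience for trying the sketch against an in-memory XML string.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }
}
```

Calling XPathReuseSketch.firstText(doc) repeatedly then avoids re-creating the XPathFactory and XPath on every evaluation, which is where the table shows the first large drop.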
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)