List:       grinder-use
Subject:    Re: [Grinder-use] Python re versus Java Pattern
From:       olivier merlin <omerlin13 () gmail ! com>
Date:       2011-12-02 15:08:51
Message-ID: CAFjxuZNHMn5hcbZjGcJUHsUVKLWAve_0DUHJPVm56Q_PvZgT=Q () mail ! gmail ! com


Hi Richard,

The answer is c)!

I ran some tests because I use the re module a lot myself in The Grinder.
I get the same results as you.
The re module's search function is incredibly slow, because its string
handling takes a very long time (I profiled it).


With Jython:
         201 function calls (189 primitive calls) in 0.952 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.922    0.922 <string>:0(?)
        0    0.000             0.000          profile:0(profiler)
        1    0.030    0.030    0.952    0.952 profile:0(testref(homepage_html))
        1    0.922    0.922    0.922    0.922 pvj.py:9(testref)
        1    0.000    0.000    0.000    0.000 sre.py:178(compile)
        1    0.000    0.000    0.000    0.000 sre.py:217(_compile)
      6/1    0.000    0.000    0.000    0.000 sre_compile.py:21(_compile)
        4    0.000    0.000    0.000    0.000 sre_compile.py:296(_simple)
        1    0.000    0.000    0.000    0.000 sre_compile.py:303(_compile_info)
        1    0.000    0.000    0.000    0.000 sre_compile.py:412(_code)
        1    0.000    0.000    0.000    0.000 sre_compile.py:427(compile)
       12    0.000    0.000    0.000    0.000 sre_parse.py:133(__len__)
       22    0.000    0.000    0.000    0.000 sre_parse.py:137(__getitem__)
        4    0.000    0.000    0.000    0.000 sre_parse.py:139(__setitem__)
        4    0.000    0.000    0.000    0.000 sre_parse.py:141(__getslice__)
       31    0.000    0.000    0.000    0.000 sre_parse.py:145(append)
     10/5    0.000    0.000    0.000    0.000 sre_parse.py:147(getwidth)
        1    0.000    0.000    0.000    0.000 sre_parse.py:183(__init__)
       39    0.000    0.000    0.000    0.000 sre_parse.py:187(_Tokenizer__next)
        9    0.000    0.000    0.000    0.000 sre_parse.py:200(match)
       37    0.000    0.000    0.000    0.000 sre_parse.py:206(get)
      2/1    0.000    0.000    0.000    0.000 sre_parse.py:313(_parse_sub)
      2/1    0.000    0.000    0.000    0.000 sre_parse.py:368(_parse)
        1    0.000    0.000    0.000    0.000 sre_parse.py:613(parse)
        1    0.000    0.000    0.000    0.000 sre_parse.py:75(__init__)
        1    0.000    0.000    0.000    0.000 sre_parse.py:80(opengroup)
        1    0.000    0.000    0.000    0.000 sre_parse.py:91(closegroup)
        6    0.000    0.000    0.000    0.000 sre_parse.py:98(__init__)

With Python:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      120    0.000    0.000    0.000    0.000 :0(append)
        1    0.000    0.000    0.000    0.000 :0(compile)
        1    0.000    0.000    0.000    0.000 :0(get)
       26    0.000    0.000    0.000    0.000 :0(getlower)
        1    0.000    0.000    0.000    0.000 :0(group)
       27    0.000    0.000    0.000    0.000 :0(isinstance)
        1    0.000    0.000    0.000    0.000 :0(items)
  108/104    0.000    0.000    0.000    0.000 :0(len)
       12    0.000    0.000    0.000    0.000 :0(min)
       26    0.000    0.000    0.000    0.000 :0(ord)
        1    0.000    0.000    0.000    0.000 :0(remove)
        1    0.739    0.739    0.739    0.739 :0(search)
        1    0.002    0.002    0.002    0.002 :0(setprofile)
        1    0.000    0.000    0.743    0.743 <string>:1(<module>)
        0    0.000             0.000          profile:0(profiler)
        1    0.000    0.000    0.744    0.744 profile:0(testref(homepage_html))
        1    0.000    0.000    0.743    0.743 pvj.py:9(testref)
        1    0.000    0.000    0.004    0.004 re.py:188(compile)
        1    0.000    0.000    0.004    0.004 re.py:229(_compile)
      6/1    0.001    0.000    0.001    0.001 sre_compile.py:21(_compile)
        4    0.000    0.000    0.000    0.000 sre_compile.py:296(_simple)
        1    0.000    0.000    0.000    0.000 sre_compile.py:303(_compile_info)
        1    0.000    0.000    0.001    0.001 sre_compile.py:412(_code)
        1    0.000    0.000    0.004    0.004 sre_compile.py:427(compile)
       16    0.000    0.000    0.000    0.000 sre_parse.py:126(__len__)
       26    0.000    0.000    0.000    0.000 sre_parse.py:130(__getitem__)
        4    0.000    0.000    0.000    0.000 sre_parse.py:134(__setitem__)
       31    0.000    0.000    0.000    0.000 sre_parse.py:138(append)
     10/5    0.000    0.000    0.000    0.000 sre_parse.py:140(getwidth)
        1    0.000    0.000    0.000    0.000 sre_parse.py:178(__init__)
       39    0.000    0.000    0.001    0.000 sre_parse.py:182(__next)
        9    0.000    0.000    0.000    0.000 sre_parse.py:195(match)
       37    0.000    0.000    0.001    0.000 sre_parse.py:201(get)
      2/1    0.000    0.000    0.002    0.002 sre_parse.py:301(_parse_sub)
      2/1    0.001    0.000    0.002    0.002 sre_parse.py:379(_parse)
        1    0.000    0.000    0.002    0.002 sre_parse.py:663(parse)
        1    0.000    0.000    0.000    0.000 sre_parse.py:67(__init__)
        1    0.000    0.000    0.000    0.000 sre_parse.py:72(opengroup)
        1    0.000    0.000    0.000    0.000 sre_parse.py:83(closegroup)
        6    0.000    0.000    0.000    0.000 sre_parse.py:90(__init__)

The problem is not a Python-versus-Jython difference: both profiles spend
nearly all of their time in the search itself (0.922 s under Jython,
0.739 s under Python).
The real problem is string handling in Python in general.

The Java implementation is extremely fast: the difference is not a factor
of 500 but much more!

Conclusion: I will use ONLY Java when dealing with regex in
Jython/Grinder.
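
A rough sketch of what that could look like in a script is below. The
pattern and the helper name first_href are hypothetical, for illustration
only. Under Jython it uses java.util.regex.Pattern; the fallback to re is
there only so the same file also runs under plain Python, where the Java
classes are not available.

```python
try:
    # Only importable when running under Jython (for example in The Grinder).
    from java.util.regex import Pattern
    HAVE_JAVA_REGEX = True
except ImportError:
    # Plain Python: fall back to the re module so the sketch still runs.
    import re
    HAVE_JAVA_REGEX = False

# Hypothetical pattern: capture the target of the first link on a page.
HREF_REGEX = r'<a href="([^"]+)">'

# Compile once at import time and reuse for every call.
if HAVE_JAVA_REGEX:
    _compiled = Pattern.compile(HREF_REGEX)
else:
    _compiled = re.compile(HREF_REGEX)


def first_href(html):
    """Return the first captured href, or None when nothing matches."""
    if HAVE_JAVA_REGEX:
        matcher = _compiled.matcher(html)
        if matcher.find():
            return matcher.group(1)
        return None
    match = _compiled.search(html)
    if match:
        return match.group(1)
    return None


print(first_href('<p><a href="/login">Log in</a></p>'))
```

Pattern.compile, Matcher.find and Matcher.group are the standard
java.util.regex calls; everything else here is an assumption about how a
test script might be organised.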

Many thanks
Olivier


2011/11/30 Richard Lynch <Richard.Lynch@rasmussen.edu>

> So I was using java.util.regex, but saw an article somewhere about using
> Python's libraries such as 're'.
>
> Philip Aston's help (thanks!) made it possible to use Python's 're', but...
>
> At least one of the following statements must be true:
>
> a) I'm doing something incredibly stupid
> b) The Grinder is broken
> c) Python's 're' (called via grinder->java->jython.jar) is 500+ *times*
> *slower* than Java Pattern
>
> See my TestRunner and properties and output at:
> http://www.6112northwolcott.com/pvj/
>
> Does grinder->java->jython->python really add that much overhead? A factor
> of 500 for the "same" functionality?..
>
> Or is Python really inherently that much slower than Java?
>
> NOTE:
> I'm a PHP guy, completely out of my element, so a) is the most likely
> answer, even if I can't see what I've done stupidly at the moment...
>
>
>
> ------------------------------------------------------------------------------
> All the data continuously generated in your IT infrastructure
> contains a definitive record of customers, application performance,
> security threats, fraudulent activity, and more. Splunk takes this
> data and makes sense of it. IT sense. And common sense.
> http://p.sf.net/sfu/splunk-novd2d
> _______________________________________________
> grinder-use mailing list
> grinder-use@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/grinder-use
>



