[prev in list] [next in list] [prev in thread] [next in thread]
List: ruby-talk
Subject: Re: Count substrings in string, scan too slow
From: brabuhr () gmail ! com
Date: 2010-06-30 2:04:16
Message-ID: AANLkTikQyH2W3V2qzg8xVpcCIRLbU36qJUb5Fuvq0d5H () mail ! gmail ! com
[Download RAW message or body]
On Tue, Jun 29, 2010 at 3:13 PM, Charles Oliver Nutter
<headius@headius.com> wrote:
> On Thu, Jun 24, 2010 at 2:48 PM, <brabuhr@gmail.com> wrote:
>> x.report 'boyer_moore' do
>> TIMES.times do
>> count = BoyerMoore.match("yo", s).size
>> check count
>> end
>> end
>
> FYI, a large part of the overhead here is probably the Java calls,
> which are a bit slower than Ruby to Ruby calls (plus it's decoding the
> "yo" string to UTF-16 each call). For a larger string and fewer calls,
> the pure Java BoyerMoore performance would likely benchmark a lot
> better than this.
I had a similar suspicion and had started a modified benchmark doing
fewer loops over larger data, but had to move on to other things.
This gives me a chance to try out the JRuby Mac Installer...
Original benchmark:
jruby 1.5.0 (ruby 1.8.7 patchlevel 249) (2010-05-12 6769999) (Java
HotSpot(TM) 64-Bit Server VM 1.6.0_20) [x86_64-java]
Rehearsal -----------------------------------------------
scan 8.851000 0.000000 8.851000 ( 8.784000)
scan ++ 14.186000 0.000000 14.186000 ( 14.186000)
scan re 8.594000 0.000000 8.594000 ( 8.594000)
scan re ++ 15.558000 0.000000 15.558000 ( 15.558000)
while 8.102000 0.000000 8.102000 ( 8.101000)
strscan 14.023000 0.000000 14.023000 ( 14.023000)
boyer_moore 7.446000 0.000000 7.446000 ( 7.446000)
------------------------------------- total: 76.760000sec
user system total real
scan 8.157000 0.000000 8.157000 ( 8.157000)
scan ++ 13.953000 0.000000 13.953000 ( 13.953000)
scan re 8.346000 0.000000 8.346000 ( 8.346000)
scan re ++ 15.332000 0.000000 15.332000 ( 15.333000)
while 8.087000 0.000000 8.087000 ( 8.087000)
strscan 14.303000 0.000000 14.303000 ( 14.303000)
boyer_moore 6.885000 0.000000 6.885000 ( 6.885000)
Even with the Ruby to Java call overhead, the Java BoyerMoore is
coming back the fastest on this machine. For comparison:
ruby 1.8.7 (2009-06-12 patchlevel 174) [universal-darwin10.0]
Rehearsal ----------------------------------------------
scan 31.030000 0.020000 31.050000 ( 31.094718)
scan ++ 62.310000 0.900000 63.210000 ( 63.227271)
scan re 31.030000 0.030000 31.060000 ( 31.110528)
scan re ++ 62.820000 0.870000 63.690000 ( 63.718876)
while 26.090000 0.020000 26.110000 ( 26.095308)
strscan 28.440000 0.010000 28.450000 ( 28.485140)
----------------------------------- total: 243.570000sec
user system total real
scan 31.240000 0.020000 31.260000 ( 31.264699)
scan ++ 64.000000 0.860000 64.860000 ( 64.865223)
scan re 31.570000 0.020000 31.590000 ( 31.581045)
scan re ++ 64.180000 0.980000 65.160000 ( 65.401667)
while 26.580000 0.030000 26.610000 ( 26.757658)
strscan 28.730000 0.030000 28.760000 ( 28.831860)
Unfortunately, I do not have 1.9.x on this machine at the moment.
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic