[prev in list] [next in list] [prev in thread] [next in thread] 

List:       ruby-talk
Subject:    Re: Count substrings in string, scan too slow
From:       brabuhr () gmail ! com
Date:       2010-06-30 2:04:16
Message-ID: AANLkTikQyH2W3V2qzg8xVpcCIRLbU36qJUb5Fuvq0d5H () mail ! gmail ! com
[Download RAW message or body]

On Tue, Jun 29, 2010 at 3:13 PM, Charles Oliver Nutter
<headius@headius.com> wrote:
> On Thu, Jun 24, 2010 at 2:48 PM,  <brabuhr@gmail.com> wrote:
>>  x.report 'boyer_moore' do
>>    TIMES.times do
>>      count = BoyerMoore.match("yo", s).size
>>      check count
>>    end
>>  end
>
> FYI, a large part of the overhead here is probably the Java calls,
> which are a bit slower than Ruby to Ruby calls (plus it's decoding the
> "yo" string to UTF-16 each call). For a larger string and fewer calls,
> the pure Java BoyerMoore performance would likely benchmark a lot
> better than this.

I had a similar suspicion and had started a modified benchmark doing
fewer loops over larger data, but had to move on to other things.

This gives me a chance to try out the JRuby Mac Installer...

Original benchmark:

jruby 1.5.0 (ruby 1.8.7 patchlevel 249) (2010-05-12 6769999) (Java
HotSpot(TM) 64-Bit Server VM 1.6.0_20) [x86_64-java]
Rehearsal -----------------------------------------------
scan          8.851000   0.000000   8.851000 (  8.784000)
scan ++      14.186000   0.000000  14.186000 ( 14.186000)
scan re       8.594000   0.000000   8.594000 (  8.594000)
scan re ++   15.558000   0.000000  15.558000 ( 15.558000)
while         8.102000   0.000000   8.102000 (  8.101000)
strscan      14.023000   0.000000  14.023000 ( 14.023000)
boyer_moore   7.446000   0.000000   7.446000 (  7.446000)
------------------------------------- total: 76.760000sec

                  user     system      total        real
scan          8.157000   0.000000   8.157000 (  8.157000)
scan ++      13.953000   0.000000  13.953000 ( 13.953000)
scan re       8.346000   0.000000   8.346000 (  8.346000)
scan re ++   15.332000   0.000000  15.332000 ( 15.333000)
while         8.087000   0.000000   8.087000 (  8.087000)
strscan      14.303000   0.000000  14.303000 ( 14.303000)
boyer_moore   6.885000   0.000000   6.885000 (  6.885000)

Even with the Ruby to Java call overhead, the Java BoyerMoore is
coming back the fastest on this machine.  For comparison:

ruby 1.8.7 (2009-06-12 patchlevel 174) [universal-darwin10.0]
Rehearsal ----------------------------------------------
scan        31.030000   0.020000  31.050000 ( 31.094718)
scan ++     62.310000   0.900000  63.210000 ( 63.227271)
scan re     31.030000   0.030000  31.060000 ( 31.110528)
scan re ++  62.820000   0.870000  63.690000 ( 63.718876)
while       26.090000   0.020000  26.110000 ( 26.095308)
strscan     28.440000   0.010000  28.450000 ( 28.485140)
----------------------------------- total: 243.570000sec

                 user     system      total        real
scan        31.240000   0.020000  31.260000 ( 31.264699)
scan ++     64.000000   0.860000  64.860000 ( 64.865223)
scan re     31.570000   0.020000  31.590000 ( 31.581045)
scan re ++  64.180000   0.980000  65.160000 ( 65.401667)
while       26.580000   0.030000  26.610000 ( 26.757658)
strscan     28.730000   0.030000  28.760000 ( 28.831860)

Unfortunately, I do not have 1.9.x on this machine at the moment.


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic