List: pypy-dev
Subject: Re: [pypy-dev] deepcopy slower in PyPY ?!
From: David Fraser <davidf () sjsoft ! com>
Date: 2011-09-08 7:57:54
Message-ID: 62768815-4ddc-43c4-9b3b-2d375dccf5a5 () jackdaw ! local
[Download RAW message or body]
On Wednesday, September 7, 2011 at 10:38:10 PM, Maciej Fijalkowski <fijall@gmail.com> wrote:
> On Wed, Sep 7, 2011 at 8:16 PM, Antonio Cuni <anto.cuni@gmail.com>
> wrote:
> > Hi Jorge,
> >
> > On 07/09/11 16:43, Jorge de Jesus wrote:
> >>
> >> Hi to all
> >>
> >> I've benchmarked/profiled some code (PyWPS API) and PyPy-c is 2-3x
> >> slower than CPython. This was done in a virtual machine using
> >> x86_64.
> >>
> >> The code being benchmarked spends most of the time making calls to
> >> copy/deepcopy. I've found that this was an issue in PyPy 1.6
> >> (https://bugs.pypy.org/issue767), but the issue has been closed.
> >> So I've downloaded the latest dev version, but PyPy-c continues to
> >> be slow compared to CPython.
> >
> > could you please send us a benchmark which showcases the problem?
> > The smaller the better; ideally, a benchmark contained in a single
> > file is easier to run and debug than one which requires downloading
> > lots of code from the internet.
>
> the internet is not the problem here ;-)
So here's my benchmark of doing a copy.deepcopy of the internet - or at least, of the ipv4 address space... (unfortunately it needs to download that, but it caches the download if possible and doesn't time it)
In this case it's only testing copying nested xml elementtree nodes, and some basic dicts. It actually shows a remarkable improvement in pypy; here are the average speeds per copy for 100 and 1000 repeats (showing how the JIT kicks in in pypy):
executable   repeats  etree (ms/copy)  dicts (ms/copy)
cpython2.6       100            37.17             3.98
cpython2.6      1000            36.42             3.97
cpython2.7       100            58.10             4.38
cpython2.7      1000            57.29             4.06
cpython3.2       100            57.41             3.61
cpython3.2      1000            56.98             3.68
pypy1.5.0        100            32.08             1.34
pypy1.5.0       1000            25.54             1.11
pypy1.6.0        100            25.89             1.17
pypy1.6.0       1000            16.32             0.81
So, pypy can even speed up copying the internet :)
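Incidentally, the warmup effect in the table can be reproduced without downloading anything: timing deepcopy in successive batches makes the per-copy time visibly drop once the JIT kicks in. A minimal single-file sketch (the nested records here are made-up stand-ins for the IANA registry data, not the real thing):

```python
#!/usr/bin/env python
# Minimal sketch: time copy.deepcopy of a nested structure in successive
# batches; under a JIT the per-copy time should drop between batches.
# The records below are hypothetical stand-ins for the IANA registry data.
import copy
import timeit

records = [{"prefix": "%03d/8" % i,
            "designation": "x" * 20,
            "status": {"kind": "ALLOCATED", "year": 1990 + i % 20}}
           for i in range(256)]

for batch in range(3):
    # 100 copies per batch; report the average cost of one copy
    elapsed = timeit.timeit(lambda: copy.deepcopy(records), number=100)
    print("batch %d: %0.3f ms/copy" % (batch, elapsed * 1000 / 100))
```

Running this under CPython gives roughly flat batch times, while a JIT interpreter should show the later batches running faster than the first.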
Cheers
David
["benchmark_internet.py" (text/x-python)]
#!/usr/bin/env python
import os
try:
    from urllib2 import urlopen
except ImportError:
    from urllib.request import urlopen
from xml.etree import ElementTree
import copy
import timeit
import sys

print("internet copying benchmark: testing %s" % sys.version)

def get_url(url, cache_filename):
    if os.path.exists(cache_filename):
        print("using cached data")
        return open(cache_filename, "rb").read()
    else:
        print("downloading data")
        data = urlopen(url).read()
        print("caching data")
        open(cache_filename, "wb").write(data)
        return data

ipv4_address_space_source = get_url(
    "http://www.iana.org/assignments/ipv4-address-space/ipv4-address-space.xml",
    "ipv4-address-space.xml")
print("parsing data")
ipv4_address_space_etree = ElementTree.fromstring(ipv4_address_space_source)
ipv4_address_space_dicts = [
    dict((child.tag, child.text) for child in record.getchildren())
    for record in ipv4_address_space_etree.findall(
        "{http://www.iana.org/assignments}record")]
# put these somewhere accessible to the timeit setup statements
copy.ipv4_address_space_etree = ipv4_address_space_etree
copy.ipv4_address_space_dicts = ipv4_address_space_dicts

repeats = 1000 if len(sys.argv) < 2 else int(sys.argv[1])
print("copying data (timing %d repeats)" % repeats)
etree_speed = timeit.timeit(
    'copy.deepcopy(e)',
    'import copy; e = copy.ipv4_address_space_etree',
    number=repeats) * 1000 / repeats
print("etree: %0.2f ms/copy" % etree_speed)
dicts_speed = timeit.timeit(
    'copy.deepcopy(d)',
    'import copy; d = copy.ipv4_address_space_dicts',
    number=repeats) * 1000 / repeats
print("dicts: %0.2f ms/copy" % dicts_speed)
with open("results.txt", "a") as f:
    f.write("%s\t%d\t%0.2f\t%0.2f\n" % (sys.executable, repeats,
                                        etree_speed, dicts_speed))
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
http://mail.python.org/mailman/listinfo/pypy-dev