[prev in list] [next in list] [prev in thread] [next in thread]
List: python-capi-sig
Subject: [capi-sig]Re: fixedint patch, proof of concept working now
From: Robert Bradshaw <robertwb () math ! washington ! edu>
Date: 2018-09-19 8:48:39
Message-ID: CADiQ+QAirnjUWD8Nk35+N_7REieCPQqJD2ciom+V2knRQPpFZA () mail ! gmail ! com
[Download RAW message or body]
On Wed, Sep 19, 2018 at 10:15 AM INADA Naoki <songofacandy@gmail.com> wrote:
> >
> > I did consider trying to use a second tag for short strings. Not
> > sure it will help too much as some quick analysis shows that only
> > 25% of strings used by PyDict_GetItem are short enough to fit.
> >
> > This morning I dreamed up new idea: analyze normal Python programs
> > and build a list of strings commonly used for PyDict_GetItem. They
> > will be strings like "self",
"self" is almost never referred to by name (it's an indexed local in the
bytecode)
> builtin functions names, etc.
Strings encountered in CPython compilation are already interend.
Fortunately, this should make the experiment easier--you could just plug
into that framework to decide which strings to represent with tagged
pointers rather than having to come up with your own heuristic.
> Then use
> > a tagged pointer to hold these common strings. I.e. a tag to denote
> > a string (or interned symbol, in Lisp speak) and an integer which is
> > the offset into the fixed array of interned strings. The savings
> > would have to come from avoiding the INCREF/DECREF accounting of
> > refcounts on those strings.
Given that branching to avoid the refcounting is more expensive than doing
the refcounting itself, this might be hard, but maybe there's a slight win
if we're already doing this branching for ints, etc. Or if "normal"
refcounts become more expensive than increment/decrements (e.g. truly
threadsafe). Unlike ints, most strings are not ephemeral, and nor can the
"contents" be accessed without a deference, so tagged pointers are less of
a win here.)
Instead of fixed set of strings,
> > perhaps we could make the intern process dynamically allocate the
> > tag IDs. We could have a specialized lookdict that works for dicts
> > containing only interned strings.
>
> It's very nice idea!
>
CPython already optimizes for dicts containing only strings; given that
most of the keys in this dict are interned, I'm not sure it would be a big
win to disallow non-interned keys (which would primarily speed up the
atypical key-not-found path).
Still, these are interesting experiments; thanks for exploring and sharing!
_______________________________________________
capi-sig mailing list -- capi-sig@python.org
To unsubscribe send an email to capi-sig-leave@python.org
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic