[prev in list] [next in list] [prev in thread] [next in thread]
List: python-ideas
Subject: Re: [Python-ideas] sequence.apply(function)
From: Yuval Greenfield <ubershmekel () gmail ! com>
Date: 2012-09-02 8:50:07
Message-ID: CANSw7KyaNwM9J0iyRASOSRWhejjUDxBB26MX1EUsb4FUZCvS+g () mail ! gmail ! com
[Download RAW message or body]
[Attachment #2 (multipart/alternative)]
On Sun, Sep 2, 2012 at 5:14 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
> On Sun, Sep 2, 2012 at 8:02 AM, Antoine Pitrou <solipsis@pitrou.net>
> wrote:
> > On Sun, 2 Sep 2012 00:55:39 +0300
> > Yuval Greenfield <ubershmekel@gmail.com>
> > wrote:
> >> On Sat, Sep 1, 2012 at 8:06 PM, Guido van Rossum <guido@python.org>
> wrote:
> >>
> >> > It's less Pythonic, because every sequence-like type (not just list)
> >> > would have to reimplement it.
> >> >
> >> > Similar things get proposed for iterators (e.g. it1 + it2, it[:n],
> >> > it[n:]) regularly and they are (and should be) rejected for the same
> >> > reason.
> >> >
> >> >
> >> Python causes some confusion because some things are methods and others
> >> builtins. Is there a PEP or rationale that defines what goes where?
> >
> > When something only applies to a single type or a couple of types, it is
> > a method. When it is generic enough, it is a builtin.
> > Of course there are grey areas but that's the basic idea.
>
> Yes, it comes down to the fact that we are *very* reluctant to impose
> required base classes (I believe the only ones currently enforced
> anywhere are object, BaseException and str - everything else should
> fall back to a protocol method, ABC or interface specific registration
> mechanism. Most interfaces that used to require actual integer objects
> are now using operator.index, or one of its C API equivalents).
>
> In Python, we also actively discourage "reopening" classes to add new
> methods (this is mostly a cultural thing, though - the language
> doesn't actually contain any mechanism to stop you by default,
> although it's possible to add such enforcement via metaclasses)
>
> Thus, protocols are born which define "has this behaviour", rather
> than "is one of these". That's why we have the len() builtin and
> associated __len__() protocol to say "taking the length of this object
> is a meaningful operation" rather than mandatory inheritance from a
> Container class that has a ".len()" method.
>
> They're most obviously beneficial when there are *multiple* protocols
> that can be used to implement a particular behaviour. For example,
> with iter(), the __iter__ protocol is only the first option tried. If
> that fails, then it will instead check for __getitem__ and if that
> exists, return a standard sequence iterator instead. Similarly,
> reversed() checks for __reversed__ first, and then checks for __len__
> and __getitem__, producing a reverse sequence iterator in the latter
> case.
>
> Similarly, next() was moved from a standard method to a builtin
> function in 3.x? Why? Mainly to add the "if not found, return this
> default value" behaviour. That kind of thing is much easier to add
> when the object is only handling a piece of the behaviour, with
> additional standard mechanisms around it (in this case, optionally
> returning a default value when StopIteration is thrown by the
> iterator).
>
> Generators are another good illustration of the principle: For iter()
> and next(), they follow the standard protocol and rely on the
> corresponding builtins. However, g.send() and g.throw() require deep
> integration with the interpreter's eval loop. There's currently no way
> to implement either of those behaviours as an ordinary type, thus
> they're exposed as ordinary methods, since they're genuinely generator
> specific.
>
> As to *why* this is a good thing: procedural APIs encourage low
> coupling. Yes, object oriented programming is a good way to scale an
> application architecture up to more complicated problems. The issue is
> with fetishising OOP to the point where you disallow the creation of
> procedural APIs that hide the OOP details. That approach sets a
> minimum floor to the complexity of your implementations, as even if
> you don't *need* the power of OOP, you're forced to deal with it
> because the language doesn't offer anything else, and that way lies
> Java. There's a reason Java is significantly more popular on large
> enterprise projects than it is in small teams - it takes a certain,
> rather high, level of complexity for the reasons behind any of that
> boilerplate to start to become clear :)
>
> Cheers,
> Nick.
>
>
Thanks, that's some interesting reasoning.
Maybe I'm old fashioned but I like running dir(x) to find out what an
object can do, and the wall of double underscores is hard to read.
Perhaps we could add to the inspect module a "dirprotocols" function which
returns a list of builtins that can be used on an object. I see that the
builtins are listed in e.g. help([]) but on user defined classes it might
be less obvious. Maybe we could just add a dictionary:
inspect.special_methods = {'__len__': len,
'__getitem__': 'x.__getitem__(y) <==> x[y]',
'__iter__': iter,
... }
and then dirprotocols would be easy to implement.
Yuval
[Attachment #5 (text/html)]
<div dir="ltr"><div class="gmail_quote">On Sun, Sep 2, 2012 at 5:14 AM, Nick Coghlan \
<span dir="ltr"><<a href="mailto:ncoghlan@gmail.com" \
target="_blank">ncoghlan@gmail.com</a>></span> wrote:<br><blockquote \
class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc \
solid;padding-left:1ex">
<div class="im">On Sun, Sep 2, 2012 at 8:02 AM, Antoine Pitrou <<a \
href="mailto:solipsis@pitrou.net">solipsis@pitrou.net</a>> wrote:<br> > On Sun, \
2 Sep 2012 00:55:39 +0300<br> > Yuval Greenfield <<a \
href="mailto:ubershmekel@gmail.com">ubershmekel@gmail.com</a>><br> > wrote:<br>
>> On Sat, Sep 1, 2012 at 8:06 PM, Guido van Rossum <<a \
href="mailto:guido@python.org">guido@python.org</a>> wrote:<br> >><br>
>> > It's less Pythonic, because every sequence-like type (not just \
list)<br> >> > would have to reimplement it.<br>
>> ><br>
>> > Similar things get proposed for iterators (e.g. it1 + it2, it[:n],<br>
>> > it[n:]) regularly and they are (and should be) rejected for the \
same<br> >> > reason.<br>
>> ><br>
>> ><br>
>> Python causes some confusion because some things are methods and others<br>
>> builtins. Is there a PEP or rationale that defines what goes where?<br>
><br>
> When something only applies to a single type or a couple of types, it is<br>
> a method. When it is generic enough, it is a builtin.<br>
> Of course there are grey areas but that's the basic idea.<br>
<br>
</div>Yes, it comes down to the fact that we are *very* reluctant to impose<br>
required base classes (I believe the only ones currently enforced<br>
anywhere are object, BaseException and str - everything else should<br>
fall back to a protocol method, ABC or interface specific registration<br>
mechanism. Most interfaces that used to require actual integer objects<br>
are now using operator.index, or one of its C API equivalents).<br>
<br>
In Python, we also actively discourage "reopening" classes to add new<br>
methods (this is mostly a cultural thing, though - the language<br>
doesn't actually contain any mechanism to stop you by default,<br>
although it's possible to add such enforcement via metaclasses)<br>
<br>
Thus, protocols are born which define "has this behaviour", rather<br>
than "is one of these". That's why we have the len() builtin and<br>
associated __len__() protocol to say "taking the length of this object<br>
is a meaningful operation" rather than mandatory inheritance from a<br>
Container class that has a ".len()" method.<br>
<br>
They're most obviously beneficial when there are *multiple* protocols<br>
that can be used to implement a particular behaviour. For example,<br>
with iter(), the __iter__ protocol is only the first option tried. If<br>
that fails, then it will instead check for __getitem__ and if that<br>
exists, return a standard sequence iterator instead. Similarly,<br>
reversed() checks for __reversed__ first, and then checks for __len__<br>
and __getitem__, producing a reverse sequence iterator in the latter<br>
case.<br>
<br>
Similarly, next() was moved from a standard method to a builtin<br>
function in 3.x? Why? Mainly to add the "if not found, return this<br>
default value" behaviour. That kind of thing is much easier to add<br>
when the object is only handling a piece of the behaviour, with<br>
additional standard mechanisms around it (in this case, optionally<br>
returning a default value when StopIteration is thrown by the<br>
iterator).<br>
<br>
Generators are another good illustration of the principle: For iter()<br>
and next(), they follow the standard protocol and rely on the<br>
corresponding builtins. However, g.send() and g.throw() require deep<br>
integration with the interpreter's eval loop. There's currently no way<br>
to implement either of those behaviours as an ordinary type, thus<br>
they're exposed as ordinary methods, since they're genuinely generator<br>
specific.<br>
<br>
As to *why* this is a good thing: procedural APIs encourage low<br>
coupling. Yes, object oriented programming is a good way to scale an<br>
application architecture up to more complicated problems. The issue is<br>
with fetishising OOP to the point where you disallow the creation of<br>
procedural APIs that hide the OOP details. That approach sets a<br>
minimum floor to the complexity of your implementations, as even if<br>
you don't *need* the power of OOP, you're forced to deal with it<br>
because the language doesn't offer anything else, and that way lies<br>
Java. There's a reason Java is significantly more popular on large<br>
enterprise projects than it is in small teams - it takes a certain,<br>
rather high, level of complexity for the reasons behind any of that<br>
boilerplate to start to become clear :)<br>
<br>
Cheers,<br>
Nick.<br>
<span class="HOEnZb"><font \
color="#888888"><br></font></span></blockquote><div><br></div><div>Thanks, that's \
some interesting reasoning.</div><div><br></div><div>Maybe I'm old fashioned but \
I like running dir(x) to find out what an object can do, and the wall of double \
underscores is hard to read.</div>
<div><br></div><div>Perhaps we could add to the inspect module a \
"dirprotocols" function which returns a list of builtins that can be used \
on an object. I see that the builtins are listed in e.g. help([]) but on user defined \
classes it might be less obvious. Maybe we could just add a dictionary:</div>
<div><br></div><div>inspect.special_methods = {'__len__': len,</div><div> \
'__getitem__': 'x.__getitem__(y) <==> x[y]',</div><div> \
'__iter__': iter,</div><div> ... }</div><div><br></div>
<div>and then dirprotocols would be easy to \
implement.</div><div><br></div><div><br></div><div>Yuval</div><div> \
</div></div></div>
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
http://mail.python.org/mailman/listinfo/python-ideas
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic