[prev in list] [next in list] [prev in thread] [next in thread] 

List:       python-ideas
Subject:    Re: [Python-ideas] sequence.apply(function)
From:       Yuval Greenfield <ubershmekel () gmail ! com>
Date:       2012-09-02 8:50:07
Message-ID: CANSw7KyaNwM9J0iyRASOSRWhejjUDxBB26MX1EUsb4FUZCvS+g () mail ! gmail ! com
[Download RAW message or body]

[Attachment #2 (multipart/alternative)]


On Sun, Sep 2, 2012 at 5:14 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:

> On Sun, Sep 2, 2012 at 8:02 AM, Antoine Pitrou <solipsis@pitrou.net>
> wrote:
> > On Sun, 2 Sep 2012 00:55:39 +0300
> > Yuval Greenfield <ubershmekel@gmail.com>
> > wrote:
> >> On Sat, Sep 1, 2012 at 8:06 PM, Guido van Rossum <guido@python.org>
> wrote:
> >>
> >> > It's less Pythonic, because every sequence-like type (not just list)
> >> > would have to reimplement it.
> >> >
> >> > Similar things get proposed for iterators (e.g. it1 + it2, it[:n],
> >> > it[n:]) regularly and they are (and should be) rejected for the same
> >> > reason.
> >> >
> >> >
> >> Python causes some confusion because some things are methods and others
> >> builtins. Is there a PEP or rationale that defines what goes where?
> >
> > When something only applies to a single type or a couple of types, it is
> > a method. When it is generic enough, it is a builtin.
> > Of course there are grey areas but that's the basic idea.
>
> Yes, it comes down to the fact that we are *very* reluctant to impose
> required base classes (I believe the only ones currently enforced
> anywhere are object, BaseException and str - everything else should
> fall back to a protocol method, ABC or interface specific registration
> mechanism. Most interfaces that used to require actual integer objects
> are now using operator.index, or one of its C API equivalents).
>
> In Python, we also actively discourage "reopening" classes to add new
> methods (this is mostly a cultural thing, though - the language
> doesn't actually contain any mechanism to stop you by default,
> although it's possible to add such enforcement via metaclasses)
>
> Thus, protocols are born which define "has this behaviour", rather
> than "is one of these". That's why we have the len() builtin and
> associated __len__() protocol to say "taking the length of this object
> is a meaningful operation" rather than mandatory inheritance from a
> Container class that has a ".len()" method.
>
> They're most obviously beneficial when there are *multiple* protocols
> that can be used to implement a particular behaviour. For example,
> with iter(), the __iter__ protocol is only the first option tried. If
> that fails, then it will instead check for __getitem__ and if that
> exists, return a standard sequence iterator instead. Similarly,
> reversed() checks for __reversed__ first, and then checks for __len__
> and __getitem__, producing a reverse sequence iterator in the latter
> case.
>
> Similarly, next() was moved from a standard method to a builtin
> function in 3.x? Why? Mainly to add the "if not found, return this
> default value" behaviour. That kind of thing is much easier to add
> when the object is only handling a piece of the behaviour, with
> additional standard mechanisms around it (in this case, optionally
> returning a default value when StopIteration is thrown by the
> iterator).
>
> Generators are another good illustration of the principle: For iter()
> and next(), they follow the standard protocol and rely on the
> corresponding builtins. However, g.send() and g.throw() require deep
> integration with the interpreter's eval loop. There's currently no way
> to implement either of those behaviours as an ordinary type, thus
> they're exposed as ordinary methods, since they're genuinely generator
> specific.
>
> As to *why* this is a good thing: procedural APIs encourage low
> coupling. Yes, object oriented programming is a good way to scale an
> application architecture up to more complicated problems. The issue is
> with fetishising OOP to the point where you disallow the creation of
> procedural APIs that hide the OOP details. That approach sets a
> minimum floor to the complexity of your implementations, as even if
> you don't *need* the power of OOP, you're forced to deal with it
> because the language doesn't offer anything else, and that way lies
> Java. There's a reason Java is significantly more popular on large
> enterprise projects than it is in small teams - it takes a certain,
> rather high, level of complexity for the reasons behind any of that
> boilerplate to start to become clear :)
>
> Cheers,
> Nick.
>
>
Thanks, that's some interesting reasoning.

Maybe I'm old fashioned but I like running dir(x) to find out what an
object can do, and the wall of double underscores is hard to read.

Perhaps we could add to the inspect module a "dirprotocols" function which
returns a list of builtins that can be used on an object. I see that the
builtins are listed in e.g. help([]) but on user defined classes it might
be less obvious. Maybe we could just add a dictionary:

inspect.special_methods = {'__len__': len,
 '__getitem__': 'x.__getitem__(y) <==> x[y]',
 '__iter__': iter,
 ... }

and then dirprotocols would be easy to implement.


Yuval

[Attachment #5 (text/html)]

<div dir="ltr"><div class="gmail_quote">On Sun, Sep 2, 2012 at 5:14 AM, Nick Coghlan \
<span dir="ltr">&lt;<a href="mailto:ncoghlan@gmail.com" \
target="_blank">ncoghlan@gmail.com</a>&gt;</span> wrote:<br><blockquote \
class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc \
solid;padding-left:1ex">

<div class="im">On Sun, Sep 2, 2012 at 8:02 AM, Antoine Pitrou &lt;<a \
href="mailto:solipsis@pitrou.net">solipsis@pitrou.net</a>&gt; wrote:<br> &gt; On Sun, \
2 Sep 2012 00:55:39 +0300<br> &gt; Yuval Greenfield &lt;<a \
href="mailto:ubershmekel@gmail.com">ubershmekel@gmail.com</a>&gt;<br> &gt; wrote:<br>
&gt;&gt; On Sat, Sep 1, 2012 at 8:06 PM, Guido van Rossum &lt;<a \
href="mailto:guido@python.org">guido@python.org</a>&gt; wrote:<br> &gt;&gt;<br>
&gt;&gt; &gt; It&#39;s less Pythonic, because every sequence-like type (not just \
list)<br> &gt;&gt; &gt; would have to reimplement it.<br>
&gt;&gt; &gt;<br>
&gt;&gt; &gt; Similar things get proposed for iterators (e.g. it1 + it2, it[:n],<br>
&gt;&gt; &gt; it[n:]) regularly and they are (and should be) rejected for the \
same<br> &gt;&gt; &gt; reason.<br>
&gt;&gt; &gt;<br>
&gt;&gt; &gt;<br>
&gt;&gt; Python causes some confusion because some things are methods and others<br>
&gt;&gt; builtins. Is there a PEP or rationale that defines what goes where?<br>
&gt;<br>
&gt; When something only applies to a single type or a couple of types, it is<br>
&gt; a method. When it is generic enough, it is a builtin.<br>
&gt; Of course there are grey areas but that&#39;s the basic idea.<br>
<br>
</div>Yes, it comes down to the fact that we are *very* reluctant to impose<br>
required base classes (I believe the only ones currently enforced<br>
anywhere are object, BaseException and str - everything else should<br>
fall back to a protocol method, ABC or interface specific registration<br>
mechanism. Most interfaces that used to require actual integer objects<br>
are now using operator.index, or one of its C API equivalents).<br>
<br>
In Python, we also actively discourage &quot;reopening&quot; classes to add new<br>
methods (this is mostly a cultural thing, though - the language<br>
doesn&#39;t actually contain any mechanism to stop you by default,<br>
although it&#39;s possible to add such enforcement via metaclasses)<br>
<br>
Thus, protocols are born which define &quot;has this behaviour&quot;, rather<br>
than &quot;is one of these&quot;. That&#39;s why we have the len() builtin and<br>
associated __len__() protocol to say &quot;taking the length of this object<br>
is a meaningful operation&quot; rather than mandatory inheritance from a<br>
Container class that has a &quot;.len()&quot; method.<br>
<br>
They&#39;re most obviously beneficial when there are *multiple* protocols<br>
that can be used to implement a particular behaviour. For example,<br>
with iter(), the __iter__ protocol is only the first option tried. If<br>
that fails, then it will instead check for __getitem__ and if that<br>
exists, return a standard sequence iterator instead. Similarly,<br>
reversed() checks for __reversed__ first, and then checks for __len__<br>
and __getitem__, producing a reverse sequence iterator in the latter<br>
case.<br>
<br>
Similarly, next() was moved from a standard method to a builtin<br>
function in 3.x? Why? Mainly to add the &quot;if not found, return this<br>
default value&quot; behaviour. That kind of thing is much easier to add<br>
when the object is only handling a piece of the behaviour, with<br>
additional standard mechanisms around it (in this case, optionally<br>
returning a default value when StopIteration is thrown by the<br>
iterator).<br>
<br>
Generators are another good illustration of the principle: For iter()<br>
and next(), they follow the standard protocol and rely on the<br>
corresponding builtins. However, g.send() and g.throw() require deep<br>
integration with the interpreter&#39;s eval loop. There&#39;s currently no way<br>
to implement either of those behaviours as an ordinary type, thus<br>
they&#39;re exposed as ordinary methods, since they&#39;re genuinely generator<br>
specific.<br>
<br>
As to *why* this is a good thing: procedural APIs encourage low<br>
coupling. Yes, object oriented programming is a good way to scale an<br>
application architecture up to more complicated problems. The issue is<br>
with fetishising OOP to the point where you disallow the creation of<br>
procedural APIs that hide the OOP details. That approach sets a<br>
minimum floor to the complexity of your implementations, as even if<br>
you don&#39;t *need* the power of OOP, you&#39;re forced to deal with it<br>
because the language doesn&#39;t offer anything else, and that way lies<br>
Java. There&#39;s a reason Java is significantly more popular on large<br>
enterprise projects than it is in small teams - it takes a certain,<br>
rather high, level of complexity for the reasons behind any of that<br>
boilerplate to start to become clear :)<br>
<br>
Cheers,<br>
Nick.<br>
<span class="HOEnZb"><font \
color="#888888"><br></font></span></blockquote><div><br></div><div>Thanks, that&#39;s \
some interesting reasoning.</div><div><br></div><div>Maybe I&#39;m old fashioned but \
I like running dir(x) to find out what an object can do, and the wall of double \
underscores is hard to read.</div>

<div><br></div><div>Perhaps we could add to the inspect module a \
&quot;dirprotocols&quot; function which returns a list of builtins that can be used \
on an object. I see that the builtins are listed in e.g. help([]) but on user defined \
classes it might be less obvious. Maybe we could just add a dictionary:</div>

<div><br></div><div>inspect.special_methods = {&#39;__len__&#39;: len,</div><div> \
&#39;__getitem__&#39;: &#39;x.__getitem__(y) &lt;==&gt; x[y]&#39;,</div><div> \
&#39;__iter__&#39;: iter,</div><div> ... }</div><div><br></div>

<div>and then dirprotocols would be easy to \
implement.</div><div><br></div><div><br></div><div>Yuval</div><div> \
</div></div></div>



_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
http://mail.python.org/mailman/listinfo/python-ideas


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic