
List:       r-devel
Subject:    Re: [Rd] median and data frames
From:       Tim Hesterberg <timhesterberg () gmail ! com>
Date:       2011-04-30 15:19:31
Message-ID: yajfei4j6a2k.fsf () gmail ! com

I also favor deprecating mean.data.frame.

One possible exception would be for a single-column data frame.
But even here I'd say no, lest people expect the same behavior for
median, var, ...
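
A minimal sketch of the sapply() route, which behaves the same way for
mean, median, var, and so on. The data frame is df3.2 from Pat's original
post; the names col_medians and col_vars are only illustrative.

```r
# df3.2 as defined earlier in the thread.
df3.2 <- data.frame(1:3, 7:9)

# Apply the summary function to each column in turn.
col_medians <- sapply(df3.2, median)
col_vars    <- sapply(df3.2, var)

print(col_medians)
print(col_vars)
```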

Pat's suggestion of using stop() would work nicely for mean.
(but omit paste - stop handles that).
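
A sketch of that variant, assuming the method simply signals an error.
The function name median_data_frame_sketch is hypothetical and this is
not code from any R release. stop() pastes its arguments together with
no separator, so the paste() call is unnecessary.

```r
# Hypothetical stand-in for a data frame method that refuses to guess.
median_data_frame_sketch <- function(x, ...) {
    # stop() concatenates its arguments without a separator,
    # so no explicit paste(..., sep = "") is needed.
    stop("you probably mean to use the command: sapply(",
         deparse(substitute(x)), ", median)")
}

# Capture the error message instead of aborting, to inspect it.
df3.2 <- data.frame(1:3, 7:9)
msg <- tryCatch(median_data_frame_sketch(df3.2),
                error = function(e) conditionMessage(e))
print(msg)
```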

Tim Hesterberg

>If Martin's proposal is accepted, does
>that mean that the median method for
>data frames would be something like:
>
>function (x, ...)
>{
>         stop(paste("you probably mean to use the command: sapply(",
>                 deparse(substitute(x)), ", median)", sep=""))
>}
>
>Pat
>
>
>On 29/04/2011 15:25, Martin Maechler wrote:
>>>>>>> Paul Johnson<pauljohn32@gmail.com>
>>>>>>>      on Thu, 28 Apr 2011 00:20:27 -0500 writes:
>>
>>      >  On Wed, Apr 27, 2011 at 12:44 PM, Patrick Burns
>>      >  <pburns@pburns.seanet.com>  wrote:
>>      >>  Here are some data frames:
>>      >>
>>      >>  df3.2<- data.frame(1:3, 7:9)
>>      >>  df4.2<- data.frame(1:4, 7:10)
>>      >>  df3.3<- data.frame(1:3, 7:9, 10:12)
>>      >>  df4.3<- data.frame(1:4, 7:10, 10:13)
>>      >>  df3.4<- data.frame(1:3, 7:9, 10:12, 15:17)
>>      >>  df4.4<- data.frame(1:4, 7:10, 10:13, 15:18)
>>      >>
>>      >>  Now here are some commands and their answers:
>>
>>      >>>  median(df4.4)
>>      >>  [1]  8.5 11.5
>>      >>>  median(df3.2[c(1,2,3),])
>>      >>  [1] 2 8
>>      >>>  median(df3.2[c(1,3,2),])
>>      >>  [1]  2 NA
>>      >>  Warning message:
>>      >>  In mean.default(X[[2L]], ...) :
>>      >>    argument is not numeric or logical: returning NA
>>      >>
>>      >>
>>      >>
>>      >>  The sessionInfo is below, but it looks
>>      >>  to me like the present behavior started
>>      >>  in 2.10.0.
>>      >>
>>      >>  Sometimes it gets the right answer.  I'd
>>      >>  be grateful to hear how it does that -- I
>>      >>  can't figure it out.
>>      >>
>>
>>      >  Hello, Pat.
>>
>>      >  Nice poetry there!  I think I have an actual answer, as opposed to the
>>      >  usual crap I spew.
>>
>>      >  I would agree if you said median.data.frame ought to be written to
>>      >  work columnwise, similar to mean.data.frame.
>>
>>      >  apply and sapply  always give the correct answer
>>
>>      >>  apply(df3.3, 2, median)
>>      >  X1.3   X7.9 X10.12
>>      >  2      8     11
>>
>>      [...........]
>>
>> exactly
>>
>>      >  mean.data.frame is now implemented as
>>
>>      >  mean.data.frame<- function(x, ...) sapply(x, mean, ...)
>>
>> exactly.
>>
>> My personal opinion is that  mean.data.frame() should never have
>> been written.
>> People should know, or learn, to use apply functions for such a
>> task.
>>
>> The unfortunate fact that mean.data.frame() exists makes people
>> think that median.data.frame() should too,
>> and then
>>
>>    var.data.frame()
>>     sd.data.frame()
>>    mad.data.frame()
>>    min.data.frame()
>>    max.data.frame()
>>    ...
>>    ...
>>
>> all just in order *not* to have to know  sapply()
>> ????
>>
>> No, rather not.
>>
>> My vote is for deprecating  mean.data.frame().
>>
>> Martin
>>
>
>--
>Patrick Burns
>pburns@pburns.seanet.com
>twitter: @portfolioprobe
>http://www.portfolioprobe.com/blog
>http://www.burns-stat.com
>(home of 'Some hints for the R beginner'
>and 'The R Inferno')

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel