List:       r-help
Subject:    Re: [R] aggregating data and missing values
From:       Gabor Grothendieck <ggrothendieck () gmail ! com>
Date:       2005-11-02 13:34:01
Message-ID: 971536df0511020534q372a90b5n9c5941a2b3444fdf () mail ! gmail ! com

On 11/2/05, Pascal A. Niklaus <Pascal.Niklaus@unibas.ch> wrote:
> Hi all,
>
> I would like to aggregate a large data file that is defined by a number of
> factors and associated values. The point is that not all factor level
> combinations are present in the data file  -- these "missing" values are in
> fact to be treated as zeroes.
>
> Is there a straightforward way to
> a) either expand the existing data set so that the missing factor combinations
> can be added, or
> b) an "aggregate" function that generates a row of data for all given factor
> combinations.
>
> Here is an example:
>
> a) "complete" data set:
>
> > example <-
> data.frame(f1=factor(rep(LETTERS[1:3],each=4)),f2=factor(letters[1:2]),d=1:12)
> > aggregate(cbind(d=example$d),by=list(f1=example$f1,f2=example$f2),sum)
>  f1 f2  d
> 1  A  a  4
> 2  B  a 12
> 3  C  a 20
> 4  A  b  6
> 5  B  b 14
> 6  C  b 22
>
> b) data set with "missing combinations":
>
> > example2 <- example[c(-10,-12),]
> > aggregate(cbind(d=example2$d),by=list(f1=example2$f1,f2=example2$f2),sum)
>  f1 f2  d
> 1  A  a  4
> 2  B  a 12
> 3  C  a 20
> 4  A  b  6
> 5  B  b 14
>
> Here, I would like to have the missing row with f1=C, f2=b, d=NA.

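Suppose the result of the last aggregate call above is stored in
example2.ag. Merging it with the full grid of factor combinations, keeping
all rows, adds the missing f1=C, f2=b row with d=NA:

merge(example2.ag, expand.grid(f1 = LETTERS[1:3], f2 = letters[1:2]),
      all = TRUE)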
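
For completeness, here is a minimal end-to-end sketch of the merge approach, rebuilding the example from the question. The names grid, full, full.zero and alt are only illustrative and not part of the original post. Taking the grid from levels() instead of hard-coded letters is an assumption that the factors still carry all their levels after subsetting, which they do here. The NA-to-zero step is only needed if the missing combinations should count as zeroes, as the question mentions at the start. The last line shows a different technique, xtabs, which gives zero-filled sums directly.

```r
# Rebuild the data from the question and drop rows 10 and 12 (both f1=C, f2=b)
example <- data.frame(f1 = factor(rep(LETTERS[1:3], each = 4)),
                      f2 = factor(letters[1:2]),
                      d  = 1:12)
example2 <- example[c(-10, -12), ]

# Aggregate as in the question; the C/b combination is absent from the result
example2.ag <- aggregate(cbind(d = example2$d),
                         by = list(f1 = example2$f1, f2 = example2$f2), sum)

# All factor level combinations (illustrative name: grid)
grid <- expand.grid(f1 = levels(example2$f1), f2 = levels(example2$f2))

# Outer join: combinations with no data get d = NA
full <- merge(example2.ag, grid, all = TRUE)

# Optional: treat the missing combinations as zeroes
full.zero <- full
full.zero$d[is.na(full.zero$d)] <- 0

# Alternative technique (not merge): xtabs sums d per cell, empty cells are 0;
# the summed column is called Freq in the resulting data frame
alt <- as.data.frame(xtabs(d ~ f1 + f2, data = example2))
```

The merge version keeps the distinction between "no data" (NA) and a true zero until you decide to overwrite it, while xtabs fills in zeroes straight away.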

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html