List: r-help
Subject: Re: [R] aggregating data and missing values
From: Gabor Grothendieck <ggrothendieck () gmail ! com>
Date: 2005-11-02 13:34:01
Message-ID: 971536df0511020534q372a90b5n9c5941a2b3444fdf () mail ! gmail ! com
On 11/2/05, Pascal A. Niklaus <Pascal.Niklaus@unibas.ch> wrote:
> Hi all,
>
> I would like to aggregate a large data file that is defined by a number of
> factors and associated values. The point is that not all factor level
> combinations are present in the data file -- these "missing" values are in
> fact to be treated as zeroes.
>
> Is there a straightforward way to
> a) either expand the existing data set so that the missing factor combinations
> can be added, or
> b) an "aggregate" function that generates a row of data for all given factor
> combinations.
>
> Here is an example:
>
> a) "complete" data set:
>
> > example <-
> data.frame(f1=factor(rep(LETTERS[1:3],each=4)),f2=factor(letters[1:2]),d=1:12)
> > aggregate(cbind(d=example$d),by=list(f1=example$f1,f2=example$f2),sum)
> f1 f2 d
> 1 A a 4
> 2 B a 12
> 3 C a 20
> 4 A b 6
> 5 B b 14
> 6 C b 22
>
> b) data set with "missing combinations":
>
> > example2 <- example[c(-10,-12),]
> > aggregate(cbind(d=example2$d),by=list(f1=example2$f1,f2=example2$f2),sum)
> f1 f2 d
> 1 A a 4
> 2 B a 12
> 3 C a 20
> 4 A b 6
> 5 B b 14
>
> Here, I would like to have the missing row with f1=C, f2=b, d=NA.
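Suppose the result of the aggregate call just shown is stored in
example2.ag. Merging it with the full grid of factor combinations then
adds the missing row, with d = NA:

merge(example2.ag, expand.grid(f1 = LETTERS[1:3], f2 = letters[1:2]),
      all = TRUE)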
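
For completeness, here is a minimal end-to-end sketch of that approach, reusing the example data from the question. The names `grid` and `full` are illustrative, and the last step (turning NA into 0) is an assumption based on the poster's remark that missing combinations should be treated as zeroes; it is not part of the reply above.

```r
# Example data from the question, with the two rows for f1 = C, f2 = b removed
example <- data.frame(f1 = factor(rep(LETTERS[1:3], each = 4)),
                      f2 = factor(letters[1:2]),
                      d  = 1:12)
example2 <- example[c(-10, -12), ]

# Aggregate as in the question; the combination C/b is absent from the result
example2.ag <- aggregate(cbind(d = example2$d),
                         by = list(f1 = example2$f1, f2 = example2$f2),
                         sum)

# Build every f1 x f2 combination and merge it with the aggregate.
# all = TRUE keeps combinations found in only one of the two data frames,
# so the missing C/b row appears with d = NA.
grid <- expand.grid(f1 = LETTERS[1:3], f2 = letters[1:2])
full <- merge(example2.ag, grid, all = TRUE)

# Assumed follow-up step: treat missing combinations as zero
full$d[is.na(full$d)] <- 0

print(full)
```

If NA is the wanted value for the missing row, simply skip the replacement step.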
______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html