
List:       r-help
Subject:    [R] aggregate.data.frame - prevent conversion to factors?
From:       Thomas Pujol <thomas.pujol () yahoo ! com>
Date:       2007-07-31 20:18:32
Message-ID: 574927.90748.qm () web59315 ! mail ! re1 ! yahoo ! com

I have two questions regarding the "aggregate.data.frame" method of the "aggregate"
function.

My situation:

a. My "x" variable is a data.frame ("mydf") with two columns, both columns of
type/format "numeric".

b. My "by" variable is a data.frame ("mybys") with two columns, both columns of
type/format "character".

c. Some of the values contained in "mybys" are originally "NA".

Prior to submitting the by variables to the aggregate function, I convert the NA
values to the text-string "is_na". (I do this because I want to see the statistics
for the rows whose "by" value is NA, and want this information included in the
results of the aggregate function.)

My questions:

1. Is there a "better" way (other than converting NAs to some text-string) to see
the "statistics" ("mean", etc.) of the variables where the by value is "NA"? (i.e. to
have them included within the results of the aggregate function)

2. When I run the aggregate function, the two columns that contain the "by" variables
are always formatted as "factors".  Is there a way to prevent this, and to instead
have them retain the format in the original "mybys" data.frame (i.e. to have them come
back formatted as "character")?  Or do I just need to re-format them once I have my
results?
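
Regarding question 1, one possible alternative to the text-string recoding is sketched below, under the assumption that base R's addNA() is acceptable here: it makes NA an explicit factor level, so those rows should no longer be dropped from the grouping. The toy data (`vals`, `grp`) are hypothetical and are not the data from this post.

```r
# Hypothetical toy data (not the poster's): one numeric column and
# one grouping vector that contains NA values.
vals <- data.frame(v = c(1, 2, 3, 4, 5, 6))
grp  <- c("red", "blue", NA, "red", NA, "blue")

# addNA() adds NA as an explicit factor level, so rows whose group is NA
# are treated as a group of their own instead of being omitted.
by_na <- list(grp = addNA(factor(grp)))

agg_na <- aggregate(x = vals, by = by_na, FUN = mean)
agg_na
```

Note that the grouping column comes back as a factor with an NA level in this sketch, so is.na() on that column does not flag the NA group; as.character() on the column can be used first if that is needed.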



mydf=data.frame(testvar1=c(1,3,5,7,8,3,5,NA,4,5,7,9),
                testvar2=c(11,33,55,77,88,33,55,NA,44,55,77,99) )
str(mydf)
#

myby1=c('red','blue',1,2,NA,'big',1,2,'red',1,NA,12) 
myby2=c('wet','dry',99,95,NA,'damp',95,99,'red',99,NA,NA) 

# replace NA with the text-string "is_na" so these rows form their own group
myby1.new = ifelse(is.na(myby1),"is_na",myby1)
myby2.new = ifelse(is.na(myby2),"is_na",myby2)
str(myby1.new)
str(myby2.new)

mybys=data.frame(mbn1=myby1.new,mbn2=myby2.new , stringsAsFactors = FALSE)
str(mybys)


#
myagg1 = aggregate(x=mydf, by=mybys, FUN='mean')
str(myagg1)


myagg2 = myagg1
# convert each "by" column of the result back to character, column by column
myagg2[1:ncol(mybys)] = lapply(myagg1[1:ncol(mybys)], as.character)
str(myagg2)

myagg1
myagg2
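
For the re-formatting step, here is a minimal sketch that converts only the factor columns of a result back to character, without relying on column positions. The data frame `agg` is a hypothetical stand-in for an aggregate() result, built by hand so the example is self-contained.

```r
# Hypothetical stand-in for an aggregate() result: two factor "by" columns
# followed by a numeric column.
agg <- data.frame(mbn1 = factor(c("1", "blue", "is_na")),
                  mbn2 = factor(c("99", "dry", "is_na")),
                  testvar1 = c(5, 3, 7.5))

# Find the factor columns and convert only those to character;
# numeric columns are left untouched.
is_fac <- vapply(agg, is.factor, logical(1))
agg[is_fac] <- lapply(agg[is_fac], as.character)
str(agg)
```

Selecting columns with is.factor() rather than 1:ncol(mybys) is only a design choice for this sketch; it assumes that none of the aggregated value columns are factors that should stay as factors.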

       
---------------------------------

	[[alternative HTML version deleted]]

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic