'Re: [BioC] tweeDEseq glm reduced model specification?'

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       bioconductor
Subject:    Re: [BioC] tweeDEseq glm reduced model specification?
From:       "Gonzalez Ruiz, Juan Ramon" <jrgonzalez () creal ! cat>
Date:       2013-08-29 8:51:02
Message-ID: 1304158010.172921.1377766262187.JavaMail.zimbra () creal ! cat
[Download RAW message or body]

Dear Ryan, 

Thank you very much for posting this interesting question. 

As you are pointing out, current version of function tweeDEglm only supports the \
comparison 

count ~ 1 

vs 

count ~ var1 

where var1 encodes for a grouping variable (e.g., cases vs control) 

Few changes have to be performed in order to address the comparison you are proposing \

count ~ var1*var2 

count ~ var1 + var2 

We plan to add this new feature by adding two new arguments in the function tweeDEglm \
(modelNull and modelTest). 

HOWEVER, you can do such comparison using using glmPT function 

(in fact tweeDEglm calls this function gene by gene ans uses lapply to improve time \
computation - the idea of encapsulating this repeated process in a function is also \
due to the fact that some requirements have to be full-filled in order to be able to \
compare both models. Please, see my comment at the end of this message) 

this is an example for analyzing 1 gene (count) for the model you are interested in: 

library(tweeDEseqCountData) 
data(pickrell1) 
counts <- exprs(pickrell1.eset) 
count <- counts[1,] 

# simulate two variables for illustrating purposes 
n <- ncol(counts) 
var1 <- sample(c("A", "B"), n, rep=TRUE) 
var2 <- as.factor(sample(c("A", "B"), n, rep=TRUE)) 
df <- as.factor(data.frame(var1=var1, var2=var2)) 

mod0 <- glmPT(count ~ var1 + var2, df) 
mod1 <- glmPT(count ~ var1 * var2, df) 

the p-value corresponding to likelihood ratio test can be obtained by executing 

anova(mod1,mod0) 

Therefore, you can use apply or lapply (multicore) for repeating this process for all \
genes 

The idea of encapsulating this process in tweeDEglm is that by using 'formulas' you \
can control missing values and guaranty that both models have the same information \
(no missing data in counts or var1 var2). If not, the likelihood ratio test is not \
correct. In addition we also check whether the models you are comparing are nested. \
In other words, you cannot compare 

count ~ var1 + var2 +var3 
vs 
count ~ var1 + var2 + var4 

We'll work on that during next days and upload a new version in the devel 'section' \
that you will be able to use (if not I can send you the package including this new \
feature) 

Hope it helps 

Best, 
Juan 

----- Mensaje original -----

De: "Ryan" <rct@thompsonclan.org> 
Para: jrgonzalez@creal.cat 
CC: "bioconductor" <Bioconductor@r-project.org> 
Enviados: Miércoles, 28 de Agosto 2013 1:20:21 
Asunto: tweeDEseq glm reduced model specification? 

Dear Dr. González, 

I am interested in testing tweeDEseq on my own RNA-seq data. I have noticed that \
tweeDEseq supports glm fitting for more complex designs than simple pairwise models. \
However, it seems that the tweeDEglm function only supports a null model of "count ~ \
1", which makes it impossible to do do things like test for an interaction term by \
comparing "count ~ var1 * var2" against "count ~ var1 + var2". Looking at the code \
for tweeDEseq:::anova.glmPT, it looks like the underlying mechanisms have support fur \
such comparisons. Are there any future plans to support null hypotheses other than \
"count ~ 1" in the tweeDEglm function? 

Thank you, 

-Ryan Thompson 

	[[alternative HTML version deleted]]

_______________________________________________
Bioconductor mailing list
Bioconductor@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

[prev in list] [next in list] [prev in thread] [next in thread]