[prev in list] [next in list] [prev in thread] [next in thread]
List: bioconductor
Subject: Re: [BioC] tweeDEseq glm reduced model specification?
From: "Gonzalez Ruiz, Juan Ramon" <jrgonzalez () creal ! cat>
Date: 2013-08-29 8:51:02
Message-ID: 1304158010.172921.1377766262187.JavaMail.zimbra () creal ! cat
[Download RAW message or body]
Dear Ryan,
Thank you very much for posting this interesting question.
As you are pointing out, current version of function tweeDEglm only supports the \
comparison
count ~ 1
vs
count ~ var1
where var1 encodes for a grouping variable (e.g., cases vs control)
Few changes have to be performed in order to address the comparison you are proposing \
count ~ var1*var2
count ~ var1 + var2
We plan to add this new feature by adding two new arguments in the function tweeDEglm \
(modelNull and modelTest).
HOWEVER, you can do such comparison using using glmPT function
(in fact tweeDEglm calls this function gene by gene ans uses lapply to improve time \
computation - the idea of encapsulating this repeated process in a function is also \
due to the fact that some requirements have to be full-filled in order to be able to \
compare both models. Please, see my comment at the end of this message)
this is an example for analyzing 1 gene (count) for the model you are interested in:
library(tweeDEseqCountData)
data(pickrell1)
counts <- exprs(pickrell1.eset)
count <- counts[1,]
# simulate two variables for illustrating purposes
n <- ncol(counts)
var1 <- sample(c("A", "B"), n, rep=TRUE)
var2 <- as.factor(sample(c("A", "B"), n, rep=TRUE))
df <- as.factor(data.frame(var1=var1, var2=var2))
mod0 <- glmPT(count ~ var1 + var2, df)
mod1 <- glmPT(count ~ var1 * var2, df)
the p-value corresponding to likelihood ratio test can be obtained by executing
anova(mod1,mod0)
Therefore, you can use apply or lapply (multicore) for repeating this process for all \
genes
The idea of encapsulating this process in tweeDEglm is that by using 'formulas' you \
can control missing values and guaranty that both models have the same information \
(no missing data in counts or var1 var2). If not, the likelihood ratio test is not \
correct. In addition we also check whether the models you are comparing are nested. \
In other words, you cannot compare
count ~ var1 + var2 +var3
vs
count ~ var1 + var2 + var4
We'll work on that during next days and upload a new version in the devel 'section' \
that you will be able to use (if not I can send you the package including this new \
feature)
Hope it helps
Best,
Juan
----- Mensaje original -----
De: "Ryan" <rct@thompsonclan.org>
Para: jrgonzalez@creal.cat
CC: "bioconductor" <Bioconductor@r-project.org>
Enviados: Miércoles, 28 de Agosto 2013 1:20:21
Asunto: tweeDEseq glm reduced model specification?
Dear Dr. González,
I am interested in testing tweeDEseq on my own RNA-seq data. I have noticed that \
tweeDEseq supports glm fitting for more complex designs than simple pairwise models. \
However, it seems that the tweeDEglm function only supports a null model of "count ~ \
1", which makes it impossible to do do things like test for an interaction term by \
comparing "count ~ var1 * var2" against "count ~ var1 + var2". Looking at the code \
for tweeDEseq:::anova.glmPT, it looks like the underlying mechanisms have support fur \
such comparisons. Are there any future plans to support null hypotheses other than \
"count ~ 1" in the tweeDEglm function?
Thank you,
-Ryan Thompson
[[alternative HTML version deleted]]
_______________________________________________
Bioconductor mailing list
Bioconductor@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic