'Re: [BioC] EdgeR: getting CPM values after batch effect correction'

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       bioconductor
Subject:    Re: [BioC] EdgeR: getting CPM values after batch effect correction
From:       "Ryan C. Thompson" <rct () thompsonclan ! org>
Date:       2013-08-29 7:00:43
Message-ID: 521EF19B.5030701 () thompsonclan ! org
[Download RAW message or body]

Hi Gilgi,

All you have to do is remove "batch" from the formula in your call to 
"model.matrix", and using the resulting matrix in your call to 
removeBatchEffect. Note, however, that for differential expression 
testing, you want to use the design matrix that you're already using and 
avoid removeBatchEffect.

For combining batch effect removal with the variance stabilizing 
transformation, I believe you would first apply the VST to get 
normalized, variance-stabilized logCPM, and then use removeBatchEffect 
on the result.

-Ryan

P.S. Please use "Reply All" when you reply to posts in order to ensure 
that the whole mailing list sees your reply.

On 08/28/2013 11:08 PM, Gilgi Friedlander wrote:
> Hi Ryan,
> 
> Thanks a lot for your reply.
> 
> The reason I used estimate dispersions is that I wanted to do also differential \
> expression analysis. 
> Regarding the removeBatchEffect - I am not sure I understand how to define the \
> design and then how to define the batch. Do you happen to have an example how to \
> use it? 
> Thanks for the variance-stabilizing transformation tip. Would it be correct to pass \
> it as the matrix given to removeBatchEffect? 
> Thanks a lot,
> Gilgi
> 
> -----Original Message-----
> From: Ryan C. Thompson [mailto:rct@thompsonclan.org]
> Sent: Wednesday, August 28, 2013 11:02 AM
> To: Gilgi Friedlander
> Cc: bioconductor
> Subject: Re: [BioC] EdgeR: getting CPM values after batch effect correction
> 
> Hi Gilgi,
> 
> If you're just using edgeR for the CPM calculation, I believe there's no need to \
> estimate dispersions; they won't be used in calculating the CPM values. You may \
> wish to use normalized.lib.sizes=TRUE in the call to cpm to get CPM values that \
> take into account the normalization factors computed by calcNormFactors (otherwise \
> there's no point in doing that either). 
> Second, according to the help text for removeBatchEffect, you're not supposed to \
> include the batch effect in the design matrix. The design matrix should include all \
> the experimental variables, while the batch variable should indicate the technical \
> batching. 
> Finally, instead of doing a simple logCPM transform, you might also try the \
> variance-stabilizing transformation provided by the DESeq package, which is \
> intended for clustering and machine learning types of analyses. 
> -Ryan
> 
> On 08/27/2013 10:52 PM, Gilgi Friedlander wrote:
> > Hi Ryan,
> > 
> > Thanks a lot for the reply.
> > 
> > I followed EdgeR user's manual, and defined a model:
> > y1<-DGEList(counts=countdata1,group=batch)
> > y1<- calcNormFactors( y1 )
> > design1 <- model.matrix(~batch+Treat1)
> > 
> > batch has values 1 or 2, according to the batch of the experiment that was done.
> > 
> > Treat has 10 different samples.
> > 
> > In order to define a contrast I did the following:
> > y1 <- estimateGLMCommonDisp(y1, design1, verbose=TRUE)#Now we are
> > ready to construct an edgeR specific
> > y1 <- estimateGLMTrendedDisp(y1, design1)
> > y1 <- estimateGLMTagwiseDisp(y1, design1) lrt <-
> > glmLRT(fit,contrast=c(0,0,1,-1,0,0,0,0,0,0,0,0,0,0))
> > 
> > I want now to get also the log counts after removal of the batch effect (for the \
> > purpose of clustering of the genes). 
> > Is it correct to obtain the batch removed log counts in the following way:
> > 
> > logCPM <- cpm(y1, log=TRUE, prior.count=3)
> > logCPM_batchRemoved<-removeBatchEffect(logCPM,batch=batch,design=desig
> > n1)
> > 
> > Many thanks,
> > Gilgi
> > 
> > -----Original Message-----
> > From: Ryan [mailto:rct@thompsonclan.org]
> > Sent: Wednesday, August 28, 2013 2:46 AM
> > To: Gilgi Friedlander
> > Cc: bioconductor@r-project.org
> > Subject: Re: [BioC] EdgeR: getting CPM values after batch effect
> > correction
> > 
> > You would have to define batch correction. Are you talking about fitting a model \
> > of the form "~ experimetalVar + batchEffect" and then subtracting out the batch \
> > effect coefficient? 
> > On Sun Aug 25 11:11:55 2013, Gilgi Friedlander wrote:
> > > Dear list,
> > > 
> > > In edgeR, it possible to get CPM values after batch effect correction (and \
> > > after TMM normalization)? 
> > > Thanks a lot!
> > > 
> > > 	[[alternative HTML version deleted]]
> > > 
> > > _______________________________________________
> > > Bioconductor mailing list
> > > Bioconductor@r-project.org
> > > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > > Search the archives:
> > > http://news.gmane.org/gmane.science.biology.informatics.conductor

_______________________________________________
Bioconductor mailing list
Bioconductor@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: \
http://news.gmane.org/gmane.science.biology.informatics.conductor

[prev in list] [next in list] [prev in thread] [next in thread]