[prev in list] [next in list] [prev in thread] [next in thread]
List: bioconductor
Subject: Re: [BioC] edgeR - paired samples with multifactorial design - errors
From: Preethy Venkat Ram <preethyvenkataram () gmail ! com>
Date: 2014-04-26 15:25:20
Message-ID: CAJx0VXWghqJzL6ujT1GDbfq9Pp7CzB04-2WTj6rEY_F37w3khA () mail ! gmail ! com
[Download RAW message or body]
Hi Jim and All,
Thanks really a lot for bringing that to my attention.
It didn't work this way as well as I had different number of samples for
each Phenotype unlike the edgeR example which had 3 patients for each
"Disease" category.
But, I took off the empty columns from the design matrix and it worked. Is
the analysis done correctly this way ?
I have one more doubt - Would it be possible to account for batch effect
too with this type of design matrix ?
Like this: design=model.matrix(~Phenotype+Phenotype:pair+
Phenotype:Treat+Phenotype:batch)
Thanks,
Preethy
On Fri, Apr 25, 2014 at 8:00 PM, James W. MacDonald <jmacdon@uw.edu> wrote:
> Hi Preethy,
>
> You need to re-read that section of edgeR, and in particular look at the
> patient column of the revised targets frame.
>
> Best,
>
> Jim
>
>
>
> On 4/25/2014 12:46 PM, Preethy Venkat Ram wrote:
>
>> Hi Jim,
>>
>> Thanks for the reply. I've tried that - with a slighlty modified code. I
>> am sorry. But, I'm getting error again.
>>
>> Here:
>>
>> pair=factor(rep(c(1:45), each=2)).Treat=factor( rep(c("Before",
>> "After"),45), levels=c("Before", "After"))
>> Phenotype=factor(rep(c(1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 1, 2, 1, 2, 2, 2,
>> 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 1, 2,
>> 2, 1, 2, 2), each=2),levels=c("1","2"))
>> design=model.matrix(~Phenotype+Phenotype:pair+Phenotype:Treat)
>> colnames(design)
>> counts.DGEList<-DGEList(counts, group=Treat)
>> y<-calcNormFactors(counts.DGEList)
>> y<-estimateCommonDisp(y, design)
>> y<-estimateGLMTrendedDisp(y, design)
>>
>>
>> The error message here:
>>
>> > y<-estimateGLMTrendedDisp(y, design)
>> Error in return(NA, ntags) : multi-argument returns are not permitted
>> In addition: Warning message:
>> In estimateGLMTrendedDisp.default(y = y$counts, design = design, :
>> No residual df: cannot estimate dispersion
>>
>>
>> One reason I can see is that the number of columns in my design matrix
>> is "92" as they are more replicates/patients than in the 3.5 example. And
>> there are only "90" columns in my "count" matrix.
>>
>> But, do you have any idea how can I solve this ?
>>
>> Thanks,
>> Preethy
>>
>>
>>
>>
>>
>> On Fri, Apr 25, 2014 at 6:45 PM, James W. MacDonald <jmacdon@uw.edu<mailto:
>> jmacdon@uw.edu>> wrote:
>>
>> Hi Preethy,
>>
>> This experiment is very similar to the example in part 3.5 of the
>> edgeR User's guide, starting on page 31.
>>
>> Best,
>>
>> Jim
>>
>>
>>
>> On 4/25/2014 11:29 AM, Preethy Venkat Ram wrote:
>>
>> Hi Devon,
>>
>> Thanks for the replies both on biostars and here. Sorry for
>> crossposting. Both
>> mailing lists and discussion have been very helpful to me.
>>
>> But, I rarely see replies from BioC package maintainers at
>> biostars.
>>
>> All the samples are paired and I have them all correct - I
>> mean none of
>> them are empty
>> I tried different designs. But ending up with the same error
>> message.
>> What I want to do: Getting DE genes between
>> (Phenotype1.Before-Phenotype1.After) &
>> (Phenotype2.Before-Phenotype2.After)
>>
>> pdata here:
>>
>> pair=rep(c(1:45), each=2)
>> Treat=rep(c("Before", "After"),45)
>> Phenotype=rep(c(1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 1, 2, 1, 2, 2,
>> 2, 2, 2, 2,
>> 2, 1, 1, 1, 1, 1, 1, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2,
>> 1, 2, 2, 1, 2,
>> 2), each=2)
>> pdata<-data.frame(pair,Treat,Phenotype)
>>
>>
>> Preethy
>>
>>
>>
>> On Fri, Apr 25, 2014 at 4:53 PM, Devon Ryan <dpryan@dpryan.com
>> <mailto:dpryan@dpryan.com>> wrote:
>>
>> Hi Preethy,
>>
>> You likely want:
>>
>> design=model.matrix(~pair+Treat:Phenotype, data=pdata)
>>
>> If that still yields the error, then you'll need to share
>> "pdata" or
>> "design". Also, please don't crosspost on both this list
>> and biostars (
>> https://www.biostars.org/p/98907/), it duplicates the
>> community effort.
>>
>> Devon
>>
>>
>> --
>> Devon Ryan, Ph.D.
>> Email: dpryan@dpryan.com <mailto:dpryan@dpryan.com>
>>
>> Laboratory for Molecular and Cellular Cognition
>> German Centre for Neurodegenerative Diseases (DZNE)
>> Ludwig-Erhard-Allee 2
>> 53175 Bonn
>> Germany
>> <devon.ryan@dzne.de <mailto:devon.ryan@dzne.de>>
>>
>>
>>
>>
>> On Fri, Apr 25, 2014 at 11:15 AM, Preethy [guest]
>> <guest@bioconductor.org <mailto:guest@bioconductor.org
>> >>wrote:
>>
>>
>> Hi All,
>>
>> I had been trying to do DE analysis of my RNAseq
>> experiment using edgeR
>> and am having some isssues. The details of the
>> Experiment and the R code I
>> tried below:
>>
>> (a) Paired experimental design with 45 pairs
>> (b) Treatment: "Before" and "After"
>> (c) Phenotype: 1 & 2
>> Aim: Look for DE genes between Phenotype 1 and 2 upon
>> treatment taking
>> into account the paired design
>>
>> The R code tried:
>>
>> library(edgeR)
>> counts<-read.delim(file="counts.dat",header=T)
>> pair=factor(pdata$pair)
>> Treat=factor( pdata$treat)
>> Phenotype=factor(pdata$pheno)
>> group<-paste(Treat,Phenotype,sep=".")
>> design=model.matrix(~pair+Treat:Phenotype, data=counts)
>> counts.DGEList<-DGEList(counts, group=group)
>> y<-calcNormFactors(counts.DGEList)
>> y<-estimateCommonDisp(y, design)
>> y<-estimateGLMTrendedDisp(y, design)
>>
>>
>> Error message I get:
>>
>> Error in glmFit.default(y, design = design, dispersion
>> = dispersion,
>> offset = offset, :
>> Design matrix not of full rank. The following
>> coefficients not
>> estimable:
>> TreatBefore:Phenotype1 TreatBefore:Phenotype2
>>
>>
>> Any idea to solve this out?
>>
>> Thanks,
>> Preethy
>>
>> -- output of sessionInfo():
>>
>> R version 3.1.0 (2014-04-10)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>>
>> locale:
>> [1] LC_CTYPE=fi_FI.UTF-8 LC_NUMERIC=C
>> [3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
>> [5] LC_MONETARY=fi_FI.UTF-8 LC_MESSAGES=en_GB.UTF-8
>> [7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets
>> methods base
>>
>> other attached packages:
>> [1] edgeR_3.4.2 limma_3.18.13
>>
>>
>> --
>> Sent via the guest posting facility at
>> bioconductor.org <http://bioconductor.org>.
>>
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor@r-project.org
>> <mailto:Bioconductor@r-project.org>
>>
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.
>> conductor
>>
>>
>> [[alternative HTML version deleted]]
>>
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor@r-project.org <mailto:Bioconductor@r-project.org>
>>
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>
>> -- James W. MacDonald, M.S.
>> Biostatistician
>> University of Washington
>> Environmental and Occupational Health Sciences
>> 4225 Roosevelt Way NE, # 100
>> Seattle WA 98105-6099
>>
>>
>>
> --
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
>
>
[[alternative HTML version deleted]]
_______________________________________________
Bioconductor mailing list
Bioconductor@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic