[prev in list] [next in list] [prev in thread] [next in thread] 

List:       bioconductor
Subject:    Re: [BioC] edgeR - paired samples with multifactorial design - errors
From:       Preethy Venkat Ram <preethyvenkataram () gmail ! com>
Date:       2014-04-26 15:25:20
Message-ID: CAJx0VXWghqJzL6ujT1GDbfq9Pp7CzB04-2WTj6rEY_F37w3khA () mail ! gmail ! com
[Download RAW message or body]

Hi Jim and All,

Thanks really a lot for bringing that to my attention.

It didn't work this way as well as I had different number of samples for
each Phenotype unlike the edgeR example which had 3 patients for each
"Disease" category.

But, I took off the empty columns from the design matrix and it worked. Is
the analysis done correctly this way ?

I have one more doubt - Would it be possible to account for batch effect
too with this type of design matrix ?

Like this:   design=model.matrix(~Phenotype+Phenotype:pair+
Phenotype:Treat+Phenotype:batch)

Thanks,
Preethy



On Fri, Apr 25, 2014 at 8:00 PM, James W. MacDonald <jmacdon@uw.edu> wrote:

> Hi Preethy,
>
> You need to re-read that section of edgeR, and in particular look at the
> patient column of the revised targets frame.
>
> Best,
>
> Jim
>
>
>
> On 4/25/2014 12:46 PM, Preethy Venkat Ram wrote:
>
>> Hi Jim,
>>
>> Thanks for the reply. I've tried that -  with a slighlty modified code. I
>> am sorry. But, I'm getting error again.
>>
>> Here:
>>
>> pair=factor(rep(c(1:45), each=2)).Treat=factor( rep(c("Before",
>> "After"),45), levels=c("Before", "After"))
>> Phenotype=factor(rep(c(1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 1, 2, 1,  2, 2, 2,
>> 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2, 1,  2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 1, 2,
>> 2, 1, 2, 2), each=2),levels=c("1","2"))
>> design=model.matrix(~Phenotype+Phenotype:pair+Phenotype:Treat)
>> colnames(design)
>> counts.DGEList<-DGEList(counts, group=Treat)
>> y<-calcNormFactors(counts.DGEList)
>> y<-estimateCommonDisp(y, design)
>> y<-estimateGLMTrendedDisp(y, design)
>>
>>
>> The error message here:
>>
>> > y<-estimateGLMTrendedDisp(y, design)
>> Error in return(NA, ntags) : multi-argument returns are not permitted
>> In addition: Warning message:
>> In estimateGLMTrendedDisp.default(y = y$counts, design = design,  :
>>   No residual df: cannot estimate dispersion
>>
>>
>> One reason I can see is that the number of columns in my design matrix
>>  is "92"  as they are more replicates/patients than in the 3.5 example. And
>> there are only "90" columns in my "count" matrix.
>>
>> But, do you have any idea how can I solve this ?
>>
>> Thanks,
>> Preethy
>>
>>
>>
>>
>>
>> On Fri, Apr 25, 2014 at 6:45 PM, James W. MacDonald <jmacdon@uw.edu<mailto:
>> jmacdon@uw.edu>> wrote:
>>
>>     Hi Preethy,
>>
>>     This experiment is very similar to the example in part 3.5 of the
>>     edgeR User's guide, starting on page 31.
>>
>>     Best,
>>
>>     Jim
>>
>>
>>
>>     On 4/25/2014 11:29 AM, Preethy Venkat Ram wrote:
>>
>>         Hi Devon,
>>
>>         Thanks for the replies both on biostars and here.  Sorry for
>>         crossposting.  Both
>>         mailing lists and discussion have been very helpful to me.
>>
>>         But, I rarely see replies from BioC package maintainers at
>>         biostars.
>>
>>         All the samples are paired and I have them all correct - I
>>         mean none of
>>         them are empty
>>         I tried different designs. But ending up with the same error
>>         message.
>>         What I want to do: Getting DE genes between
>>         (Phenotype1.Before-Phenotype1.After) &
>>         (Phenotype2.Before-Phenotype2.After)
>>
>>         pdata here:
>>
>>         pair=rep(c(1:45), each=2)
>>         Treat=rep(c("Before", "After"),45)
>>         Phenotype=rep(c(1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 1, 2, 1,  2, 2,
>>         2, 2, 2, 2,
>>         2, 1, 1, 1, 1, 1, 1, 2, 2, 1,  2, 2, 2, 2, 2, 2, 2, 2, 1, 2,
>>         1, 2, 2, 1, 2,
>>         2), each=2)
>>         pdata<-data.frame(pair,Treat,Phenotype)
>>
>>
>>         Preethy
>>
>>
>>
>>         On Fri, Apr 25, 2014 at 4:53 PM, Devon Ryan <dpryan@dpryan.com
>>         <mailto:dpryan@dpryan.com>> wrote:
>>
>>             Hi Preethy,
>>
>>             You likely want:
>>
>>             design=model.matrix(~pair+Treat:Phenotype, data=pdata)
>>
>>             If that still yields the error, then you'll need to share
>>             "pdata" or
>>             "design". Also, please don't crosspost on both this list
>>             and biostars (
>>             https://www.biostars.org/p/98907/), it duplicates the
>>             community effort.
>>
>>             Devon
>>
>>
>>             --
>>             Devon Ryan, Ph.D.
>>             Email: dpryan@dpryan.com <mailto:dpryan@dpryan.com>
>>
>>             Laboratory for Molecular and Cellular Cognition
>>             German Centre for Neurodegenerative Diseases (DZNE)
>>             Ludwig-Erhard-Allee 2
>>             53175 Bonn
>>             Germany
>>             <devon.ryan@dzne.de <mailto:devon.ryan@dzne.de>>
>>
>>
>>
>>
>>             On Fri, Apr 25, 2014 at 11:15 AM, Preethy [guest]
>>             <guest@bioconductor.org <mailto:guest@bioconductor.org
>> >>wrote:
>>
>>
>>                 Hi All,
>>
>>                 I had been trying to do DE analysis of my RNAseq
>>                 experiment using edgeR
>>                 and am having some isssues. The details of the
>>                 Experiment and the R code I
>>                 tried below:
>>
>>                 (a) Paired experimental design with 45 pairs
>>                 (b) Treatment: "Before" and "After"
>>                 (c) Phenotype: 1 & 2
>>                 Aim: Look for DE genes between Phenotype 1 and 2 upon
>>                 treatment taking
>>                 into account the paired design
>>
>>                 The R code tried:
>>
>>                 library(edgeR)
>>                 counts<-read.delim(file="counts.dat",header=T)
>>                 pair=factor(pdata$pair)
>>                 Treat=factor( pdata$treat)
>>                 Phenotype=factor(pdata$pheno)
>>                 group<-paste(Treat,Phenotype,sep=".")
>>                 design=model.matrix(~pair+Treat:Phenotype, data=counts)
>>                 counts.DGEList<-DGEList(counts, group=group)
>>                 y<-calcNormFactors(counts.DGEList)
>>                 y<-estimateCommonDisp(y, design)
>>                 y<-estimateGLMTrendedDisp(y, design)
>>
>>
>>                 Error message I get:
>>
>>                 Error in glmFit.default(y, design = design, dispersion
>>                 = dispersion,
>>                 offset = offset,  :
>>                    Design matrix not of full rank.  The following
>>                 coefficients not
>>                 estimable:
>>                   TreatBefore:Phenotype1 TreatBefore:Phenotype2
>>
>>
>>                 Any idea to solve this out?
>>
>>                 Thanks,
>>                 Preethy
>>
>>                   -- output of sessionInfo():
>>
>>                 R version 3.1.0 (2014-04-10)
>>                 Platform: x86_64-pc-linux-gnu (64-bit)
>>
>>                 locale:
>>                   [1] LC_CTYPE=fi_FI.UTF-8       LC_NUMERIC=C
>>                   [3] LC_TIME=en_GB.UTF-8  LC_COLLATE=en_GB.UTF-8
>>                   [5] LC_MONETARY=fi_FI.UTF-8  LC_MESSAGES=en_GB.UTF-8
>>                   [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
>>                   [9] LC_ADDRESS=C LC_TELEPHONE=C
>>                 [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>>
>>                 attached base packages:
>>                 [1] stats     graphics  grDevices utils datasets
>>                  methods   base
>>
>>                 other attached packages:
>>                 [1] edgeR_3.4.2   limma_3.18.13
>>
>>
>>                 --
>>                 Sent via the guest posting facility at
>>                 bioconductor.org <http://bioconductor.org>.
>>
>>
>>                 _______________________________________________
>>                 Bioconductor mailing list
>>                 Bioconductor@r-project.org
>>                 <mailto:Bioconductor@r-project.org>
>>
>>                 https://stat.ethz.ch/mailman/listinfo/bioconductor
>>                 Search the archives:
>>                 http://news.gmane.org/gmane.science.biology.informatics.
>> conductor
>>
>>
>>                 [[alternative HTML version deleted]]
>>
>>
>>         _______________________________________________
>>         Bioconductor mailing list
>>         Bioconductor@r-project.org <mailto:Bioconductor@r-project.org>
>>
>>         https://stat.ethz.ch/mailman/listinfo/bioconductor
>>         Search the archives:
>>         http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>
>>     --     James W. MacDonald, M.S.
>>     Biostatistician
>>     University of Washington
>>     Environmental and Occupational Health Sciences
>>     4225 Roosevelt Way NE, # 100
>>     Seattle WA 98105-6099
>>
>>
>>
> --
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
>
>

	[[alternative HTML version deleted]]

_______________________________________________
Bioconductor mailing list
Bioconductor@r-project.org
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic