List:       bioconductor
Subject:    Re: [BioC] RE :  designing an eSet derived object
From:       Martin Morgan <mtmorgan () fhcrc ! org>
Date:       2010-11-25 19:31:28
Message-ID: 4CEEB990.3000303 () fhcrc ! org

On 11/24/2010 05:47 AM, Wolfgang RAFFELSBERGER wrote:
> Dear Martin,
> 
> thanks again - I've got things working as you explained.
> 
> Just to make sure I completely understood: Now everything is
> streamlined for the storage of the multiple ExpressionSets for the
> various methods employed (the 1st slot in my GxSet). The next step is
> then to review how I'm storing the "derived" data (e.g. averages,
> SEM, ... for each of the methods from above).  Here I've tried a few
> things, but as far as I understand, there is no existing
> class close enough to my case (ideally a "SimpleListList" = list of
> SimpleLists). So I made a new class containing multiple SimpleList
> objects (code below):
> 
> setClass("GxAvData", representation(avSI="SimpleList", expressed="SimpleList",
>     SEM="SimpleList", FC="SimpleList", FiltFin="SimpleList",
>     FiltSI="SimpleList", FiltOther="SimpleList"))
> 
> 
> I've also tried to use a SimpleMatrixList object since all my
> (final) data are nothing but matrices, but I didn't get this working.
> Does this matter much? Or should I rather define a general
> "SimpleListList" (list of SimpleLists) first, and derive my specific
> class ("GxAvData") from it?

It seems like your class has a well-defined number of 'SimpleList'
slots, so your setClass above seems appropriate.

If I define

setClass("SimpleMatrixList", contains="SimpleList",
         prototype=prototype(elementType="matrix"))
SimpleMatrixList <-
    function(...) new("SimpleMatrixList", listData=list(...))

then things seem to work:

> mlst <- SimpleMatrixList(a=matrix(0, 5, 5), b=matrix(1, 5, 5))
> mlst[["b"]]
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    1    1    1    1
[2,]    1    1    1    1    1
[3,]    1    1    1    1    1
[4,]    1    1    1    1    1
[5,]    1    1    1    1    1
> mlst <- SimpleMatrixList(c=data.frame())
Error in validObject(.Object) :
  invalid class "SimpleMatrixList" object: the 'listData' slot must be a
list containing matrix objects
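
As a rough sketch of how this could feed back into your class (untested
against your real data; the class name GxAvData2 and the element names
rma / mas5 are made up for illustration, and only two of your slots are
shown), the matrix-only list can serve as the slot type:

```r
library(IRanges)  # in current releases SimpleList comes from S4Vectors, attached with IRanges

# List class whose elements must all be matrices (same definition as above)
setClass("SimpleMatrixList", contains="SimpleList",
         prototype=prototype(elementType="matrix"))
SimpleMatrixList <-
    function(...) new("SimpleMatrixList", listData=list(...))

# Hypothetical variant of GxAvData with matrix-only list slots
setClass("GxAvData2",
         representation(avSI="SimpleMatrixList", SEM="SimpleMatrixList"))

av <- new("GxAvData2",
          avSI=SimpleMatrixList(rma=matrix(0, 5, 2), mas5=matrix(1, 5, 2)),
          SEM=SimpleMatrixList(rma=matrix(0.1, 5, 2), mas5=matrix(0.2, 5, 2)))
names(av@avSI)          # element names of one slot
dim(av@SEM[["mas5"]])   # each element is an ordinary matrix

# A non-matrix element is rejected when the list itself is built
bad <- tryCatch(SimpleMatrixList(rma=data.frame()), error=function(e) e)
inherits(bad, "error")
```

The element-type check then happens once, in the list class, and the
container class needs no validity code of its own for that.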

Martin

> 
> 
> Thanks for all your helpful comments,
> 
> Wolfgang
> 
> PS: Hope you had a good travel back to the US.
> 
> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
> Wolfgang Raffelsberger, PhD
> Laboratoire de BioInformatique et Génomique Intégratives
> IGBMC, 1 rue Laurent Fries,  67404 Illkirch Strasbourg,  France
> Tel (+33) 388 65 3300         Fax (+33) 388 65 3276
> wolfgang.raffelsberger (at ) igbmc.fr
> 
> ________________________________________
> From: bioconductor-bounces@stat.math.ethz.ch
> [bioconductor-bounces@stat.math.ethz.ch] on behalf of Martin Morgan
> [mtmorgan@fhcrc.org]
> Sent: Monday, 22 November 2010 19:42
> To: Wolfgang RAFFELSBERGER
> Cc: bioconductor@stat.math.ethz.ch
> Subject: Re: [BioC] RE :  designing an eSet derived object
> 
> Hi Wolfgang --
> 
> On 11/22/2010 03:44 AM, Wolfgang RAFFELSBERGER wrote:
> > Dear Martin,
> > 
> > thank you very much for your helpful input. I'm sorry I have to
> > bug you again.
> 
> > I was about there, but at the recent Bioconductor Developer Meeting
> > I got another interesting suggestion, which I haven't succeeded in
> > implementing.
> > 
> > Briefly, (if I understood right) the idea was rather to make a
> > modified SimpleList class where I could check that each element is
> > an ExpressionSet (instead of using the SimpleList class as is).
> > From there one might even go one step further and check if all
> > dimensions are identical, too ...
> > 
> > To make the modified SimpleList I returned to the help
> > provided in the Bioconductor pdf "Biobase development and the new
> > eSet". But it seems I'm not getting the initialization right.
> > 
> > My 'problem' is that I don't want to fix in advance how many
> > ExpressionSets will be put in the list (SimpleList), nor what
> > their names will be.  This way I hope the object will be
> > sufficiently general to hold results from normalization methods
> > that might become available in the future. This is quite
> > different from the example provided in "Biobase development and the
> > new eSet".
> > 
> > To link to my previous post: This (modified) SimpleList will then
> > be used as a slot (allowing storage of data normalized by multiple
> > methods) of another new class (the "GxSet"), which has other slots
> > for data-derived values (averages, etc.) and more
> > documentation/notes ...
> > 
> > Thanks in advance for any hints, Wolfgang
> 
> > 
> > 
> > > 
> > > require(Biobase); require(IRanges); require(affy)
> > > # the toy data
> > > eset1 <- new("ExpressionSet", exprs=matrix(1,10,4))
> > > pData(eset1) <- data.frame("class"=c(1,2,2,2))
> > > 
> > > eset2 <- new("ExpressionSet", exprs=matrix(3,10,4))
> > > pData(eset2) <- data.frame("class"=c(1,2,2,2))
> > > 
> > > # making the modified class
> > > setClass("GxSimpleList",contains="SimpleList")
> 
> I think the idea is
> 
> setClass("SimpleExpressionSetList", contains="SimpleList", 
> prototype=prototype(elementType="ExpressionSet"))
> 
> and then you're done...
> 
> > listData1 <- list(A=new("ExpressionSet"), B=new("ExpressionSet"))
> > listData2 <- list(A=new("ExpressionSet"), B=matrix())
> > new("SimpleExpressionSetList", listData=listData1)
> SimpleExpressionSetList of length 2
> names(2): A B
> > new("SimpleExpressionSetList", listData=listData2)
> Error in validObject(.Object) :
>   invalid class "SimpleExpressionSetList" object: the 'listData' slot
>   must be a list containing ExpressionSet objects
> > 
> 
> > [1] "GxSimpleList"
> > > getClass("GxSimpleList")
> > Class "GxSimpleList" [in ".GlobalEnv"]
> > 
> > Slots:
> > 
> > Name:         listData elementMetadata     elementType        metadata
> > Class:            list             ANY       character            list
> > 
> > Extends:
> > Class "SimpleList", directly
> > Class "Sequence", by class "SimpleList", distance 2
> > Class "Annotated", by class "SimpleList", distance 3
> > > 
> > > # for the "initialize" I didn't understand how to formulate it
> > > # in my case (as I don't know how many elements, neither their names)
> > > setMethod("initialize","GxSimpleList",
> > >   function(.object,...) listData =
> > >     listDataNew(lapply(list(.object,...) == "ExpressionSet") ))
> > Error in conformMethod(signature, mnames, fnames, f, fdef, definition) :
> >   in method for ‘initialize’ with signature ‘.Object="GxSimpleList"’:
> >   formal arguments (.Object = "GxSimpleList", ... = "GxSimpleList")
> >   omitted in the method definition cannot be in the signature
> > > 
> > > setMethod("initialize","GxSimpleList", function(.object,...)
> > >   {.object <- callNextMethod(.object,...)})
> > Error in conformMethod(signature, mnames, fnames, f, fdef, definition) :
> >   in method for ‘initialize’ with signature ‘.Object="GxSimpleList"’:
> >   formal arguments (.Object = "GxSimpleList", ... = "GxSimpleList")
> >   omitted in the method definition cannot be in the signature
> > > 
> > > # I guess the check for ExpressionSets should go into validity
> > > setValidity("GxSimpleList", function(object) {   # experimental
> > +    if(sum(!(unlist(lapply(object,function(x) class(x))) %in% "ExpressionSet")) >0)
> > +      "A 'GxSimpleList' object should contain elements of class 'ExpressionSet' only !"
> > +    #same as ?# assayDataValidMembers(class(object), rep("ExpressionSet",length(object)))
> > +    })
> > Class "GxSimpleList" [in ".GlobalEnv"]
> > 
> > Slots:
> > 
> > Name:         listData elementMetadata     elementType        metadata
> > Class:            list             ANY       character            list
> > 
> > Extends:
> > Class "SimpleList", directly
> > Class "Sequence", by class "SimpleList", distance 2
> > Class "Annotated", by class "SimpleList", distance 3
> > > 
> > > # what happens ..
> > > lst1 = SimpleList(a=eset1, b=eset2)   # OK
> > > 
> > > lst2 = new("GxSimpleList",a=eset1, b=eset2)  # error (due to missing "initialize" ?)
> > Error in initialize(value, ...) :
> >   invalid names for slots of class "GxSimpleList": a, b
> > > lst3 = GxSimpleList(a=eset1, b=eset2)        # error (due to missing "initialize" ?)
> > Error: could not find function "GxSimpleList"
> > > 
> > > # for completeness ...
> > > sessionInfo()
> > R version 2.12.0 (2010-10-15)
> > Platform: i386-pc-mingw32/i386 (32-bit)
> > 
> > locale:
> > [1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252    LC_MONETARY=French_France.1252 LC_NUMERIC=C
> > [5] LC_TIME=French_France.1252
> > 
> > attached base packages:
> > [1] grDevices datasets  splines   graphics  stats     tcltk     utils     methods   base
> > 
> > other attached packages:
> > [1] affy_1.28.0     IRanges_1.8.0   Biobase_2.10.0  svSocket_0.9-50 TinnR_1.0.3     R2HTML_2.2      Hmisc_3.8-3     survival_2.35-8
> > 
> > loaded via a namespace (and not attached):
> > [1] affyio_1.18.0         cluster_1.13.1        grid_2.12.0           lattice_0.19-13       preprocessCore_1.12.0 svMisc_0.9-60
> > [7] tools_2.12.0
> > > 
> > 
> > 
> > 
> > 
> > ________________________________________
> > From: Martin Morgan [mtmorgan@fhcrc.org]
> > Sent: Friday, 5 November 2010 18:33
> > To: Wolfgang RAFFELSBERGER
> > Cc: bioconductor@stat.math.ethz.ch
> > Subject: Re: [BioC] designing an eSet derived object
> > 
> > On 11/05/2010 05:02 AM, Wolfgang RAFFELSBERGER wrote:
> > > Dear list,
> > > 
> > 
> > > basically I'm trying to design an object to contain the
> > > following microarray data:
> > > 1) "gxIndData": microarray data normalized in parallel by an
> > >    (array-dependent) number of n methods, plus the corresponding
> > >    expression calls (again, <= n methods),
> > > 2) "gxAvData": derived values (replicate averages, SEMs, etc.),
> > > 3) gene/spot annotation,
> > > 4) sample description,
> > > 5) various supplementary information (parameters, notes, versions, etc.)
> > > 
> > > Overall, this is a somewhat modified/extended version of the
> > > Biobase eSet concept, and I'm trying to figure out if there is a
> > > way to use the Biobase eSet. This way I hope to maintain a decent
> > > level of compatibility with other Bioconductor methods and allow
> > > code reuse.
> > > 
> > > Now I'd like to store the various sections of 1) and 2) as
> > > separate lists with n matrices of values to keep things
> > > organized.
> > > 
> > > According to the vignette "Biobase development and the new eSet",
> > > section 5 ("Extending eSet"), I defined a new class based on 'eSet'.
> > > But as soon as I integrate something other than matrices at
> > > the level of 'AssayData', I get an error message (see code below),
> > > no matter if these are simple lists or custom objects. I
> > > suppose this means that I would have to store all matrices (up to
> > > 10*6 methods = 60 matrices) without further organization at the
> > > level of 'AssayData'.
> > 
> > eSet requires that all AssayData elements are two-dimensional with 
> > identical dimensions, so a list-of-matrices would not work.
> > 
> > > However, I'd like to keep at least one (in my case better 2)
> > > additional levels of hierarchy to keep the data organized.
> > > 
> > > So, finally I would like to integrate two new classes for 1) and
> > > 2) at the level of the assayData slot of my modified/new eSet.
> > > 
> > > Does this mean this is not possible and that I cannot use the
> > > 'eSet' for my purposes? Do I have to create a novel class
> > > roughly equivalent to, but ultimately incompatible with, the 'eSet'?
> > > 
> > > Any suggestions/hints ?
> > 
> > One possibility, if this is for your own use and not as the
> > foundation for a package, is to use NChannelSet, where each method
> > is a 'channel'.
> > 
> > Another possibility is to create a class that extends eSet with a 
> > slot containing, e.g., an AnnotatedDataFrame with columns
> > describing the AssayData, and a method to query the slot / select
> > the appropriate assayData elements
> > 
> > And perhaps what you really have is more a list of (of lists of) 
> > ExpressionSets, each element of the list with additional
> > information. An approach here would use the IRanges 'SimpleList'
> > infrastructure, e.g.,
> > 
> > > lst = SimpleList(a=new("ExpressionSet"), b=new("ExpressionSet"))
> > > elementMetadata(lst) = DataFrame(method=c("A", "B"))
> > > lst[elementMetadata(lst)$method == "A"]
> > SimpleList of length 1
> > names(1): a
> > > lst[elementMetadata(lst)$method == "A"][[1]]
> > ExpressionSet (storageMode: lockedEnvironment)
> > assayData: 0 features, 0 samples
> >   element names: exprs
> > protocolData: none
> > phenoData: none
> > featureData: none
> > experimentData: use 'experimentData(object)'
> > Annotation:
> > 
> > Martin
> > 
> > > 
> > > Thanks in advance, wolfgang
> > > 
> > > ##
> > > 
> > > require(Biobase)
> > > setClass("gxSet", contains = "eSet")
> > > setMethod("initialize", "gxSet",
> > >   function(.Object, A=new("list"), B=new("list"), ...) {
> > >     callNextMethod(.Object, A=A, B=B, ...)
> > >   })
> > > new("gxSet")
> > > ## produces : Error in function (storage.mode = c("lockedEnvironment", "environment",  :
> > > ##   'AssayData' elements with invalid dimensions: 'A' 'B'
> > > 
> > > 
> > > ## ideally I'd like to use
> > > setClass("gxIndData", representation(SIdata="list", SIcall="list"))
> > > setClass("gxAvData", representation(avSI="list", expressed="list", SEM="list",
> > >     conCall="list", FC="list", FiltFin="list", FiltSI="list", FiltOther="list"))
> > > setClass("gxSet", contains = "eSet")
> > > 
> > > setMethod("initialize", "gxSet",
> > >   function(.Object,
> > >            assayData=assayDataNew(IndData=IndData, AvData=AvData),
> > >            IndData=new("gxIndData"), AvData=new("gxAvData"), ...) {
> > >     if(!missing(assayData) && any(!missing(IndData), !missing(AvData))) {
> > >       warning("using 'assayData'; ignoring 'IndData', 'AvData'")
> > >     }
> > >     callNextMethod(.Object, assayData = assayData, ...)
> > >   })
> > > 
> > > new("gxSet")
> > > ## produces : Error in assayDataNew(IndData = IndData, AvData = AvData) :
> > > ##   'AssayData' elements with invalid dimensions: 'AvData' 'IndData'
> > > 
> > > 
> > > ## the alternative : an eSet-like but independent and
> > > ## incompatible object ..
> > > setClass("gxSet", representation(IndData="gxIndData", AvData="gxAvData",
> > >     phenoData="AnnotatedDataFrame", featureData="AnnotatedDataFrame",
> > >     experimentData="MIAME", annotation="character",
> > >     protocolData="AnnotatedDataFrame", notes="list"))

> > > 
> > > 
> > > 
> > > ## for completeness:
> > > sessionInfo()
> > > R version 2.12.0 (2010-10-15)
> > > Platform: i386-pc-mingw32/i386 (32-bit)
> > > 
> > > locale:
> > > [1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252    LC_MONETARY=French_France.1252
> > > [4] LC_NUMERIC=C                   LC_TIME=French_France.1252
> > > 
> > > attached base packages:
> > > [1] grDevices datasets  splines   graphics  stats     tcltk     utils     methods   base
> > > 
> > > other attached packages:
> > > [1] affy_1.28.0     Biobase_2.10.0  svSocket_0.9-50 TinnR_1.0.3     R2HTML_2.2      Hmisc_3.8-3     survival_2.35-8
> > > 
> > > loaded via a namespace (and not attached):
> > > [1] affyio_1.18.0         cluster_1.13.1        grid_2.12.0           lattice_0.19-13       preprocessCore_1.12.0
> > > [6] svMisc_0.9-60         tools_2.12.0
> > > 
> > > 
> > > 
> > > [[alternative HTML version deleted]]
> > > 
> > > 
> > > 
> > > 
> > > _______________________________________________ Bioconductor 
> > > mailing list Bioconductor@stat.math.ethz.ch 
> > > https://stat.ethz.ch/mailman/listinfo/bioconductor Search the 
> > > archives: 
> > > http://news.gmane.org/gmane.science.biology.informatics.conductor
> > 
> > 
> > 
> > > 
> > -- 
> > Computational Biology
> > Fred Hutchinson Cancer Research Center
> > 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
> > 
> > Location: M1-B861
> > Telephone: 206 667-2793
> 
> 


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

_______________________________________________
Bioconductor mailing list
Bioconductor@stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

