[BioC] plotting a CA
aedin culhane
aedin at jimmy.harvard.edu
Fri Mar 9 23:20:30 CET 2012
Hi Aoife
xax and yax select which components (axes) of the analysis are drawn on
the x and y axes of the plot, so xax = 1 and yax = 2 plots the first two
components.
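
As a rough sketch of what that means (base R only, not the ade4 code itself): the plotting functions are handed a table of coordinates with one column per retained axis, and xax and yax are the column numbers to draw. The `coords` table and its values are made up for illustration.

```r
# Hypothetical coordinates: 5 genes scored on the 2 axes that were kept
coords <- data.frame(
  Axis1 = c(-0.8, -0.3, 0.1, 0.4, 0.6),
  Axis2 = c(0.2, -0.5, 0.3, -0.1, 0.1),
  row.names = paste0("gene", 1:5)
)

xax <- 1  # column drawn on the horizontal axis
yax <- 2  # column drawn on the vertical axis

# Pick the two requested columns and draw them
xy <- coords[, c(xax, yax)]
plot(xy, pch = 19)
text(xy, labels = rownames(xy), pos = 3)

# Only 2 axes exist here, so asking for a 3rd column is an error
res <- tryCatch(coords[, 3], error = function(e) conditionMessage(e))
print(res)
```

An axis number larger than the number of columns kept is one way to get an "undefined columns selected" error from a data frame.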
Aedin
On 03/09/2012 05:09 PM, aoife doherty wrote:
> O wow i was way off. Many thanks.
> May i ask one question (I'm a total newbie), i was trying out the
> different pieces of (much appreciated) code because i want to play
> around with them and make sure i understand them.
>
> But i have never used a function in R.
>
> For this section:
>
> xax =1, yax = 2, .. of this line
> plotCA<-function(dudi, rowFac, cols, plotgroups=FALSE,
> plotrowLabels=FALSE, pch=seq_len(nlevels(rowFac))+10, xax =1, yax = 2, ...) {
>
> may i just ask what they represent?
>
> I am trying to work out how everything works by copy and pasting each
> line into R, and then seeing what happens, but for that line i keep getting:
>
> > plotCA(codonCA, rowFac=fac,pch=c(18,20), cols=c("red", "blue"))
> Error in scatterutil.base(dfxy = dfxy, xax = xax, yax = yax, xlim =
> xlim, :
> Non convenient selection for xax
> > plotCA(codonCA, rowFac=fac,pch=c(18,20), cols=c("red", "blue"),
> plotrowLabels=TRUE)
> Error in scatterutil.base(dfxy = dfxy, xax = xax, yax = yax, xlim =
> xlim, :
> Non convenient selection for xax
> > plotCA(codonCA, rowFac=fac, plotgroups=TRUE, cols=c("red", "blue"))
> Error in `[.data.frame`(dfxy, , xax) : undefined columns selected
>
> Deeply indebted.
> Aoife
>
> On Fri, Mar 9, 2012 at 5:49 PM, aedin culhane <aedin at jimmy.harvard.edu> wrote:
>
> Hi Tim, Aoife and Susan
>
> Sorry Tim, I didn't know that I said not to use made4. When did I
> say this? I may have said I need to update some of the functions as
> I wrote the made4 package many years ago.
>
> Susan, made4 calls ade4 but is designed to convert microarray and
> other Bioconductor data classes into formats that can be input into
> ade4. It calls ade4 (and other) plot functions but with more
> sensible defaults for genomics data (i.e. it doesn't label all of the
> objects!). When I implemented the package I did it with Guy and
> Jean who wrote the paper you cited and I wholeheartedly agree with
> all you say ;-)
>
>
> However Aoife, your code plot(ca(table,suprow=c(4,5))) can't be used
> for what you want. It plots rows 4 and 5 as supplementary points
> on the plot. These points are not used in the computation of the
> analysis, so this would not give you what you want. Have a look
> at these plots
>
> ### ----------------------------------------------
> ## From here, you can copy/paste everything to R
> ##------------------------------------------------
>
>
> ## Your data... I renamed it, as table is a function in R
>
> codonData <- matrix(c(4, 7, 0.2, 3, .1, 7, 222, 3, 10, 5, 11, 8, 8,
> 10, 7), ncol=3, dimnames = list(c("gene1","gene2", "gene3",
> "gene4", "gene5"), c("codon1", "codon2","codon3")))
>
> library(ca)
> codonCA<-ca(codonData)
>
> ## Draw 2 plots, one with results of analysis of all the data,
> # the other as you described
>
> par(mfrow=c(1,2))
> plot(ca(codonData,suprow=c(4,5)))
> plot(codonCA)
>
> ## You will notice that the 2 plots are very different,
> ## one analysis is a CA of all 5 rows, the other is only 3 rows.
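
To see why the two plots differ, here is a sketch of the usual CA algebra in base R (an illustration under that assumption, not the internals of the ca package). Only the first 3 rows enter the decomposition; rows 4 and 5 are then placed on the resulting axes from their row profiles, so they cannot influence the axes. Names such as `active`, `Gamma` and `Fsup` are made up for this sketch.

```r
codonData <- matrix(c(4, 7, 0.2, 3, .1, 7, 222, 3, 10, 5, 11, 8, 8, 10, 7),
                    ncol = 3,
                    dimnames = list(paste0("gene", 1:5), paste0("codon", 1:3)))

active <- codonData[1:3, ]                # rows used to build the axes
sup    <- codonData[4:5, , drop = FALSE]  # supplementary rows

# Correspondence matrix and masses of the active rows only
P     <- active / sum(active)
rmass <- rowSums(P)
cmass <- colSums(P)

# Standardised residuals and their singular value decomposition
S   <- diag(1 / sqrt(rmass)) %*% (P - rmass %o% cmass) %*% diag(1 / sqrt(cmass))
dec <- svd(S)
k   <- 2  # a 3 x 3 table has at most 2 non-trivial axes

# Standard column coordinates and principal row coordinates
Gamma <- diag(1 / sqrt(cmass)) %*% dec$v[, 1:k]
Fact  <- diag(1 / sqrt(rmass)) %*% dec$u[, 1:k] %*% diag(dec$d[1:k])

# A row sits at the profile-weighted average of the column coordinates
profiles <- function(m) m / rowSums(m)
Fsup <- profiles(sup) %*% Gamma  # supplementary rows, projected afterwards

# The same formula reproduces the coordinates of the active rows
all.equal(unname(profiles(active) %*% Gamma), unname(Fact))
```

The supplementary rows only appear in the last projection step, after the axes have already been fixed by the active rows.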
>
>
> ## To run a CA on a dataset using made4 or ade4, use the following code
>
> ## install made4
> ## source("http://bioconductor.org/biocLite.R")
> ## biocLite("made4")
>
> library(made4)
>
> ## example dataset
> data(khan)
> df<-khan$train
>
> ## The function ord will run PCA, CA or NSC,
> ## by default it runs CA (by calling dudi.coa from ade4)
>
> myCA<- ord(df)
> plot(myCA)
> plotgenes(myCA)
> plotarrays(myCA)
>
>
> ## using the ade4 library
> library(ade4)
> codonCA<-dudi.coa(codonData, scan=FALSE)
> scatter(codonCA)
>
>
> ## However neither of these will do exactly as you wish
> ## made4 expects groups in the column not the rows (genes x samples)
>
> library(made4)
> codonCA<-ord(t(codonData))
>
> ## Create a factor which lists the groups of "nodes" of interest
> fac<-factor(c(rep("Node1",3), rep("Node2", 2)))
> fac
> plot(codonCA, classvec=fac)
>
>
>
> ## but the function below will do what you need.
>
>
> plotCA<-function(dudi, rowFac, cols, plotgroups=FALSE,
>   plotrowLabels=FALSE, pch=seq_len(nlevels(rowFac))+10, xax=1, yax=2,
>   ...) {
>
>   require(made4)
>
>   ## Map each level of fac to the matching entry of newLabels
>   fac2char<-function(fac, newLabels) {
>     if (nlevels(fac) != length(newLabels))
>       stop("Number of labels does not equal number of factor levels")
>     vec<-as.character(factor(fac, labels=newLabels))
>     if (is.numeric(newLabels)) vec<-as.numeric(vec)
>     return(vec)
>   }
>
>   ## Draw the row points: group ellipses, plain points, or labels
>   if (plotgroups) s.groups(dudi$li, rowFac, col=cols)
>   if (!plotgroups) {
>     pchs<-fac2char(rowFac, pch)
>     cols<-fac2char(rowFac, cols)
>
>     if (!plotrowLabels) s.var(dudi$li, boxes=FALSE, pch=pchs,
>       col=cols, cpoint=2, clabel=0, xax=xax, yax=yax, ...)
>     if (plotrowLabels) s.var(dudi$li, boxes=FALSE, col=cols,
>       xax=xax, yax=yax, ...)
>   }
>
>   ## Overlay the column points on the existing plot
>   s.var(dudi$co, boxes=FALSE, pch=19, col="black", add.plot=TRUE,
>     xax=xax, yax=yax, ...)
> }
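
The key step in the function is turning the grouping factor into one colour and one plotting symbol per row. A simplified base R sketch of that idea follows; it indexes by the factor's integer codes, which is a swapped-in shortcut rather than the fac2char helper above.

```r
fac  <- factor(c(rep("Node1", 3), rep("Node2", 2)))
cols <- c("red", "blue")  # one colour per level of fac
pch  <- c(18, 20)         # one plotting symbol per level of fac

# One entry per level is required, otherwise the mapping is ambiguous
stopifnot(nlevels(fac) == length(cols), nlevels(fac) == length(pch))

# as.integer(fac) gives the level number of each row (1 or 2 here),
# which is used to look up the matching colour and symbol
rowCols <- cols[as.integer(fac)]
rowPch  <- pch[as.integer(fac)]

data.frame(group = fac, col = rowCols, pch = rowPch)
```

Each of the 5 rows ends up with its own colour and symbol, which is the form that per-point arguments such as col and pch expect.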
>
> ##--------------------------------------------
> ## Examples: Function has 3 different options
> ##--------------------------------------------
>
> library(ade4)
> codonCA<-dudi.coa(codonData, scan=FALSE)
>
> ## Option 1, plot a biplot (cases and samples) with points
> ## colored by rowFac
>
> plotCA(codonCA, rowFac=fac,pch=c(18,20), cols=c("red", "blue"))
>
> ## Option 2. Same plot as above, but with labels rather than points
>
> plotCA(codonCA, rowFac=fac,pch=c(18,20), cols=c("red", "blue"),
> plotrowLabels=TRUE)
>
> ## Option 3, Same plot but put a circle around the groups
> ## If you look at the help page for s.groups (in made4)
> ## which calls s.class (in ade4) you will see you can also
> ## change the size and other details about the
> ## ellipse (or circle drawn around the groups)
>
> plotCA(codonCA, rowFac=fac, plotgroups=TRUE, cols=c("red", "blue"))
>
>
>
>
>
> On Thu, Mar 8, 2012 at 9:20 AM, aoife doherty
> <aoife.m.doherty at gmail.com> wrote:
>
> > Many thanks. I tried this:
> >
> > table <- structure(c(4, 7, 0.2, 3, .1, 7, 222, 3, 10, 5, 11,
> > 8, 8, 10, 7), .Dim = c(5L, 3L), .Dimnames = list(c("gene1",
> > "gene2", "gene3", "gene4", "gene5"), c("codon1", "codon2",
> > "codon3")))
> >
> > library(ca)
> >
> > plot(ca(table,suprow=c(4,5)))
> >
> > This will give me a ca plot, where the nodes of interest 4,5 are open
> > circles.
> >
> > However i have two questions.
> >
> > 1. Is it possible instead of manually typing in 4 and 5 to somehow
> > get R to read in a list of nodes of interest. Basically is it
> > possible to change:
> >
> > c(4,5) to c(all the nodes that are in a file)
> >
> > and
> >
> > 2. Is it possible instead of the individual nodes of interest being
> > open circles, if the area encompassing all the nodes of interest
> > could be shaded differently/highlighted.
> > i THINK this is where your suggestion of:
> >
> > Your best bet is to use the package ade4
> > using res=dudi.coa(data)
> > then
> > s.class(res$li,group)
> > where group is your grouping variable you want to highlight.
> >
> > comes in, but i am completely new at R, i have genuinely tried to
> > understand the packages from the manual, I am confused however.
> >
> > Aoife
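
On the first question above, a possible base R sketch for reading the rows of interest from a file instead of typing c(4,5). It assumes a plain text file with one row name per line; the file here is a temporary one written just for the example, and its name and contents are hypothetical.

```r
codonData <- matrix(c(4, 7, 0.2, 3, .1, 7, 222, 3, 10, 5, 11, 8, 8, 10, 7),
                    ncol = 3,
                    dimnames = list(paste0("gene", 1:5), paste0("codon", 1:3)))

# Write a small example file; in practice this would be your own file
nodeFile <- tempfile(fileext = ".txt")
writeLines(c("gene4", "gene5"), nodeFile)

# Read the names and look up their row numbers in the data
nodes <- readLines(nodeFile)
idx   <- match(nodes, rownames(codonData))

# Stop early if a name in the file is not a row of the data
if (anyNA(idx)) {
  stop("Unknown row name(s): ", paste(nodes[is.na(idx)], collapse = ", "))
}

idx  # row numbers that could stand in for the hand-typed c(4,5)
```

Matching on row names rather than storing row numbers in the file keeps the selection valid if the rows of the table are reordered.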
> >
> >
> >
> >
> >
>
> --
> Aedin Culhane
> Computational Biology and Functional Genomics Laboratory
> Harvard School of Public Health,
> Dana-Farber Cancer Institute
>
> web: http://www.hsph.harvard.edu/research/aedin-culhane/
> email: aedin at jimmy.harvard.edu
> phone: +1 617 632 2468
> Fax: +1 617 582 7760
>
>
> Mailing Address:
> Attn: Aedin Culhane, SM822C
> 450 Brookline Ave.
> Boston, MA 02215
>
>