[BioC] getGEO and wilcox.test
Ovokeraye Achinike-Oduaran
ovokeraye at gmail.com
Tue Mar 20 13:37:28 CET 2012
Hi,
Sorry about the vagueness.
This is how I have retrieved my data from GEO. I'm trying to test for
differential expression (DE) of the genes between the two conditions (IR and
IS). I just couldn't figure out how to apply this information to wilcox.test().
library(GEOquery)  # provides getGEO() and GDS2eSet()
gds157dat = getGEO('GDS157', destdir=".")
gds157eset = GDS2eSet(gds157dat, do.log2=TRUE)
# recode the sample annotation into short group labels
groups = as.character(pData(gds157eset)$metabolism)
groups[groups == "insulin sensitive"] = "IS"
groups[groups == "insulin resistant"] = "IR"
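A minimal sketch of one way the group labels could be passed to wilcox.test(), run gene by gene. The matrix `expr` is simulated and only stands in for exprs(gds157eset) (an assumption; nothing is downloaded from GEO), and the names `expr`, `grp`, `pvals` and `padj` are made up for illustration.

```r
# Simulated stand-in for exprs(gds157eset): 6 genes (rows) x 10 samples (columns).
set.seed(1)
expr <- matrix(rnorm(60), nrow = 6,
               dimnames = list(paste0("gene", 1:6), paste0("sample", 1:10)))

# Group labels in column order, mirroring the recoded vector above (5 IR, 5 IS).
grp <- rep(c("IR", "IS"), each = 5)

# Two-sample Wilcoxon rank-sum test for each row (gene).
pvals <- apply(expr, 1, function(x) {
  wilcox.test(x[grp == "IR"], x[grp == "IS"])$p.value
})

# Adjust for testing many genes at once (Benjamini-Hochberg).
padj <- p.adjust(pvals, method = "BH")
print(cbind(p.value = pvals, BH = padj))
```

With five samples per group and no ties, the smallest two-sided p-value the exact test can return is 2/choose(10, 5), about 0.008, so few genes are likely to survive adjustment across a whole array.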
sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: i386-pc-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=English_.1252 LC_CTYPE=English_.1252
[3] LC_MONETARY=English_.1252 LC_NUMERIC=C
[5] LC_TIME=English_.1252
attached base packages:
[1] stats4 splines stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] coin_1.0-21 modeltools_0.2-19 mvtnorm_0.9-9992
[4] survival_2.36-12 XML_3.9-4.1 RCurl_1.91-1.1
[7] bitops_1.0-4.1 puma_2.6.0 mclust_3.4.11
[10] limma_3.10.2 ArrayExpress_1.14.0 affy_1.32.1
[13] GEOquery_2.20.8 Biobase_2.14.0
loaded via a namespace (and not attached):
[1] affyio_1.22.0 BiocInstaller_1.2.1 preprocessCore_1.16.0
[4] zlibbioc_1.0.0
>
Regards,
Avoks
On Tue, Mar 20, 2012 at 2:15 PM, Sean Davis <sdavis2 at mail.nih.gov> wrote:
>
>
> On Tue, Mar 20, 2012 at 7:56 AM, Vincent Carey <stvjc at channing.harvard.edu>
> wrote:
>>
>> Please read the posting guide
>> http://www.bioconductor.org/help/mailing-list/posting-guide/ before
>> querying this list.
>>
>> You have not given any information on how you have used getGEO. To help
>> you, I issued
>>
>> > library(GEOquery)
>> Setting options('download.file.method.GEOquery'='auto')
>> > gg = getGEO("GDS157")
>> File stored at:
>>
>> /var/folders/4D/4DI98FkjGzq0K2niUTEHSE+++TM/-Tmp-//RtmpGnz9Cf/GDS157.soft.gz
>> > gg
>> An object of class "GDS"
>
>
> At this point, if you would like to work with an ExpressionSet instead of a
> GDS object, try:
>
> expset = GDS2eSet(gg)
>
> Sean
>
>>
>> channel_count
>> [1] "1"
>> dataset_id
>> [1] "GDS157" "GDS157"
>> description
>> [1] "Analysis of gene expression in pooled vastus lateralis muscle samples
>> from insulin-sensitive and insulin-resistant equally obese, non-diabetic
>> Pima Indians. A search for susceptibility genes for type 2 diabetes. "
>> ...
>>
>> > getClass("GDS")
>> Class "GDS" [package "GEOquery"]
>>
>> Slots:
>>
>> Name: gpl dataTable header
>> Class: GPL GEODataTable list
>>
>> Extends: "GEOData"
>> > getClass("GEODataTable")
>> Class "GEODataTable" [package "GEOquery"]
>>
>> Slots:
>>
>> Name: columns table
>> Class: data.frame data.frame
>>
>> Here I am using R's self-describing capacities to learn about what the
>> query returned.
>>
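The same kind of inspection can be tried on any S4 object. The sketch below uses a toy class, `ToyTable`, which is hypothetical and only imitates the two-slot layout of GEODataTable shown above; it is not part of GEOquery.

```r
# Toy S4 class with the same slot layout as GEODataTable (hypothetical).
setClass("ToyTable", slots = c(columns = "data.frame", table = "data.frame"))

obj <- new("ToyTable",
           columns = data.frame(sample = c("s1", "s2")),
           table = data.frame(ID = "p1", s1 = 1.5, s2 = 2.5))

# Ask R what the object is made of.
slot_names <- slotNames(obj)         # slot names of the object's class
slot_types <- getSlots("ToyTable")   # slot name -> class of each slot
print(slot_types)

# Slots are reached with the @ operator.
str(obj@columns)
```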
>> > gg@dataTable@columns
>> sample metabolism
>> 1 GSM2289 insulin resistant
>> 2 GSM2294 insulin resistant
>> 3 GSM2299 insulin resistant
>> 4 GSM2304 insulin resistant
>> 5 GSM2309 insulin resistant
>> 6 GSM2313 insulin sensitive
>> 7 GSM2318 insulin sensitive
>> 8 GSM2323 insulin sensitive
>> 9 GSM2328 insulin sensitive
>> 10 GSM2333 insulin sensitive
>>
>> description
>> 1 Value for GSM2289: insulin resistant sample pool 1 muscle on HuFL; src: muscle
>> 2 Value for GSM2294: insulin resistant sample pool 2 muscle on HuFL; src: muscle
>> 3 Value for GSM2299: insulin resistant sample pool 3 muscle on HuFL; src: muscle
>> 4 Value for GSM2304: insulin resistant sample pool 4 muscle on HuFL; src: muscle
>> 5 Value for GSM2309: insulin resistant sample pool 5 muscle on HuFL; src: muscle
>> 6 Value for GSM2313: insulin sensitive sample pool 1 muscle on HuFL; src: muscle
>> 7 Value for GSM2318: insulin sensitive sample pool 2 muscle on HuFL; src: muscle
>> 8 Value for GSM2323: insulin sensitive sample pool 3 muscle on HuFL; src: muscle
>> 9 Value for GSM2328: insulin sensitive sample pool 4 muscle on HuFL; src: muscle
>> 10 Value for GSM2333: insulin sensitive sample pool 5 muscle on HuFL; src: muscle
>>
>> Now I start to see that the collection of samples may be viewed as falling
>> into two classes. If you want to use wilcox.test to address a two-sample
>> problem arising from this experiment, you will have to use the information
>> shown above to separate the numerical gene expression values into the two
>> classes. There is more than enough information in the above to begin this
>> process; for biological interpretation you need to know a little more: the
>> platform, GPL80, is documented in the package hu6800.db.
>>
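The step described above, using the sample table to sort expression values into the two classes, might look like the following sketch. The data frame `cols` copies the sample and metabolism columns printed above; the expression values in `vals` are simulated (an assumption), and the names `cols`, `vals`, `cls` and `by_class` are hypothetical.

```r
# Sample annotation as held in gg@dataTable@columns (sample and metabolism only).
cols <- data.frame(
  sample = c("GSM2289", "GSM2294", "GSM2299", "GSM2304", "GSM2309",
             "GSM2313", "GSM2318", "GSM2323", "GSM2328", "GSM2333"),
  metabolism = rep(c("insulin resistant", "insulin sensitive"), each = 5),
  stringsAsFactors = FALSE
)

# Simulated values for one probe, named by sample and deliberately shuffled.
set.seed(2)
vals <- setNames(rnorm(10), sample(cols$sample))

# Look up each value's class by sample name rather than by position.
cls <- cols$metabolism[match(names(vals), cols$sample)]

# Split the values by class and compare the two groups.
by_class <- split(vals, cls)
res <- wilcox.test(by_class[["insulin resistant"]],
                   by_class[["insulin sensitive"]])
print(res$p.value)
```

Matching on sample names keeps the labels correct even if the value columns arrive in a different order from the annotation rows.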
>> On Tue, Mar 20, 2012 at 7:24 AM, Ovokeraye Achinike-Oduaran <
>> ovokeraye at gmail.com> wrote:
>>
>> > Hi all,
>> >
>> > I am not quite sure how to use the expression set I get from getGEO(),
>> > say gds157, in wilcox.test().
>> >
>> > Please help.
>> >
>> > Thanks.
>> >
>> > Avoks
>> >
>> > _______________________________________________
>> > Bioconductor mailing list
>> > Bioconductor at r-project.org
>> > https://stat.ethz.ch/mailman/listinfo/bioconductor
>> > Search the archives:
>> > http://news.gmane.org/gmane.science.biology.informatics.conductor
>> >
>>
>
>