[BioC] batch effects 450K

Teschendorff, Andrew a.teschendorff at ucl.ac.uk
Fri Jun 8 17:03:38 CEST 2012


Hi Femke,
For ComBat you do not need to specify a phenotype of interest; the batch variable alone is enough. See the original paper presenting ComBat for details.
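
A minimal sketch of what this could look like in R, assuming a current sva release in which ComBat() accepts a data matrix, a batch vector and a model matrix. The data below are simulated, and the names meth, pheno and mod0, as well as the even 28/28 batch split, are assumptions for illustration only and are not taken from the actual data set. An intercept-only model matrix stands in for "no phenotype of interest".

```r
library(sva)  # provides ComBat()

set.seed(1)
n_probes <- 200
batch <- rep(c(1, 2), each = 28)   # 56 samples in 2 batches (assumed even split)

# Simulated methylation values standing in for real 450K data (hypothetical)
meth <- matrix(rnorm(n_probes * 56), nrow = n_probes,
               dimnames = list(paste0("cg", seq_len(n_probes)),
                               paste0("S", seq_len(56))))
meth[, batch == 2] <- meth[, batch == 2] + 1.5   # artificial batch shift

pheno <- data.frame(batch = factor(batch))
mod0 <- model.matrix(~ 1, data = pheno)   # intercept only: no variable of interest

adjusted <- ComBat(dat = meth, batch = pheno$batch, mod = mod0)

# Difference between the batch means before and after adjustment;
# the shift should shrink towards 0 after ComBat
shift_before <- mean(meth[, batch == 2]) - mean(meth[, batch == 1])
shift_after  <- mean(adjusted[, batch == 2]) - mean(adjusted[, batch == 1])
```

If a biological covariate were available, it would go into the model matrix in place of the intercept-only design so that ComBat preserves it while adjusting for batch.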
rgds
A








***********************************************************************************************************************************************
Andrew E Teschendorff PhD
Heller Research Fellow
Statistical Cancer Genomics
Paul O'Gorman Building
UCL Cancer Institute
University College London
72 Huntley Street
London WC1E 6BT, UK.

Tel: +44 (0)20 7679 0727
Mob: +44 (0)7876 561263
Email: a.teschendorff at ucl.ac.uk
http://www.ucl.ac.uk/cancer/rescancerbiol/statisticalgenomics
********************************************************************************************************************************************



________________________________________
From: bioconductor-bounces at r-project.org [bioconductor-bounces at r-project.org] on behalf of Femke [guest] [guest at bioconductor.org]
Sent: 08 June 2012 15:44
To: bioconductor at r-project.org; f.simmer at ncmls.ru.nl
Subject: [BioC] batch effects 450K

Dear All,

I have Infinium 450K data for 56 breast cancer tumors. As a first analysis I wanted to cluster the samples and see how they are distributed, and I used the minfi package for this. Unfortunately, the assays were run in 2 batches and there is a clear batch effect. I looked into ComBat and SVA to remove it. As far as I understand, both approaches require a phenotype/variable of interest. In the tutorial ("The SVA package for removing batch effects and other unwanted variation in high-throughput experiments … Modified: October 24, 2011 Compiled: April 25, 2012") the variable of interest is cancer status. However, I do not have any normal samples. Does anyone have suggestions on how I should tackle these batch effects?

Many thanks in advance and all the best!

Femke


 -- output of sessionInfo():

R version 2.15.0 (2012-03-30)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] bladderbatch_1.0.3
 [2] sva_3.2.1
 [3] mgcv_1.7-17
 [4] corpcor_1.6.3
 [5] IlluminaHumanMethylation450kmanifest_0.2.1
 [6] gplots_2.10.1
 [7] KernSmooth_2.23-7
 [8] caTools_1.13
 [9] bitops_1.0-4.1
[10] gdata_2.8.2
[11] gtools_2.6.2
[12] minfi_1.2.0
[13] GenomicRanges_1.8.6
[14] IRanges_1.14.3
[15] reshape_0.8.4
[16] plyr_1.7.1
[17] lattice_0.20-6
[18] Biobase_2.16.0
[19] BiocGenerics_0.2.0

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.18.1  BiocInstaller_1.4.4   Biostrings_2.24.1
 [4] DBI_0.2-5             MASS_7.3-18           Matrix_1.0-6
 [7] R.methodsS3_1.2.2     RColorBrewer_1.0-5    RSQLite_0.11.1
[10] affyio_1.24.0         annotate_1.34.0       beanplot_1.1
[13] bit_1.1-8             codetools_0.2-8       crlmm_1.14.0
[16] ellipse_0.3-7         ff_2.2-7              foreach_1.4.0
[19] genefilter_1.38.0     iterators_1.0.6       limma_3.12.0
[22] matrixStats_0.5.0     mclust_3.4.11         multtest_2.12.0
[25] mvtnorm_0.9-9992      nlme_3.1-104          nor1mix_1.1-3
[28] oligoClasses_1.18.0   preprocessCore_1.18.0 siggenes_1.30.0
[31] splines_2.15.0        stats4_2.15.0         survival_2.36-14
[34] xtable_1.7-0          zlibbioc_1.2.0


--
Sent via the guest posting facility at bioconductor.org.

_______________________________________________
Bioconductor mailing list
Bioconductor at r-project.org
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

