[BioC] batch effects 450K
Teschendorff, Andrew
a.teschendorff at ucl.ac.uk
Fri Jun 8 17:03:38 CEST 2012
Hi Femke,
For ComBat you do not need to specify a phenotype of interest; only the batch labels are required. See the original paper presenting ComBat (Johnson, Li and Rabinovic, Biostatistics 2007).
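To make that concrete, here is a minimal sketch of running ComBat with an intercept-only design, so that no phenotype is modelled. It is an illustration under stated assumptions, not a tested pipeline for this data set: `mvals`, `batch` and `pheno` are hypothetical names, the matrix is simulated rather than real 450K data, and the `ComBat()` call is skipped if sva is not installed.

```r
# Simulated stand-in for the real data (an assumption for illustration):
# 200 probes x 56 tumours, run in 2 batches of 28.
set.seed(1)
n_probes <- 200
batch <- factor(rep(c("batch1", "batch2"), each = 28))
mvals <- matrix(rnorm(n_probes * 56), nrow = n_probes,
                dimnames = list(paste0("probe", seq_len(n_probes)),
                                paste0("tumour", seq_len(56))))

# Add an artificial shift to the second batch to mimic a batch effect.
mvals[, batch == "batch2"] <- mvals[, batch == "batch2"] + 1

# Intercept-only design matrix: one column of 1s, no variable of interest.
pheno <- data.frame(batch = batch, row.names = colnames(mvals))
mod <- model.matrix(~ 1, data = pheno)

# Run ComBat only when the sva package is available.
if (requireNamespace("sva", quietly = TRUE)) {
  adjusted <- sva::ComBat(dat = mvals, batch = batch, mod = mod)

  # Difference in mean signal between batches, before and after adjustment.
  print(diff(tapply(colMeans(mvals), batch, mean)))
  print(diff(tapply(colMeans(adjusted), batch, mean)))
}
```

The `mod` argument had to be supplied explicitly in older sva releases such as the 3.2.1 listed in the sessionInfo below, which is why the intercept-only matrix is built by hand here; later releases may allow it to be omitted, so check `?ComBat` for the installed version.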
rgds
A
***********************************************************************************************************************************************
Andrew E Teschendorff PhD
Heller Research Fellow
Statistical Cancer Genomics
Paul O'Gorman Building
UCL Cancer Institute
University College London
72 Huntley Street
London WC1E 6BT, UK.
Tel: +44 (0)20 7679 0727
Mob: +44 (0)7876 561263
Email: a.teschendorff at ucl.ac.uk
http://www.ucl.ac.uk/cancer/rescancerbiol/statisticalgenomics
********************************************************************************************************************************************
________________________________________
From: bioconductor-bounces at r-project.org [bioconductor-bounces at r-project.org] on behalf of Femke [guest] [guest at bioconductor.org]
Sent: 08 June 2012 15:44
To: bioconductor at r-project.org; f.simmer at ncmls.ru.nl
Subject: [BioC] batch effects 450K
Dear All,
I have Infinium 450K data for 56 breast cancer tumors. As a first analysis I wanted to cluster the samples and look at how they are distributed. For this I used the minfi package. Unfortunately, the assays were done in 2 batches and there is a clear batch effect.
I looked into ComBat and SVA to remove the batch effect. As far as I understand, to use these approaches I need to have a phenotype/variable of interest. In the tutorial ("The SVA package for removing batch effects and other unwanted variation in high-throughput experiments … Modified: October 24, 2011 Compiled: April 25, 2012") the variable of interest is cancer status. However, I do not have normal samples. Does anyone have suggestions on how I should tackle these batch effects?
Many thanks in advance and all the best!
Femke
-- output of sessionInfo():
R version 2.15.0 (2012-03-30)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] C
attached base packages:
[1] grid stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] bladderbatch_1.0.3
[2] sva_3.2.1
[3] mgcv_1.7-17
[4] corpcor_1.6.3
[5] IlluminaHumanMethylation450kmanifest_0.2.1
[6] gplots_2.10.1
[7] KernSmooth_2.23-7
[8] caTools_1.13
[9] bitops_1.0-4.1
[10] gdata_2.8.2
[11] gtools_2.6.2
[12] minfi_1.2.0
[13] GenomicRanges_1.8.6
[14] IRanges_1.14.3
[15] reshape_0.8.4
[16] plyr_1.7.1
[17] lattice_0.20-6
[18] Biobase_2.16.0
[19] BiocGenerics_0.2.0
loaded via a namespace (and not attached):
[1] AnnotationDbi_1.18.1 BiocInstaller_1.4.4 Biostrings_2.24.1
[4] DBI_0.2-5 MASS_7.3-18 Matrix_1.0-6
[7] R.methodsS3_1.2.2 RColorBrewer_1.0-5 RSQLite_0.11.1
[10] affyio_1.24.0 annotate_1.34.0 beanplot_1.1
[13] bit_1.1-8 codetools_0.2-8 crlmm_1.14.0
[16] ellipse_0.3-7 ff_2.2-7 foreach_1.4.0
[19] genefilter_1.38.0 iterators_1.0.6 limma_3.12.0
[22] matrixStats_0.5.0 mclust_3.4.11 multtest_2.12.0
[25] mvtnorm_0.9-9992 nlme_3.1-104 nor1mix_1.1-3
[28] oligoClasses_1.18.0 preprocessCore_1.18.0 siggenes_1.30.0
[31] splines_2.15.0 stats4_2.15.0 survival_2.36-14
[34] xtable_1.7-0 zlibbioc_1.2.0
--
Sent via the guest posting facility at bioconductor.org.