[BioC] Need help: no MTC possible

James W. MacDonald jmacdon at uw.edu
Tue Oct 16 18:07:33 CEST 2012


Hi Suparna,

On 10/16/2012 11:57 AM, suparna mitra wrote:
> Dear James,
>  Thanks for the support. But after doing these step also still no 
> significant genes (see attached ven diagram as all 0). I realize my 
> data is very variable. But isn't there any fix?

As disheartening as this is, there is no magic bullet to 'fix' your results.

You should note two things, however. First, there is a difference 
between statistical difference and no differences. There may be some 
genes in the top N that your collaborators might be interested in, given 
the goals of the experiment. Remember that a microarray experiment isn't 
the end - it is the beginning. You still have some hints here that might 
end up being validated, and could make a compelling story.

Second, just because individual gene comparisons are not working out 
doesn't mean that there aren't sets of genes that are significant. You 
can always do gene set analyses (as I already recommended, IIRC). Note 
that there are literally thousands of gene sets these days, so you might 
be better off selecting a few that are biologically plausible rather 
than just trying them all. See ?romer or ?roast.

Best,

Jim


> Thanks a lot,
> Suparna.
>
> > topTable(fit2, coef = 1, adjust = "fdr")
>
>            ID      logFC   AveExpr         t      P.Value adj.P.Val   
>        B
>
> 6238  7917530  0.6251124 11.170012  5.592012 2.536230e-05 0.6518774 
> -0.0171572
>
> 11556 7970507  0.9123944  7.490579  5.057525 7.969967e-05 0.6518774 
> -0.5140431
>
> 15234 8007228  0.6400697  9.710164  4.854888 1.239777e-04 0.6518774 
> -0.7143350
>
> 8819  7943047 -0.3189082  4.177681 -4.755607 1.541497e-04 0.6518774 
> -0.8148014
>
> 9675  7950951 -0.3189082  4.177681 -4.755607 1.541497e-04 0.6518774 
> -0.8148014
>
> 18889 8043581 -0.3189082  4.177681 -4.755607 1.541497e-04 0.6518774 
> -0.8148014
>
> 19899 8053785 -0.3189082  4.177681 -4.755607 1.541497e-04 0.6518774 
> -0.8148014
>
> 6239  7917532  0.7207256 10.449577  4.677368 1.831220e-04 0.6518774 
> -0.8950366
>
> 25845 8113130  0.6264274  8.957173  4.640816 1.984975e-04 0.6518774 
> -0.9328368
>
> 3759  7896284 -0.8301988  5.727302 -4.616141 2.096126e-04 0.6518774 
> -0.9584680
>
> > topTable(fit2, coef = 2, adjust = "fdr")
>
>            ID      logFC   AveExpr         t      P.Value adj.P.Val   
>       B
>
> 2088  7894602  0.5606499  2.841855  5.401484 3.800711e-05 0.9778984 
> -1.825861
>
> 685   7893190 -0.5281344  6.726990 -4.966811 9.708481e-05 0.9778984 
> -2.059423
>
> 6238  7917530 -0.5550876 11.170012 -4.786129 1.441524e-04 0.9778984 
> -2.162851
>
> 621   7893126 -0.6332961  4.412764 -4.785753 1.442714e-04 0.9778984 
> -2.163070
>
> 26642 8120756 -1.2198288  5.439265 -4.615075 2.101065e-04 0.9778984 
> -2.264206
>
> 1687  7894197 -1.0441762  2.631359 -4.526834 2.553958e-04 0.9778984 
> -2.317791
>
> 20947 8065084 -0.4297158  6.630412 -4.274936 4.470448e-04 0.9778984 
> -2.475534
>
> 154   7892657  0.9444466  3.997249  4.150578 5.900236e-04 0.9778984 
> -2.555951
>
> 20151 8056222 -0.8638926  7.892249 -4.144942 5.974996e-04 0.9778984 
> -2.559635
>
> 7851  7933619  0.4101773  8.480778  4.128728 6.195398e-04 0.9778984 
> -2.570249
>
> > topTable(fit2, coef = 3, adjust = "fdr")
>
>            ID      logFC  AveExpr         t      P.Value adj.P.Val     
>     B
>
> 6210  7917182 -0.3334646 3.273225 -5.847896 1.483281e-05 0.2755645 
> 2.2740621
>
> 27812 8132245 -0.5028082 5.409405 -5.795271 1.655191e-05 0.2755645 
> 2.1981127
>
> 2366  7894884  0.6507323 8.436001  5.335322 4.378172e-05 0.3234851 
> 1.5133904
>
> 26802 8122099 -0.4655070 4.548920 -5.279910 4.930640e-05 0.3234851 
> 1.4284277
>
> 587   7893092 -1.0604644 6.013864 -5.143614 6.613900e-05 0.3234851 
> 1.2172775
>
> 2562  7895081  0.6962641 6.898546  4.999306 9.045119e-05 0.3234851 
> 0.9904391
>
> 867   7893372  1.2334593 3.017891  4.971552 9.608676e-05 0.3234851 
> 0.9464374
>
> 685   7893190 -0.5196216 6.726990 -4.948033 1.011423e-04 0.3234851 
> 0.9090560
>
> 808   7893313  0.9743437 7.938503  4.893437 1.139496e-04 0.3234851 
> 0.8219582
>
> 15234 8007228  0.6486240 9.710164  4.801424 1.393931e-04 0.3234851 
> 0.6741646
>
> >
>
> > results <- decideTests(fit2)
>
> >
>
> > vennDiagram(results)
>
>
> On 16 October 2012 14:48, James W. MacDonald <jmacdon at uw.edu 
> <mailto:jmacdon at uw.edu>> wrote:
>
>     Hi Suparna,
>
>
>     On 10/16/2012 5:58 AM, suparna mitra wrote:
>
>         Hello group,
>         Related to my previous post, I further tried arrayweight as:
>
>         > f.invivo <- factor(InVivoTargets$Treatment, levels = c("A",
>         "R", "T"))
>
>         > design.invivo <- model.matrix(~0 + f.invivo)
>
>         > colnames(design.invivo) <- c("A", "R", "T")
>
>         > design.invivo
>
>            A R T
>
>         1  1 0 0
>
>         2  1 0 0
>
>         3  1 0 0
>
>         4  1 0 0
>
>         5  1 0 0
>
>         6  1 0 0
>
>         7  0 1 0
>
>         8  0 1 0
>
>         9  0 1 0
>
>         10 0 1 0
>
>         11 0 1 0
>
>         12 0 1 0
>
>         13 0 0 1
>
>         14 0 0 1
>
>         15 0 0 1
>
>         16 0 0 1
>
>         17 0 0 1
>
>         18 0 0 1
>
>         attr(,"assign")
>
>         [1] 1 1 1
>
>         attr(,"contrasts")
>
>         attr(,"contrasts")$f.invivo
>
>         [1] "contr.treatment"
>
>
>         >
>
>         > arrayw <- arrayWeightsSimple(rmaOligoinvivo, design.invivo)
>
>         > fit <- lmFit(rmaOligoinvivo, design.invivo, weights=arrayw)
>
>         > arrayw
>
>                 1         2         3         4         5         6  
>               7         8         9        10        11        12    
>            13        14
>
>         0.3749711 0.8578285 1.9289731 1.2390065 0.8116796 1.7846502
>         1.0741852 1.4277605 0.6533368 0.7637412 1.2647738 1.4520790
>         0.8309346 0.9328655
>
>                15        16        17        18
>
>         1.1926458 0.7280477 0.5130294 1.8503073
>
>         > contrast.matrix.invivo <- makeContrasts(R-A, T-R, T-A,levels
>         = design.invivo)
>
>         > fit2<-contrasts.fit(fit, contrast.matrix.invivo)
>
>         > fit2 <- eBayes(fit2)
>
>
>     Looks good to me.
>
>     Best,
>
>     Jim
>
>
>         >
>
>         Can anybody please suggest if I am doing it right? Actually
>         being new in this I am bit afraid to make errors.
>         Thanks,
>         Suparna.
>
>         On 16 October 2012 10:36, suparna mitra
>         <smitra at liverpool.ac.uk <mailto:smitra at liverpool.ac.uk>
>         <mailto:smitra at liverpool.ac.uk
>         <mailto:smitra at liverpool.ac.uk>>> wrote:
>
>             Dear James,
>               Thanks for your suggestion. I was reading arrayWeights
>         package
>             in limma.
>             But being novice in bioC I have one confusion. Should I
>             perform arrayWeights on normalized (rmaOligo) expression
>         data or
>             on the raw data?
>
>             This is what i have done so far:
>
>         > sessionInfo()
>             R version 2.15.1 (2012-06-22)
>             Platform: i386-apple-darwin9.8.0/i386 (32-bit)
>
>             locale:
>             [1]
>         en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
>
>             attached base packages:
>             [1] stats     graphics  grDevices utils     datasets
>          methods   base
>
>             other attached packages:
>              [1] statmod_1.4.15              limma_3.12.1            
>               annotate_1.34.1             hugene10stprobeset.db_8.0.1
>             org.Hs.eg.db_2.7.1
>              [6] BiocInstaller_1.4.7         affycoretools_1.28.0    
>               KEGG.db_2.7.1               GO.db_2.7.1                
>            AnnotationDbi_1.18.1
>             [11] affy_1.34.0                 Biobase_2.16.0          
>               BiocGenerics_0.2.0          pd.hugene.1.0.st.v1_3.6.0  
>            RSQLite_0.11.1
>             [16] DBI_0.2-5                   oligo_1.20.4            
>               oligoClasses_1.18.0
>
>
>              rmaOligoinvivo = oligo::rma(InVivodat1)
>             Background correcting
>             Normalizing
>             Calculating Expression
>
>         > maplot(rmaOligoinvivo)
>         >hist(rmaOligoinvivo)
>         >
>         InVivoTargets=readTargets("~/Desktop/Recent/Liverpool-work-related/Micro_RawData/InVivoTargets.txt")
>         > InVivoTargets
>              FileName Treatment
>             1   MC1       A
>             2   MC2       A
>             3   MC3       A
>             4   MC4       A
>             5   MC5       A
>             6   MC6       A
>             7   MC7       R
>             8   MC8        R
>             9   MC9        R
>             10 MC10        R
>             11 MC11        R
>             12 MC12        R
>             13 MC13       T
>             14 MC14        T
>             15 MC15        T
>             16 MC16        T
>             17 MC17        T
>             18 MC18        T
>
>             f.invivo <- factor(InVivoTargets$Treatment, levels =
>         c("A", "R", "T"))
>
>             design.invivo <- model.matrix(~0 + f.invivo)
>
>         > colnames(design.invivo) <- c("A", "R", "T")
>
>         > fit.invivo <- lmFit(rmaOligoinvivo, design.invivo)
>
>         > contrast.matrix.invivo <- makeContrasts(R-A, T-R, T-A,levels =
>             design.invivo)
>
>         > fit2.invivo <- contrasts.fit(fit.invivo, contrast.matrix.invivo)
>
>         > fit2.invivo <-eBayes(fit2.invivo)
>
>             Thanks a lot,
>             Suparna.
>
>
>             On 15 October 2012 14:33, James W. MacDonald
>         <jmacdon at uw.edu <mailto:jmacdon at uw.edu>
>         <mailto:jmacdon at uw.edu <mailto:jmacdon at uw.edu>>> wrote:
>
>                 Hi Suparna,
>
>
>                 On 10/15/2012 7:01 AM, suparna mitra wrote:
>
>                     Hi all,
>                        I have been working in a project where I have
>                     Affymetrix Hgene 1.0 St V1
>                     data. And I have tree groups of patients having 6
>         samples
>                     each. I tried to
>                     perform rma normalization and to filter my data
>         based on
>                     expression values
>                     20%. After that went for unpaired t-test to test
>         each two
>                     combination of
>                     groups. But the problem is my data is extremely
>         variable.
>                     I have tried to filter my genes based on variance
>         and/or
>                     CV before testing,
>                     to try to reduce the number of genes entering your
>         test
>                     and multiple
>                     correction.  But with different reasonable
>         filtering also
>                     I am with no
>                     luck. And I don't have the option to increase
>         sample size
>                     of my project.
>                     Further I tried to check for the bad samples and bad
>                     probes from
>                     experimentand remove outlier if these are not of
>         interest.
>                     Still the same
>                     when run t-test (and other possible test like
>                     Mann-Whitney) with MTC there
>                     are no genes.
>                     On the other hand if I go on with out MTC and select a
>                     good p value cutoff
>                     and reasonable fold change I get a list of significant
>                     gene which may be
>                     good or reasonable for my study. but the problem is I
>                     somehow need to
>                     justify the method for my finding. Do you know any
>         study
>                     or paper where
>                     anybody has treated their data without MTC?
>                     My main concern is if I find a good story matching
>                     biological prospective,
>                     would it be anyhow possible to justify the method
>         without MTC?
>
>
>                 It's not clear to me what you are doing here - when
>         you filter
>                 on variance are you keeping or removing the high
>         variability
>                 genes (keeping, I hope)? I am also not sure what MTC
>         stands
>                 for - is this multiple test correction?
>
>                 Anyway, assuming I have things correct, some suggestions.
>                 First, you might want to use array weights when
>         fitting your
>                 model. If you have a lot of intra-group variability,
>         this will
>                 tend to help.
>
>                 Second, the t-statistic is the universally most
>         powerful test
>                 (assuming the underlying data are relatively
>         hump-shaped), so
>                 going to a non-parametric test will usually reduce
>         rather than
>                 increase power to detect differences.
>
>                 Third, univariate tests are arguably not the most
>                 sophisticated way of analyzing expression data, and
>         you might
>                 get better (or at least more satisfactory) results if you
>                 instead looked at analyzing for groups of genes rather
>         than
>                 individually.
>
>                 Depending on your experiment, you could accomplish
>         this task
>                 with a gene set analysis (there are multiple ways of doing
>                 this - perhaps the easiest being romer() and roast() in
>                 limma), or if you have phenotypic data, especially
>         continuous
>                 measures, a WGCNA analysis might be of some use.
>
>                 Best,
>
>                 Jim
>
>
>                     Thanks a lot,
>                     Suparna.
>
>                             [[alternative HTML version deleted]]
>
>                     _______________________________________________
>                     Bioconductor mailing list
>         Bioconductor at r-project.org <mailto:Bioconductor at r-project.org>
>         <mailto:Bioconductor at r-project.org
>         <mailto:Bioconductor at r-project.org>>
>
>         https://stat.ethz.ch/mailman/listinfo/bioconductor
>                     Search the archives:
>         http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
>                 --         James W. MacDonald, M.S.
>                 Biostatistician
>                 University of Washington
>                 Environmental and Occupational Health Sciences
>                 4225 Roosevelt Way NE, # 100
>                 Seattle WA 98105-6099
>
>
>
>
>             --     Dr. Suparna Mitra
>             Wolfson Centre for Personalised Medicine
>             Department of Molecular and Clinical Pharmacology
>             Institute of Translational Medicine University of Liverpool
>             Block A: Waterhouse Buildings,  L69 3GL Liverpool
>
>             Tel. +44 (0)151 795 5394
>         <tel:%2B44%20%280%29151%20795%205394>
>         <tel:%2B44%20%280%29151%20795%205394>,
>             Internal ext: 55394
>             M: +44 (0) 7511387895 <tel:%2B44%20%280%29%207511387895>
>         <tel:%2B44%20%280%29%207511387895>
>             Email id: smitra at liverpool.ac.uk
>         <mailto:smitra at liverpool.ac.uk> <mailto:smitra at liverpool.ac.uk
>         <mailto:smitra at liverpool.ac.uk>>
>             Alternative Email id: suparna.mitra.sm at gmail.com
>         <mailto:suparna.mitra.sm at gmail.com>
>         <mailto:suparna.mitra.sm at gmail.com
>         <mailto:suparna.mitra.sm at gmail.com>>
>
>
>
>
>
>         -- 
>         Dr. Suparna Mitra
>         Wolfson Centre for Personalised Medicine
>         Department of Molecular and Clinical Pharmacology
>         Institute of Translational Medicine University of Liverpool
>         Block A: Waterhouse Buildings,  L69 3GL Liverpool
>
>         Tel. +44 (0)151 795 5394
>         <tel:%2B44%20%280%29151%20795%205394>, Internal ext: 55394
>         M: +44 (0) 7511387895 <tel:%2B44%20%280%29%207511387895>
>         Email id: smitra at liverpool.ac.uk
>         <mailto:smitra at liverpool.ac.uk> <mailto:smitra at liverpool.ac.uk
>         <mailto:smitra at liverpool.ac.uk>>
>         Alternative Email id: suparna.mitra.sm at gmail.com
>         <mailto:suparna.mitra.sm at gmail.com>
>         <mailto:suparna.mitra.sm at gmail.com
>         <mailto:suparna.mitra.sm at gmail.com>>
>
>
>     -- 
>     James W. MacDonald, M.S.
>     Biostatistician
>     University of Washington
>     Environmental and Occupational Health Sciences
>     4225 Roosevelt Way NE, # 100
>     Seattle WA 98105-6099
>
>
>
>
> -- 
> Dr. Suparna Mitra
> Wolfson Centre for Personalised Medicine
> Department of Molecular and Clinical Pharmacology
> Institute of Translational Medicine University of Liverpool
> Block A: Waterhouse Buildings,  L69 3GL Liverpool
>
> Tel.  +44 (0)151 795 5394, Internal ext: 55394
> M: +44 (0) 7511387895
> Email id: smitra at liverpool.ac.uk <mailto:smitra at liverpool.ac.uk>
> Alternative Email id: suparna.mitra.sm at gmail.com 
> <mailto:suparna.mitra.sm at gmail.com>
>

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099



More information about the Bioconductor mailing list