[BioC] [R] help with linear model
Petr PIKAL
petr.pikal at precheza.cz
Mon Oct 26 12:31:27 CET 2009
r-help-bounces at r-project.org wrote on 26.10.2009 11:31:26:
> Thank you all for your replies. I have tried transposing my data before
> but I did not mention it because I was getting the same error. In the
> present case though it worked because I put
> >lm1=lm(norm~.,data=t(data))
> instead of
> >lm1=lm(fm1, data=t(data))
> where fm1=norm~cols...
There should not be any difference. I suspect that your formula definition
has superfluous commas and/or that t(data) changes the names: you expect a
name such as 206427_s_at, but that cannot be a valid name.
Look at
head(t(data))
to see how the names are changed. You need to change your formula to match
those names.
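
As a minimal sketch of the naming issue (toy random numbers, not the posted values; the object names wide, d, cols and rhs are made up for illustration): a probe ID such as 206427_s_at starts with a digit, so it is not a syntactic R name and has to be wrapped in backticks inside a formula. Writing it in double quotes makes it a character constant rather than a variable, which is presumably what leads to the ExtractVars error, and norm ~ cols treats cols as one single variable instead of expanding it.

```r
# Toy stand-in for the posted data: probes in rows, samples in columns.
set.seed(1)
cols <- c("206427_s_at", "205338_s_at")
wide <- rbind(norm = runif(12),
              matrix(rnorm(24, mean = 10, sd = 3), nrow = 2,
                     dimnames = list(cols, paste0("GSM", 1:12, ".CEL"))))

# Transpose so that each column is a variable; as.data.frame() on a matrix
# keeps the column names as they are.
d <- as.data.frame(t(wide))
names(d)

# What R turns the names into when asked to make them syntactic.
make.names(names(d))

# norm ~ cols looks for ONE variable called 'cols'; here that is a character
# vector of length 2, not 12 observations, so the fit fails.
bad <- try(lm(norm ~ cols, data = d), silent = TRUE)
inherits(bad, "try-error")

# Build the formula from the names, protecting each one with backticks.
rhs <- paste(sprintf("`%s`", cols), collapse = " + ")
fm1 <- as.formula(paste("norm ~", rhs))
lm1 <- lm(fm1, data = d)

# The stored formula and norm ~ . describe the same model here.
lm2 <- lm(norm ~ ., data = d)
all.equal(unname(coef(lm1)), unname(coef(lm2)))
```

With a formula built this way, lm(fm1, data = d) and lm(norm ~ ., data = d) should give the same coefficients, which is the sense in which there should be no difference between the two calls.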
Regards
Petr
> I actually didn't know that there exists such a difference between
> norm~cols and norm~.
> I wonder why...
>
> Thank you all again!
> Best,
> Eleni
>
> On Mon, Oct 26, 2009 at 12:24 PM, Petr PIKAL <petr.pikal at precheza.cz> wrote:
>
> > Hi
> >
> >
> > r-help-bounces at r-project.org wrote on 26.10.2009 10:48:51:
> >
> > > Dear list,
> > >
> > > I have been searching for a week to fit a simple linear model to my
> > > data. I have looked into the previous posts but I haven't found anything
> > > relevant to my problem. I guess it is something simple...I just cannot see it.
> > > I have the following data frame, named "data", which is a subset of a
> > > microarray experiment. The columns are the samples and the rows are the
> > > probes. I bound the first line, called "norm", which represents the
> > > estimated output. I want to create a linear model which shows the
> > > relationship between the gene expressions (rows) and the output (norm).
> > >
> > > *data*
> > > GSM276723.CEL GSM276724.CEL GSM276725.CEL GSM276726.CEL
> > > norm 0.897000 0.590000 0.683000 0.949000
> > > 206427_s_at 5.387205 6.036506 8.824783 10.864122
> > > 205338_s_at 6.454779 13.143095 6.123212 12.726562
> > > 209848_s_at 6.703062 7.783330 12.175654 9.339651
> > > 205694_at 5.894131 5.794516 12.876555 11.534664
> > > 201909_at 12.616538 12.913255 12.275182 12.767743
> > > 208894_at 13.049286 9.317874 12.873516 13.527182
> > > 216512_s_at 6.324789 12.783791 6.216932 12.013404
> > > 205337_at 6.175940 12.158796 6.117519 12.041078
> > > 201850_at 6.633013 6.465900 6.535434 7.749985
> > > 210982_s_at 12.444791 8.597388 12.197696 12.963449
> > > GSM276727.CEL GSM276728.CEL GSM276729.CEL GSM276731.CEL
> > > norm 0.302000 0.597000 0.270000 0.530000
> > > 206427_s_at 5.690357 8.014055 13.034753 5.493977
> > > 205338_s_at 5.757048 7.706341 13.258410 5.562588
> > > 209848_s_at 6.461028 7.036515 13.633649 5.874098
> > > 205694_at 5.519552 5.297107 6.498811 5.146150
> > > 201909_at 12.814454 11.592632 6.594229 6.650796
> > > 208894_at 13.835359 13.028096 5.839909 6.045578
> > > 216512_s_at 6.033096 7.273650 12.669054 5.946932
> > > 205337_at 5.879028 7.381713 12.633829 5.379559
> > > 201850_at 9.684397 6.560014 8.523229 6.573052
> > > 210982_s_at 13.342729 12.470517 5.903681 5.658115
> > > GSM276732.CEL GSM276735.CEL GSM276736.CEL GSM276737.CEL
> > > norm 0.43400 0.647000 0.113000 1.000000
> > > 206427_s_at 12.80257 5.645002 6.519554 13.572480
> > > 205338_s_at 13.38057 5.804107 11.090690 14.024922
> > > 209848_s_at 13.27718 6.490851 9.784199 14.101162
> > > 205694_at 11.37717 5.802105 7.944963 14.060492
> > > 201909_at 13.24126 12.263899 12.578315 6.443491
> > > 208894_at 12.29916 7.563361 9.971493 7.094214
> > > 216512_s_at 13.00303 5.905789 10.512761 13.647573
> > > 205337_at 12.63560 5.430138 10.707242 13.020312
> > > 201850_at 12.71874 6.275480 6.987962 12.354580
> > > 210982_s_at 11.53559 7.225199 9.322706 6.617615
> > > GSM276738.CEL GSM276739.CEL GSM276740.CEL GSM276742.CEL
> > > norm 0.35700 0.967000 0.823000 1.000000
> > > 206427_s_at 13.33764 13.607918 13.190551 12.387189
> > > 205338_s_at 13.65492 12.812950 12.237476 12.912605
> > > 209848_s_at 13.48525 13.435389 13.851347 12.540495
> > > 205694_at 7.70928 10.045331 13.391456 11.103841
> > > 201909_at 12.47093 11.937344 6.631023 7.160071
> > > 208894_at 12.20508 8.892181 6.478889 5.927860
> > > 216512_s_at 13.42313 12.151691 11.620552 12.341763
> > > 205337_at 12.67544 12.036528 11.641203 12.275845
> > > 201850_at 11.85481 13.172666 12.964316 12.156142
> > > 210982_s_at 11.49940 8.380404 6.121762 5.921634
> > > GSM276743.CEL GSM276744.CEL GSM276745.CEL GSM276747.CEL
> > > norm 0.899000 0.927000 0.754000 0.437000
> > > 206427_s_at 12.665097 12.604673 11.446630 13.000295
> > > 205338_s_at 13.261141 12.448096 13.185698 12.510952
> > > 209848_s_at 13.396711 13.882529 13.040600 12.984137
> > > 205694_at 10.888474 7.094063 8.630120 12.321685
> > > 201909_at 12.100560 6.666787 12.330600 6.572282
> > > 208894_at 7.741437 8.348155 10.106442 6.009902
> > > 216512_s_at 12.830373 11.504074 12.300163 11.525958
> > > 205337_at 12.264569 11.676281 11.940917 11.618351
> > > 201850_at 11.055564 12.202366 7.327056 12.853055
> > > 210982_s_at 7.285289 8.129298 9.577032 5.924993
> > > GSM276748.CEL GSM276752.CEL GSM276754.CEL GSM276756.CEL
> > > norm 0.321000 0.620000 0.155000 0.946000
> > > 206427_s_at 9.081283 11.446978 8.191261 13.192507
> > > 205338_s_at 13.737773 13.698520 12.983830 10.948681
> > > 209848_s_at 13.234025 12.956672 10.644642 13.176656
> > > 205694_at 7.953865 7.397013 7.170732 13.618932
> > > 201909_at 12.533684 7.049442 6.804030 7.135974
> > > 208894_at 11.868729 8.558455 6.629858 6.850639
> > > 216512_s_at 13.589290 12.781853 12.060414 10.143297
> > > 205337_at 13.084386 12.442617 12.104849 10.364035
> > > 201850_at 6.615453 8.104145 7.058739 6.514298
> > > 210982_s_at 11.058085 7.891520 6.516261 6.532226
> > > GSM276758.CEL GSM276759.CEL
> > > norm 0.767000 0.218000
> > > 206427_s_at 5.742074 11.232337
> > > 205338_s_at 6.375289 13.406557
> > > 209848_s_at 6.226996 6.835458
> > > 205694_at 5.864042 11.218719
> > > 201909_at 6.907489 7.316435
> > > 208894_at 12.596987 12.408412
> > > 216512_s_at 6.308256 12.318892
> > > 205337_at 6.063775 12.389912
> > > 201850_at 6.816491 6.602764
> > > 210982_s_at 11.985288 11.853911
> > >
> > > *What I did is the following:*
> > > >fm1=as.formula((norm) ~ "206427_s_at" + "205338_s_at" + "209848_s_at" +
> > > "205694_at" + "201909_at" + "208894_at" + "216512_s_at" + "205337_at" +
> > > "201850_at" + "210982_s_at")
> > > >lm1=lm(fm1,data1new)
> > >
> > > And I receive the following error:
> > > Error in terms.formula(formula, data = data) :
> > > invalid model formula in ExtractVars
> > >
> > >
> > > *I have also tried:*
> > > >cols=rownames(data3) %%%%Where data3 is the same data frame as data
> > > above, but without the "norm" row bound yet
> > > thus: > cols
> > > [1] "206427_s_at" "205338_s_at" "209848_s_at" "205694_at" "201909_at"
> > > [6] "208894_at" "216512_s_at" "205337_at" "201850_at" "210982_s_at"
> > >
> > > > lm1=lm(fm1,data1new)
> > >
> > > and in this case I receive the following error:
> > > Error in model.frame.default(formula = fm1, data = data1new,
> > > drop.unused.levels = TRUE) :
> > > variable lengths differ (found for 'cols')
> > >
> > > Could anyone help me with this?
> >
> > The usual expectation is that data are arranged columnwise: each column is
> > a variable and each row is an observation. So you should transform your
> > data to this form, e.g. by
> >
> > t(yourdata).
> >
> > Another possible issue is whether your data are really numeric, which you
> > can test with
> >
> > str(yourdata)
> >
> > which shows the structure of your data.
> > If everything is OK, then
> >
> > lm(norm ~ . , data = data1new) should produce a linear model of norm on
> > all other columns in the data frame data1new.
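
The steps above can be sketched roughly as follows. This uses made-up random numbers with only three probes and twelve samples, so the fitted values mean nothing; the names yourdata and data1new are taken from the advice above, while probes and samples are just illustration.

```r
# Toy data in the posted layout: probes (and norm) in rows, samples in columns.
set.seed(42)
probes  <- c("206427_s_at", "205338_s_at", "205694_at")
samples <- paste0("GSM", 1:12, ".CEL")
yourdata <- rbind(
  norm = runif(12),
  matrix(rnorm(36, mean = 10, sd = 3), nrow = 3,
         dimnames = list(probes, samples))
)

# Step 1: transpose so that each column is a variable and each row a sample,
# then convert the resulting matrix to a data frame.
data1new <- as.data.frame(t(yourdata))

# Step 2: check that every column really is numeric.
str(data1new)
stopifnot(all(sapply(data1new, is.numeric)))

# Step 3: regress norm on all other columns.
lm1 <- lm(norm ~ ., data = data1new)
coef(lm1)
```

The conversion with as.data.frame() is a choice made for this sketch: t() returns a matrix, and handing lm() a data frame avoids relying on how a matrix is treated in the data argument.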
> >
> > Regards
> > Petr
> >
> > >
> > > Thank you very much in advance,
> > > Eleni
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> >
>