[R] Ambiguities in vector
Gavin Simpson
gavin.simpson at ucl.ac.uk
Mon Oct 8 15:48:24 CEST 2007
On Mon, 2007-10-08 at 15:35 +0200, Birgit Lemcke wrote:
> Hello James,
>
> all of your suggestions work very well except for this:
>
> FemMal <- cbind(FemV1gezählt[2,], MalV1gezählt[2,])
>
> colnames(FemMal) <- ("Females", "Males")
> Error: syntax error
The OP missed out c() above, hence the syntax error.
> colnames(FemMal) <- c("Females", "Males")
> FemMal
  Females Males
1     133    79
2     203   237
3      51    76
> But it works if I do that:
>
> Namen<-c("Female","Male")
> colnames(FemMal) <- (Namen)
^ ^
This is a bit redundant, unless you actually need Namen for something
else. You also don't need the "(" ")" around Namen in the second line
there.
G
> FemMal
>
>   Female Male
> 1    133   79
> 2    203  237
> 3     51   76
>
> Greetings
>
> Birgit
>
>
>
> On 04.10.2007 at 17:19, James Reilly wrote:
>
> >
> > Hi Birgit,
> >
> > First, can I suggest that you don't copy off-list conversations to
> > the mailing list partway through? Not that I minded in this case,
> > but it probably confuses people and the posting guide warns against
> > it.
> >
> > I'll address your questions in reverse order.
> >
> > To get tables for each column, try:
> > apply(FemV1Test, 2, table)
> >
> > Likewise for males:
> > apply(MalV1, 2, table)
> >
> > To compare them, perhaps put them side by side:
> > FemMal <- cbind(apply(FemV1Test, 2, table)[2,], apply(MalV1, 2,
> > table)[2,])
> > colnames(FemMal) <- ("Females", "Males")
> > FemMal
> >
> > You can then do arithmetic, plot them, sort by the difference, etc.
> > plot(FemMal)
> > FemMal[order(FemMal[,1]-FemMal[,2]),]
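A side note on the apply(..., 2, table)[2,] step: apply() only returns a
matrix when every column's table has the same length, so the [2,] index
fails if some column is all 0s or all 1s. For 0/1 data the column sums give
the same counts without that dependency. A minimal sketch, assuming made-up
dummy matrices (FemDummy and MalDummy are hypothetical stand-ins for
FemV1Test and MalV1):

```r
# Hypothetical 0/1 dummy matrices (rows = species, columns = attribute levels);
# FemDummy and MalDummy stand in for FemV1Test and MalV1
FemDummy <- cbind("1" = c(1, 1, 0, 0), "2" = c(0, 1, 1, 0), "3" = c(0, 0, 1, 1))
MalDummy <- cbind("1" = c(1, 0, 0, 0), "2" = c(1, 1, 1, 0), "3" = c(0, 1, 0, 0))

# For 0/1 data the column sums are the counts of 1s, i.e. row "1" of the tables.
# Unlike apply(x, 2, table)[2, ], this does not depend on every column
# containing both a 0 and a 1.
FemMal <- cbind(Females = colSums(FemDummy), Males = colSums(MalDummy))
FemMal

# Sort attribute levels by the female-minus-male difference
FemMalSorted <- FemMal[order(FemMal[, 1] - FemMal[, 2]), ]
FemMalSorted
```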
> >
> > About crossprod, cell (i,j) in the resulting matrix shows the
> > number of cases with a 1 for attribute i and attribute j. This
> > shows which attributes overlap most and least.
> >
> > The command "tab <- tab - diag(diag(tab))" puts zeroes down the
> > diagonal, as was requested. One cosmetic reason for doing this is
> > that the diagonal elements are often much larger than the off-
> > diagonal ones, and zeroing them makes the table easier to read or
> > display graphically. E.g.
> > http://pbil.univ-lyon1.fr/ADE-4/ade4-html/table.dist.html
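To make the crossprod() description concrete, here is a small sketch using
the ratings example quoted further down in this thread. The names X and tab0
are introduced here only for illustration:

```r
# The ratings example from earlier in this thread
ratings <- data.frame(id   = c(1, 2, 3, 4),
                      att1 = c(1, 1, 0, 1),
                      att2 = c(1, 0, 0, 1),
                      att3 = c(0, 1, 1, 1))
X <- as.matrix(ratings[, -1])

# crossprod(X) is t(X) %*% X: cell (i, j) counts rows with a 1 in both columns
tab <- crossprod(X)
# The diagonal holds the number of 1s in each attribute
diag(tab)

# Zero the diagonal; diag(diag(tab)) builds a diagonal matrix from the diagonal
tab0 <- tab - diag(diag(tab))
tab0
```

The off-diagonal cells are unchanged by the last step, so tab0 shows only the
pairwise overlaps.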
> >
> > Yes, any row with all NAs will make the crossprod all NAs too. You
> > can ignore any rows with NAs as follows:
> > CrossFemMal1_3<-crossprod(as.matrix(CrossFemMalVar1_3[apply
> > (CrossFemMalVar1_3, 1, function (x) !any(is.na(x))),]))
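The same row filter can probably be written more compactly with
complete.cases(). A minimal sketch, assuming a made-up data frame (dat is a
hypothetical stand-in for CrossFemMalVar1_3):

```r
# Hypothetical binary data with one all-NA row, standing in for CrossFemMalVar1_3
dat <- data.frame(V1 = c(1, 0, NA, 1),
                  V2 = c(1, 1, NA, 0),
                  V3 = c(0, 1, NA, 1))

# A single NA row makes every cell of the cross-product NA
with_na <- crossprod(as.matrix(dat))
with_na

# complete.cases() flags rows without any NA; it selects the same rows as
# the apply(..., 1, function (x) !any(is.na(x))) call above
keep <- complete.cases(dat)
without_na <- crossprod(as.matrix(dat[keep, ]))
without_na
```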
> >
> > I'm not sure if I follow why you want to know about statistical
> > significance here. Do you really think of the species in your study
> > as a sample from a larger population of plant species, which you
> > are trying to generalise about?
> >
> > If so, is the population much larger than your sample? And was your
> > sample of species selected randomly, i.e. with equal selection
> > probabilities? If not, standard tests probably won't apply.
> >
> > Regards,
> > James
> >
> >
> > On 2/10/07 2:44 AM, Birgit Lemcke wrote:
> >> Hello James,
> >> first I have to thank you for your help, but there are some things
> >> I don't understand now.
> >> I am not sure if I understand what this example gives me back:
> >> ratings <- data.frame(id = c(1,2,3,4), att1 = c(1,1,0,1), att2 = c
> >> (1,0,0,1), att3 = c(0,1,1,1))
> >> ratings
> >>   id att1 att2 att3
> >> 1  1    1    1    0
> >> 2  2    1    0    1
> >> 3  3    0    0    1
> >> 4  4    1    1    1
> >> tab <- crossprod(as.matrix(ratings[,-1]))
> >> tab <- tab - diag(diag(tab))
> >> tab
> >>      att1 att2 att3
> >> att1    0    2    2
> >> att2    2    0    1
> >> att3    2    1    0
> >> As I understand it, this gives me the number of times we find the
> >> same value, for example when comparing att1 and att2 across all ids.
> >> Is that right?
> >> What is this line doing: tab <- tab - diag(diag(tab))
> >> And what does the original output of crossprod mean:
> >>      att1 att2 att3
> >> att1    3    2    2
> >> att2    2    2    1
> >> att3    2    1    3
> >> I tried to do this with a part of my dataset
> >> I used a table with 3 variables (only binary)
> >> In the first part of the table I have the females (348 rows) and
> >> in the second part the males (also 348 rows).
> >> Then I tried this:
> >> CrossFemMal1_3<-crossprod(as.matrix(CrossFemMalVar1_3))
> >> The output:
> >> CrossFemMal1_3
> >>    V1 V2 V3
> >> V1 NA NA NA
> >> V2 NA NA NA
> >> V3 NA NA NA
> >> There was one row of NAs in my dataset. I presume this is
> >> responsible for the NA results? So how can I deal here with NAs?
> >> If I use two matrices (male and female) I get back amongst others
> >> the comparison of att1male to att1 female. In the case that I use
> >> the possibility of a percentage table output I get for example
> >> 40%. Can I say then that if the percentage is lower than 50% the
> >> attributes are significantly different?
> >> Corresponding to your other suggestion:
> >> sapply(c("1","2","3"), function(x) ifelse(regexpr(x, FemV1) > 0,
> >> 1, 0))
> >> It gives me this output
> >>      1 2 3
> >> [1,] 1 0 0
> >> [2,] 1 0 0
> >> [3,] 1 0 0
> >> [4,] 1 0 0
> >> [5,] 1 0 0
> >> [6,] 1 0 0
> >> [7,] 1 0 0
> >> [8,] 1 0 0
> >> [9,] 0 1 0
> >>  .   . . .
> >>  .   . . .
> >> I think now I should count the ones for 1, 2 and 3?
> >> I tried to use table but it gives me only the counts for 1 and zero:
> >> table(FemV1Test)
> >> FemV1Test
> >>   0   1
> >> 657 387
> >> How can I specify that it gives me the counts for every column?
> >> And then do the same for MalV1 and compare both somehow?
> >> Thanks again in advance for your help.
> >> Greetings Birgit
> >> On 29.09.2007 at 14:45, James Reilly wrote:
> >>>
> >>> Hi Birgit,
> >>>
> >>> The first argument to regexpr should be just one character value,
> >>> not a vector. Your call:
> >>> regexpr(c("1","2","3"),FemV1)
> >>> seems to have been interpreted as:
> >>> regexpr("1",FemV1)
> >>>
> >>> I think you probably need something more like:
> >>> sapply(c("1","2","3"), function(x) ifelse(regexpr(x, FemV1) > 0,
> >>> 1, 0))
> >>> This will also work on multiple response data such as
> >>> FemV1 <- c("13", "2", "13", "123", "1", "23")
> >>> Then colSums will give you frequency counts for each attribute.
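For reference, the sapply()/colSums() suggestion can be run end to end on the
multiple-response example given above. This is only a sketch: the name
dummies is introduced here, and the grepl() variant at the end is an
alternative that the quoted message does not mention. It assumes
single-character codes, so that one code cannot match inside another.

```r
# The multiple-response example from the message above
FemV1 <- c("13", "2", "13", "123", "1", "23")

# One 0/1 column per code: regexpr() returns -1 when the pattern is absent
dummies <- sapply(c("1", "2", "3"),
                  function(x) ifelse(regexpr(x, FemV1) > 0, 1, 0))
dummies

# Frequency of each code across all responses
counts <- colSums(dummies)
counts

# grepl() gives the same indicator directly as TRUE/FALSE
counts_grepl <- colSums(sapply(c("1", "2", "3"), grepl, x = FemV1, fixed = TRUE))
counts_grepl
```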
> >>>
> >>> I think you would need to greatly simplify the multiple response
> >>> data to apply anything like a paired t-test. Have you considered
> >>> just crosstabulating the attributes of male plants versus the
> >>> females? For some R code, see
> >>> https://stat.ethz.ch/pipermail/r-help/2007-February/126125.html
> >>>
> >>> Regards,
> >>> James
> >>>
> >>>
> >>> On 29/9/07 3:37 AM, Birgit Lemcke wrote:
> >>>> Hello James,
> >>>> sorry that I have to ask you a second time, but I don't
> >>>> understand what regexpr() is doing and how the syntax works.
> >>>> I have a vector that I converted to a character string:
> >>>> as.character(FalV1)
> >>>> [1] "1" "1" "1" "1" "1" "1" "1" "1" "2"
> >>>> After that I did this, but without knowing what I am really doing:
> >>>> regexpr(c("1","2","3"),FemV1)
> >>>> The output looked like this:
> >>>> [1] 1 1 1 1 1 1 1 1 -1
> >>>> As I understood it, the function looks for, in this case, 1, 2 or 3.
> >>>> If there is a match it gives me back 1; if not, it gives me back -1.
> >>>> But I don't know how this helps me now, so I hope you will
> >>>> explain it to me.
> >>>> And there is another problem I have. For the continuous variables
> >>>> I used a paired t-test; can I also perform this approach paired?
> >>>> Thanks a lot in advance.
> >>>> Greetings
> >>>> Birgit
> >>>> On 21.09.2007 at 11:38, James Reilly wrote:
> >>>>>
> >>>>> If I understand you right, you have several multiple response
> >>>>> variables (with the responses encoded in numeric strings) and
> >>>>> you want to see whether these are associated with sex. To
> >>>>> tabulate the data, I would convert your variables into
> >>>>> collections of dummy variables using regexpr(), then use
> >>>>> table(). You can use a modified chi-squared test with a Rao-Scott
> >>>>> correction on the resulting tables; see Thomas and Decady
> >>>>> (2004). Bootstrapping is another possible approach.
> >>>>>
> >>>>> @article{,
> >>>>> Author = {Thomas, D. Roland and Decady, Yves J.},
> >>>>> Journal = {International Journal of Testing},
> >>>>> Number = {1},
> >>>>> Pages = {43 - 59},
> >>>>> Title = {Testing for Association Using Multiple Response Survey
> >>>>> Data: Approximate Procedures Based on the Rao-Scott Approach.},
> >>>>> Volume = {4},
> >>>>> Year = {2004},
> >>>>> Url = {http://search.ebscohost.com/login.aspx?direct=true&db=pbh&AN=13663214&site=ehost-live}
> >>>>> }
> >>>>>
> >>>>> Hope this helps,
> >>>>> James
> >>>>> --
> >>>>> James Reilly
> >>>>> Department of Statistics, University of Auckland
> >>>>> Private Bag 92019, Auckland, New Zealand
> >>>>>
> >>>>> On 21/9/07 7:14 AM, Birgit Lemcke wrote:
> >>>>>> First thanks for your answer.
> >>>>>> Now I try to explain better:
> >>>>>> I have species in the rows and morphological attributes in
> >>>>>> the columns coded by numbers (qualitative variables; nominal
> >>>>>> and ordinal).
> >>>>>> In one table for the male plants of every species and in the
> >>>>>> other table for the female plants of every species. The
> >>>>>> variables contain every possible occurrence in this species
> >>>>>> and this gender.
> >>>>>> I would like to compare every variable between male and female
> >>>>>> plants for example using a ChiSquare Test.
> >>>>>> The Null-hypothesis could be: Variable male is equal to
> >>>>>> variable Female.
> >>>>>> The question behind all is, if male and female plants in this
> >>>>>> species are significantly different and which attributes are
> >>>>>> responsible for this difference.
> >>>>>> I really hope that this is better understandable. If not
> >>>>>> please ask.
> >>>>>> Thanks a million in advance.
> >>>>>> Greetings
> >>>>>> Birgit
> >>>>>
> >>>> Birgit Lemcke
> >>>> Institut für Systematische Botanik
> >>>> Zollikerstrasse 107
> >>>> CH-8008 Zürich
> >>>> Switzerland
> >>>> Ph: +41 (0)44 634 8351
> >>>> birgit.lemcke at systbot.uzh.ch
> >>
> >
> > --
> > James Reilly
> > Department of Statistics, University of Auckland
> > Private Bag 92019, Auckland, New Zealand
>
> Birgit Lemcke
> Institut für Systematische Botanik
> Zollikerstrasse 107
> CH-8008 Zürich
> Switzerland
> Ph: +41 (0)44 634 8351
> birgit.lemcke at systbot.uzh.ch
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson [t] +44 (0)20 7679 0522
ECRC, UCL Geography, [f] +44 (0)20 7679 0565
Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%