[R] Calculating proportions from a data frame rather than a table
Farrel Buchinsky
fjbuch at gmail.com
Thu Oct 4 00:13:34 CEST 2007
Thank you. It comes close, but it is not exactly what I wanted. I had
to drop the column that contained character values; that column
recorded the name of the study. Let me try to show you here.
Best if viewed in courier font
> coinfection
          study HPV6 HPV11 CoInfect other
1  Wiatrak 2004   31    23        4     0
2 Draganov 2006    6    14        3     0
3  Gabbott 1997   19    24        1     0
4   Gerein 2005   17    14        0     7
5  Michael 2005    8     5        0     1
6    Rabah 2001   29    32        0     0
7  Maloney 2006    4     4        7     0
> str(coinfection)
'data.frame': 7 obs. of 5 variables:
$ study : chr "Wiatrak 2004" "Draganov 2006" "Gabbott 1997" "Gerein 2005" ...
$ HPV6 : num 31 6 19 17 8 29 4
$ HPV11 : num 23 14 24 14 5 32 4
$ CoInfect: num 4 3 1 0 0 0 7
$ other : num 0 0 0 7 1 0 0
I had tried the following and was getting nowhere:
> as.table(coinfection)
Error in as.table.default(coinfection) : cannot coerce into a table
> as.table(coinfection[,-1])
Error in as.table.default(coinfection[, -1]) :
cannot coerce into a table
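A note on why both calls fail: as far as I can tell, the default method of as.table() only accepts arrays or numeric vectors, and a data frame is a list of columns rather than an array, even when every column is numeric. A minimal sketch, using a small made-up data frame (the name `dff` and its values are hypothetical, not the coinfection data):

```r
# Hypothetical counts, for illustration only
dff <- data.frame(a = c(3, 1), b = c(3, 3))

is.array(dff)    # FALSE: a data frame is a list of columns
is.numeric(dff)  # FALSE, even though every column is numeric

# Converting to a matrix first gives as.table() an array to work with
m <- as.matrix(dff)
is.array(m)      # TRUE
tab <- as.table(m)
```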
Thanks to you, I was able to make some progress.
> as.table(as.matrix(coinfection))
  study         HPV6 HPV11 CoInfect other
1 Wiatrak 2004  31   23    4        0
2 Draganov 2006  6   14    3        0
3 Gabbott 1997  19   24    1        0
4 Gerein 2005   17   14    0        7
5 Michael 2005   8    5    0        1
6 Rabah 2001    29   32    0        0
7 Maloney 2006   4    4    7        0
So far this looks good, but then look:
> prop.table(as.table(as.matrix(coinfection)),1)#the main reason for doing this
Error in sum(..., na.rm = na.rm) : invalid 'type' (character) of argument
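The sum() error above is consistent with how as.matrix() treats a data frame that mixes character and numeric columns: the result is a character matrix, so the counts are stored as strings. A short sketch using two rows of the data shown earlier (the names `coinf2`, `m_all` and `m_num` are only for illustration):

```r
# Two rows of the coinfection data, enough to show the effect
coinf2 <- data.frame(study = c("Wiatrak 2004", "Draganov 2006"),
                     HPV6  = c(31, 6),
                     HPV11 = c(23, 14),
                     stringsAsFactors = FALSE)

m_all <- as.matrix(coinf2)        # the study column forces character storage
typeof(m_all)                     # "character"

m_num <- as.matrix(coinf2[, -1])  # numeric columns only
typeof(m_num)                     # "double"
```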
> prop.table(as.table(as.matrix(coinfection[,-1])),1)#this is to get rid of the variable called "study"
        HPV6      HPV11   CoInfect      other
1 0.53448276 0.39655172 0.06896552 0.00000000
2 0.26086957 0.60869565 0.13043478 0.00000000
3 0.43181818 0.54545455 0.02272727 0.00000000
4 0.44736842 0.36842105 0.00000000 0.18421053
5 0.57142857 0.35714286 0.00000000 0.07142857
6 0.47540984 0.52459016 0.00000000 0.00000000
7 0.26666667 0.26666667 0.46666667 0.00000000
This works perfectly and is exactly what I wanted, except that I have
lost the name of the study and have to go back to check which row
belongs to which study. This would not have happened if I had the data
in its rawest form: a two-column data frame where column 1 was the
study and column 2 was a factor (levels being HPV 6, HPV 11,
coinfection, other). Such a data frame would have had 253 rows. Then I
could have used table(column1, column2) to get all this data as a
table, and the study name would have been preserved. It is not that
big a deal that I have to look elsewhere to find the study name, but
it seems silly that I cannot analyze data that is not in its raw
state. I am sure there is a way. I just do not know it.
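One possible way, offered as a sketch rather than a definitive answer: drop the study column before converting to a matrix, then attach the study names as row names so prop.table() carries them through. The second half rebuilds the one-row-per-case form described above and tabulates it. The data are copied from the printout earlier in this message; the names `counts`, `props`, `types` and `raw` are made up for illustration.

```r
# The data frame as printed earlier in this message
coinfection <- data.frame(
  study    = c("Wiatrak 2004", "Draganov 2006", "Gabbott 1997", "Gerein 2005",
               "Michael 2005", "Rabah 2001", "Maloney 2006"),
  HPV6     = c(31, 6, 19, 17, 8, 29, 4),
  HPV11    = c(23, 14, 24, 14, 5, 32, 4),
  CoInfect = c(4, 3, 1, 0, 0, 0, 7),
  other    = c(0, 0, 0, 7, 1, 0, 0),
  stringsAsFactors = FALSE
)

# Keep only the numeric columns and carry the study names as row names
counts <- as.matrix(coinfection[, -1])
rownames(counts) <- coinfection$study

# Row proportions, now labelled by study
props <- prop.table(counts, 1)
round(props, 3)

# Alternative: expand the counts to one row per case, then tabulate.
# as.vector(counts) runs column by column, so study and type are laid
# out in the same column-major order before being repeated.
types <- colnames(counts)
raw <- data.frame(
  study = rep(rep(coinfection$study, times = length(types)),
              times = as.vector(counts)),
  type  = factor(rep(rep(types, each = nrow(counts)),
                     times = as.vector(counts)),
                 levels = types)
)
nrow(raw)  # 253
prop.table(table(raw$study, raw$type), 1)
```

Note that table() sorts the studies alphabetically, so the rows of the second result appear in a different order from the original data frame.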
On 10/3/07, Rolf Turner <r.turner at auckland.ac.nz> wrote:
>
> I think that what you need to do is
>
> as.table(as.matrix(dff))
>
> E.g.
>
> melvin <- data.frame(x=c(3,1,3,2),y=c(3,3,4,5))
> clyde <- as.table(as.matrix(melvin))
> prop.table(clyde,1)
>
>           x         y
> A 0.5000000 0.5000000
> B 0.2500000 0.7500000
> C 0.4285714 0.5714286
> D 0.2857143 0.7142857
>
> HTH.
>
> cheers,
>
> Rolf Turner
--
Farrel Buchinsky
GrandCentral Tel: (412) 567-7870