[R] findAssocs()
rtw30606
rwatson at terry.uga.edu
Fri Jul 20 19:04:38 CEST 2012
Hi
Here is some code to illustrate how the correlations are calculated.
> data <- c("word1", "word1 word2","word1 word2 word3","word1 word2 word3
> word4","word1 word2 word3 word4 word5")
> frame <- data.frame(data)
> frame
data
1 word1
2 word1 word2
3 word1 word2 word3
4 word1 word2 word3 word4
5 word1 word2 word3 word4 word5
> test <- Corpus(DataframeSource(frame, encoding = "UTF-8"))
> dtm <- DocumentTermMatrix(test)
> as.matrix(dtm)
Terms
Docs word1 word2 word3 word4 word5
1 1 0 0 0 0
2 1 1 0 0 0
3 1 1 1 0 0
4 1 1 1 1 0
5 1 1 1 1 1
>
> findAssocs(dtm, "word2", 0.1)
word2 word3 word4 word5
1.00 0.61 0.41 0.25
> # Correlation word2 with word3
> cor(c(0,1,1,1,1),c(0,0,1,1,1))
[1] 0.6123724
> # Correlation word2 with word4
> cor(c(0,1,1,1,1),c(0,0,0,1,1))
[1] 0.4082483
> # Correlation word2 with word5
> cor(c(0,1,1,1,1),c(0,0,0,0,1))
[1] 0.25
Cheers
Rick
--
View this message in context: http://r.789695.n4.nabble.com/findAssocs-tp3845751p4637248.html
Sent from the R help mailing list archive at Nabble.com.
More information about the R-help
mailing list