[R] tm package
    David Neu 
    david at davidneu.com
       
    Tue Feb 16 02:57:06 CET 2010
    
    
  
Hi,
I'm using version 0.5.1 of the tm package with R 2.10.1.  It looks to me
as if, after running the following,
    reuters21578 <- Corpus(DirSource(corpusDir),
                           readerControl = list(reader = readReut21578XMLasPlain))
    reuters21578 <- tm_map(reuters21578, stripWhitespace)
    reuters21578 <- tm_map(reuters21578, tolower)
    reuters21578 <- tm_map(reuters21578, removePunctuation)
    reuters21578 <- tm_map(reuters21578, removeNumbers)
    reuters21578.dtm <- DocumentTermMatrix(reuters21578)
reuters21578.dtm does not include terms from the Heading (e.g. the Title).
Can anyone confirm this and, if so, is there an option to have the
terms from the Heading included?
Many thanks!
Cheers,
David
    
    