[R] Maximum number of patterns and speed in grep
mdvaan
mathijsdevaan at gmail.com
Fri Jul 13 19:41:12 CEST 2012
Here's some data (which should give you the error messages):
# load gsubfn, which provides strapplyc
library(gsubfn)
# read in data
data <- read.csv("https://dl.dropbox.com/u/13631687/data.csv", header = TRUE, sep = ",")
# first paste all data
data1 <- paste(data[,1], collapse = "|")
# second paste subsets of the data
data2a <- paste(data[1:750,1], collapse = "|")
data2b <- paste(data[751:1500,1], collapse = "|")
# define the object to be searched
text <- c("the first is Santa Fe Gold Corp",
          "the second is Starpharma Holdings")
# match
strapplyc(text, data1)
strapplyc(text, data2a)
strapplyc(text, data2b)
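
A guess at the cause, not verified against the file above: if some entries contain regex metacharacters such as an unmatched "(", pasting them together with "|" yields a pattern that cannot compile. The sketch below first flags entries that fail to compile on their own, then escapes metacharacters before collapsing. It uses base R's gregexpr/regmatches instead of strapplyc so that it runs without gsubfn (I have not checked the escaped pattern against the Tcl engine that strapplyc uses). The names names_raw, compiles and escape_regex are hypothetical, and "Acme (Holdings" is an invented entry, not taken from the data.

```r
# Hypothetical stand-in for data[, 1]; the third entry is invented and
# contains an unbalanced parenthesis.
names_raw <- c("Santa Fe Gold Corp", "Starpharma Holdings", "Acme (Holdings")

# Hypothetical helper: TRUE if x compiles as a regular expression.
compiles <- function(x) {
  tryCatch({
    regexpr(x, "test")
    TRUE
  }, error = function(e) FALSE, warning = function(w) FALSE)
}

# Entries that cannot be used as a pattern as-is.
bad <- names_raw[!vapply(names_raw, compiles, logical(1))]
print(bad)

# Hypothetical helper: prefix each regex metacharacter with a backslash
# so that the entry is matched literally.
escape_regex <- function(x) {
  gsub("([.|()\\^{}+$*?]|\\[|\\])", "\\\\\\1", x)
}

# Build the alternation from the escaped entries and match.
pattern <- paste(escape_regex(names_raw), collapse = "|")
text <- c("the first is Santa Fe Gold Corp",
          "the second is Starpharma Holdings")
matches <- regmatches(text, gregexpr(pattern, text))
print(matches)
```

Running the compiles check over data[, 1] should show which entries, if any, trigger the error.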
Thanks in advance!
Math
Gabor Grothendieck wrote
>
> On Fri, Jul 13, 2012 at 9:40 AM, mdvaan <mathijsdevaan@> wrote:
>> Thanks, I see that it is working in the sample data. My data, however,
>> gives me an error message:
>>
>> data <- strapplyc(text, batch[[l]])
>> Error in structure(.External("dotTcl", ..., PACKAGE = "tcltk"), class = "tclObj") :
>>   [tcl] couldn't compile regular expression pattern: parentheses () not balanced.
>>
>> batch[[l]] is similar to your "re" string except that there is a larger
>> variety of characters. I haven't been able to figure out which characters
>> are causing trouble here. Any thoughts?
>>
>> Thank you very much.
>>
>> Math
> ...
>>
>> ______________________________________________
>> R-help@ mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> Note the part on the last line about posting reproducible code.
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>