[R] Performance of concatenating strings
Tamara Steijger
smara1 at gmx.de
Wed Oct 31 15:09:34 CET 2007
Hi,
thanks for the fast answers. I'm sorry if I was not clear enough in
my question. The problem we are trying to solve is LetterDisplay.
There is already a heuristic implemented in the multcompView package
of Hans-Peter Piepho. We implemented an exact fixed-parameter
tractable algorithm for that problem (in OCaml).
The new R function works similarly to the original multcompLetters
function, but instead of running the heuristic, the input is forwarded
to the OCaml program. Because the new multcompView function should
have the same return format as the original one (so that users don't
have to bother about the new implementation and can still use their
old code), the output then has to be formatted in the same way as it
is done now in multcompLetters.
The output of the OCaml program is a file containing a matrix. Names
that occur in the same row of the matrix are not significantly
different, so they should have a common letter in the LetterDisplay,
i.e. each row in the matrix corresponds to one letter in the
LetterDisplay. Right now those LetterDisplays are built up character
by character, as in the original function. But as you already
mentioned, storing the characters for each name in a matrix (one row
per name and as many columns as there are lines in the output file)
and then concatenating everything at once will probably improve the
performance significantly already. And runtime is in this case
probably more critical than memory.
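A minimal sketch of that idea, not the actual multcompLetters code:
the names `rows`, `member` and `letter_mat` are made up for
illustration, the toy input stands in for the parsed OCaml output
file, and it assumes at least two names and at most 26 rows, so that
the built-in `letters` are enough.

```r
# Hypothetical parsed output: one element per row of the OCaml matrix,
# holding the names that are not significantly different.
rows <- list(c("A", "B"), c("B", "C"), c("C", "D"))
all_names <- sort(unique(unlist(rows)))

# Membership matrix: one row per name, one column per output line.
member <- vapply(rows, function(r) all_names %in% r,
                 logical(length(all_names)))
rownames(member) <- all_names

# Assumption: no more than 26 rows, so plain lower-case letters suffice.
letters_used <- letters[seq_len(ncol(member))]

# Character matrix: the column's letter where the name belongs to that
# row of the output, "" otherwise.
letter_mat <- ifelse(member, letters_used[col(member)], "")

# Concatenate once per name instead of growing strings char by char.
display <- apply(letter_mat, 1, paste, collapse = "")
print(display)
```

Here each name gets exactly one paste() call over its whole row, so
no string is repeatedly copied and extended inside a loop.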
I haven't used Rprof so far, but I will definitely try it. But just to
give an idea of the problem: the input for a LetterDisplay can
also be given as a graph. One of our test instances is a graph with
121 nodes and 5108 edges. The OCaml program needs about 1 second to
compute the result. But then computing the LetterDisplays takes more
than two hours ...
Thanks a lot,
T. Steijger