[R] reordering huge data file

Boks, M.P.M. M.P.M.Boks at umcutrecht.nl
Mon Jan 21 22:45:41 CET 2008


Dear R-experts,

My problem is how to handle a 10GB data file containing genotype data. The file is in a particular format (Illumina final report) and needs to be altered and merged with phenotype data for further analysis.

Perl seems to be a frequently used solution for this type of work; however, I am inclined to think it should be doable with R.

How do I open a text file, read it line by line, evaluate each line, and write it back to a text file in a different position? The files involved are:

Phenotypeinfo.txt (contains phenotype information)

Before.txt (contains genotype information, see below)

SNP:1-305,000	ID:1-900	allele.A	allele.B


After.txt (the required format)

ID:1-250	phenotype	SNP1.allele.A	SNP1.allele.B	SNP2.allele.A	SNP2.allele.B	etc.
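
To make the reshaping concrete, here is a rough two-pass sketch in R of one way it might be approached. It is not a tested solution for the real file. It assumes that Before.txt is tab-separated with the columns SNP, ID, allele A, allele B in that order, that any Illumina header block has already been removed, and that every sample has the same set of SNPs. The function names split_by_id and write_wide and the directory argument outdir are hypothetical names chosen for illustration.

```r
# Pass 1: stream the big file in chunks and append each sample's lines
# to its own small file. 'outdir' should be empty before the first run,
# because lines are appended.
split_by_id <- function(infile, outdir, chunk_size = 100000L) {
  dir.create(outdir, showWarnings = FALSE)
  con <- file(infile, open = "r")
  on.exit(close(con))
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0L) break
    fields <- strsplit(lines, "\t", fixed = TRUE)
    ids <- sapply(fields, function(f) f[2])   # assumed: ID is column 2
    for (id in unique(ids)) {
      cat(lines[ids == id],
          file = file.path(outdir, paste(id, ".txt", sep = "")),
          sep = "\n", append = TRUE)
    }
  }
  invisible(NULL)
}

# Pass 2: each per-sample file is small enough for read.table();
# write one wide row per sample: ID, SNP1.A, SNP1.B, SNP2.A, SNP2.B, ...
write_wide <- function(outdir, outfile) {
  out <- file(outfile, open = "w")
  on.exit(close(out))
  for (f in list.files(outdir, full.names = TRUE)) {
    d <- read.table(f, sep = "\t", colClasses = "character",
                    comment.char = "",
                    col.names = c("snp", "id", "a", "b"))
    d <- d[order(d$snp), ]                # same SNP order for every sample
    alleles <- as.vector(rbind(d$a, d$b)) # interleave allele A and allele B
    cat(d$id[1], alleles, sep = "\t", file = out)
    cat("\n", file = out)
  }
  invisible(NULL)
}
```

Calling split_by_id("Before.txt", "by_sample") followed by write_wide("by_sample", "After.txt") would give one row per ID without the phenotype column; the phenotype could presumably be added afterwards by matching on ID, for example with merge(), once the layout of Phenotypeinfo.txt is known.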


I have been looking at ?read.table, ?scan, ?readLines and SQLite, but have not resolved it. Should I resort to Perl, or can this be tackled in R?

I am using a Windows machine with R 2.6.0.

Any help would be highly appreciated,

Many Thanks,

Marco


