[R] transforming a badly organized data base into a list of data	frames
    Dimitri Liakhovitski 
    ld7631 at gmail.com
       
    Fri Sep  4 14:57:04 CEST 2009
    
    
  
Dear R-ers!
I have a badly organized data base in Excel. Once I read it into R it
looks like this (all variables become factors because of many spaces
and other characters in Excel):
x<-data.frame(A=c("","Name1","text1","text2","text3","","","","Name2","text1","text2","text3","","","Name3","text1","text2","text3","","Name1","text1","text2","text3",""),
B=c("","",1,2,3,"","","","",4,5,6,"","","",7,8,9,"","",3,3,3,""),C=c("","",1,1,2,"","","","",5,6,5,"","","",9,8,7,"","",2,2,2,""),
D=c("","",3,2,1,"","","","",6,5,4,"","","",7,7,8,"","",1,1,1,""))
x
I was wondering if it's at all possible to grab the data blocks from
this file and put them into a list, e.g., so that the first element of
the list is the data frame:
Name1 1 1 3
Name12 1 2
Name13 2 1
second element of the list is the data frame
Name2 4 5 6
Name2 5 6 5
Name2 6 5 4
etc.
Notice that all blocks have the same number of rows.
I have several challenges:
1. I cannot use the row number because the number of irrelevant lines
between blocks is always different.
2. I'd love to replace all the "text1", "text2", "text3" strings
(which are always the same within each block) with the name that is
one cell above it.
3. All variables are now factors - I am afraid if I recode them later
as "as numeric" - they numbers will change their values.
Thank you very much for your suggestions!
-- 
Dimitri Liakhovitski
Ninah.com
Dimitri.Liakhovitski at ninah.com
    
    
More information about the R-help
mailing list