[R] Skipping lines and incomplete rows

arun smartpink111 at yahoo.com
Tue Jul 10 19:30:11 CEST 2012


Hello Ravi,

I was not aware that your dataset has the special character "#" before NA.  If it were just plain NA, it would have worked.  So it's not because of sep=";".

See below:

#Without "#"
dat1<-read.table(text="
 Remove this line
 Remove this line
 Remove this line
 Time;Actual Speed;Actual Direction;Temp;Press;Value1;Value2
  ;[m/s];[°];°C;[hPa];[MWh];[MWh]
 1/1/2012;0.0;0;NA;NA;0.0000;0.0000
 1/2/2012;0.0;0;NA;NA;0.0000;0.0000
 1/3/2012;0.0;0;NA;NA;1.5651;2.2112
 1/4/2012;0.0;0;NA;NA;1.0000;2.0000
 1/5/2012;0.0;0;NA;NA;3.2578;7.5455
 ",sep=";",header=TRUE,fill=TRUE,skip=4,stringsAsFactors=FALSE)
> dat1
      Time Actual.Speed Actual.Direction Temp Press Value1 Value2
1                 [m/s]              [°]   °C [hPa]  [MWh]  [MWh]
2 1/1/2012          0.0                0 <NA>  <NA> 0.0000 0.0000
3 1/2/2012          0.0                0 <NA>  <NA> 0.0000 0.0000
4 1/3/2012          0.0                0 <NA>  <NA> 1.5651 2.2112
5 1/4/2012          0.0                0 <NA>  <NA> 1.0000 2.0000
6 1/5/2012          0.0                0 <NA>  <NA> 3.2578 7.5455


#With "#": Reading data from the .txt file.  

# According to the documentation (http://stat.ethz.ch/R-manual/R-devel/library/utils/html/read.table.html), comment.char="#" is the default in read.table, so everything from the first "#" on a line is treated as a comment and dropped.  That is why only blank columns show up after the first three columns.
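To make the effect of comment.char concrete, here is a minimal sketch on two rows in the same layout as the posted data. The object names (txt, with_default, no_comment) and the V1..V7 column names are made up for illustration; they are not from the original file.

```r
# Two hypothetical sample rows in the same layout as the posted data
txt <- "1/1/2012;0.0;0;#N/A;#N/A;0.0000;0.0000
1/3/2012;0.0;0;#N/A;#N/A;1.5651;2.2112"

# Default comment.char = "#": each line is cut at the first "#",
# so only the first three fields survive and fill = TRUE pads the rest
with_default <- read.table(text = txt, sep = ";", fill = TRUE,
                           col.names = paste0("V", 1:7))

# comment.char = "": "#" is treated as ordinary data, all seven fields are read
no_comment <- read.table(text = txt, sep = ";", fill = TRUE,
                         col.names = paste0("V", 1:7), comment.char = "",
                         stringsAsFactors = FALSE)

print(with_default)
print(no_comment)
```

Comparing the two printed data frames should show the trailing columns missing in the first and intact in the second.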


#I think Rui's method of reading the header separately using readLines might be a good option.  Or, if you know the column headings, you can do this:

dat2<-read.table("dat2.txt",skip=4,col.names=c("Time","Actual Speed","Actual Direction", "Temp","Press","Value1","Value2"),fill=TRUE,sep=";",comment.char="")
> dat2
      Time Actual.Speed Actual.Direction Temp Press Value1 Value2
1                 [m/s]              [°]   °C [hPa]  [MWh]  [MWh]
2 1/1/2012          0.0                0  #NA   #NA 0.0000 0.0000
3 1/2/2012          0.0                0  #NA   #NA 0.0000 0.0000
4 1/3/2012          0.0                0  #NA   #NA 1.5651 2.2112
5 1/4/2012          0.0                0  #NA   #NA 1.0000 2.0000
6 1/5/2012          0.0                0  #NA   #NA 3.2578 7.5455
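If the "#N/A" markers should end up as real missing values rather than strings, one possible variation is sketched below. It combines reading the header line separately with comment.char="" and na.strings. The data are inlined as an abbreviated copy of the posted file (units line written in ASCII), and the names raw, hdr and dat are assumptions made for this sketch only.

```r
# Abbreviated copy of the posted file, inlined so the sketch runs standalone
raw <- c("Remove this line",
         "Remove this line",
         "Remove this line",
         "Time;Actual Speed;Actual Direction;Temp;Press;Value1;Value2",
         ";[m/s];[deg];degC;[hPa];[MWh];[MWh]",
         "1/1/2012;0.0;0;#N/A;#N/A;0.0000;0.0000",
         "1/3/2012;0.0;0;#N/A;#N/A;1.5651;2.2112")

# Column names come from line 4; line 5 holds the units and is dropped
hdr <- strsplit(raw[4], ";", fixed = TRUE)[[1]]

dat <- read.table(text = raw[-(1:5)], sep = ";", col.names = hdr,
                  comment.char = "",                    # "#" is data, not a comment
                  na.strings = c("NA", "#N/A", "#NA"),  # map the markers to NA
                  stringsAsFactors = FALSE)

str(dat)
```

With a file on disk, readLines("dat2.txt") could take the place of the inlined vector. Dropping the units line before parsing also lets the value columns come back numeric instead of character.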


A.K.










----- Original Message -----
From: vioravis <vioravis at gmail.com>
To: r-help at r-project.org
Cc: 
Sent: Tuesday, July 10, 2012 1:41 AM
Subject: Re: [R] Skipping lines and incomplete rows

Thanks a lot Rui and Arun.

The methods work fine with the data I gave, but when I tried the two methods
with the following semicolon-separated data using sep = ";", only the first
3 columns are read properly; the rest of the columns are either empty or NAs.


**********************************************************************************************
Remove this line
Remove this line
Remove this line
Time;Actual Speed;Actual Direction;Temp;Press;Value1;Value2
;[m/s];[°];°C;[hPa];[MWh];[MWh]
1/1/2012;0.0;0;#N/A;#N/A;0.0000;0.0000
1/2/2012;0.0;0;#N/A;#N/A;0.0000;0.0000
1/3/2012;0.0;0;#N/A;#N/A;1.5651;2.2112
1/4/2012;0.0;0;#N/A;#N/A;1.0000;2.0000
1/5/2012;0.0;0;#N/A;#N/A;3.2578;7.5455
***********************************************************************************************

I used the following code:
dat1<-read.table("testInput.txt",sep=";",skip=3,fill=TRUE,header=TRUE) 
dat1<-dat1[-1,] 
row.names(dat1)<-1:nrow(dat1)

Could you please let me know what is wrong with this approach? 

Thank you.

Ravi

--
View this message in context: http://r.789695.n4.nabble.com/Skipping-lines-and-incomplete-rows-tp4635830p4635952.html
Sent from the R help mailing list archive at Nabble.com.
