[R] removing dropouts (setting the values to NA)
    Petr Pikal 
    petr.pikal at precheza.cz
       
    Fri Feb 14 07:49:03 CET 2003
    
    
  
Dear all,
I hope somebody has encountered a similar problem and
can give me a hint on how to solve it or where to look.
I have several data sets in DBF format. I can transfer them to R
data frames, and then I want to perform aggregation or some
other computations, but my data contain values which I would
call drop-outs, and I want to discard them (see the example).
Usually I find either a run of zeros (the measuring device is out of
order or does not obtain any data) or a gradual decrease of some
measured values due to a real interruption of the process. I would
like to do some automatic evaluation to build a logical vector
where, for instance, TRUE stands for "correct" values and
FALSE for "drop-outs" (or vice versa).
Preferably I would also like to ***discard a few values before and after
the actual drop-out occurred***. Then I will set all "wrong" values in
my variables to NA and continue with further computations.
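A minimal sketch of that last step, setting the flagged rows to NA in several variables at once. The data frame `dat`, its values, and the flag vector `bad` are made up for illustration and are not the real data:

```r
# Hypothetical small data frame standing in for one imported DBF file
dat <- data.frame(
  Var1 = c(12, 12.2, 12.2, 0, 0, 12.1),
  Var2 = c(27, 29, 29, 0, 0, 28)
)

# Hypothetical flag vector: TRUE marks a row judged to be a drop-out
bad <- dat$Var1 == 0

# Set every measured variable to NA in the flagged rows
vars <- c("Var1", "Var2")
dat[bad, vars] <- NA

# Later computations can then skip the NA values
mean(dat$Var1, na.rm = TRUE)
```

Here TRUE marks the "wrong" rows; with the opposite convention the index would simply be negated (`!bad`).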
Here is some foo code for making artificial drop-outs similar to those
in my actual data:
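x<-seq(0,100,.1)                          # 1001 points
y<-sin(x)+rnorm(length(x),mean=0,sd=1)    # noisy signal
# gradual decrease: subtract a growing exponential at positions 201 to 231
y1<-y-c(rep(0,200),exp(x[20:50]),rep(0,770))
y<-y1+50
y<-y*(y>0)                                # negative values become zero
y[600:700]<-0                             # a run of zeros (device out of order)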
My actual data looks like: 
Date, 		Time, 		Var1, 	Var2, 	Var3, ...... 
01.01.01, 	03:05:00, 	12, 	27, 	0.53, ..... 
01.01.01, 	03:05:15, 	12.2, 	29, 	1.2, ..... 
01.01.01, 	03:05:30, 	12.2, 	29, 	0, ..... 
......... 
   
in several data sets.  
I can simply put  
idx1<-y==0  
I can set an arbitrary limit under or over which the value is 
considered a drop-out  
idx2<-y<45 
and I can combine both indexes
idx<-idx1|idx2
But I do not know how to easily extend the TRUE parts of the index
vector forwards and backwards from where the actual drop-out occurred.
The only way I have been able to accomplish it is
changes<-which(diff(idx)!=0)+1
then select the odd and even elements of changes, subtract a certain
value from the odd ones and add a value to the even ones, and construct
something like this
c(rep(FALSE,odd[1]),rep(TRUE,even[1]-odd[1]),rep(FALSE,odd[2]-even[1]),
  rep(TRUE,even[2]-odd[2]),rep(FALSE,length(x)-even[2]))
which is rather complicated and not a very general solution.
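For illustration, the odd/even construction above could be written for any number of drop-outs roughly as follows. This is only a sketch under assumptions: the function name `widen_runs` and its arguments `before` and `after` are invented here, and it uses `rle()` to find the runs instead of `diff()`:

```r
# Widen every TRUE run of a logical vector by `before` positions to the
# left and `after` positions to the right (names chosen for this sketch).
widen_runs <- function(idx, before = 5, after = 5) {
  n <- length(idx)
  r <- rle(idx)
  ends <- cumsum(r$lengths)             # last position of each run
  starts <- ends - r$lengths + 1        # first position of each run
  out <- logical(n)                     # all FALSE to begin with
  for (i in which(r$values)) {          # loop over the TRUE runs only
    from <- max(1, starts[i] - before)  # clamp at the vector ends
    to <- min(n, ends[i] + after)
    out[from:to] <- TRUE
  }
  out
}

# Small example: one drop-out at positions 4 to 5 and one at position 10
idx <- c(FALSE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE)
widen_runs(idx, before = 1, after = 2)
# FALSE FALSE TRUE TRUE TRUE TRUE TRUE FALSE TRUE TRUE
```

The widened vector could then be used in place of the original index when setting values to NA. Runs that come to overlap after widening simply merge.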
Could somebody please help me find a better procedure or
function for such drop-out filtering?
Thank you. 
Petr Pikal
Precheza a.s., Nabř.Dr.E.Beneše 24, 750 62 Přerov
tel: +420581 252 257 ; 724 008 364
petr.pikal at precheza.cz; p.pik at volny.cz
fax +420581 252 561
    
    