[R] Automatically fix big jumps in one variable due to anomalies
Duncan Mackay
mackay at northnet.com.au
Tue Mar 5 04:18:31 CET 2013
Hi Cesar
I am not sure what you actually want to accomplish, but ?rle may give you
some ideas, e.g. (I have added some values so that the series returns to
the good section)
x = c(246,251,250,255,5987,5991,5994,5999,255,259,262,267)
xdiff = diff(x)
xdiff
[1] 5 -1 5 5732 4 3 5 -5744 4 3 5
rle(xdiff)
Run Length Encoding
lengths: int [1:11] 1 1 1 1 1 1 1 1 1 1 ...
values : num [1:11] 5 -1 5 5732 4 3 5 -5744 4 3 ...
which(abs(rle(xdiff)[[2]]) > 50)
[1] 4 8
rle(xdiff)[[2]][abs(rle(xdiff)[[2]] ) > 50]
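One caveat: the indices above refer to runs, not observations. rle()
collapses equal neighbouring differences, so they match positions in xdiff
only while every run has length 1. A minimal sketch that works on diff(x)
directly and labels each observation with a segment number, taking 5999 as
the eighth reading (as in the quoted table) and assuming the same cut-off
of 50; the names threshold, big, jump_at and segment are only illustrative:

```r
# Values from the quoted table, extended so the series returns to the good level
x <- c(246, 251, 250, 255, 5987, 5991, 5994, 5999, 255, 259, 262, 267)
threshold <- 50   # assumed cut-off; tune it to the sensor's normal step size

big <- abs(diff(x)) > threshold
jump_at <- which(big)                  # i means the jump lies between x[i] and x[i + 1]
segment <- cumsum(c(FALSE, big)) + 1   # segment id increases after every jump

split(x, segment)                      # one vector of readings per segment
```

Here jump_at marks the last good reading before each jump, and split()
returns the readings grouped by segment, which can then be removed or
adjusted separately.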
It is then a matter of removing the offending sequences, applying a
function to them, or substituting values; see ?zoo::na.approx (from memory).
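As a rough sketch of the substitution idea, the flagged stretch can be set
to NA and filled by linear interpolation. Base R approx() is used here as a
stand-in for zoo::na.approx so that nothing needs installing. The cut-off
of 50, the value 5999 for the eighth reading (from the quoted table) and
the assumption of a single excursion (one jump out, one jump back) are all
assumptions, not part of the original data description:

```r
x <- c(246, 251, 250, 255, 5987, 5991, 5994, 5999, 255, 259, 262, 267)
threshold <- 50                               # assumed cut-off

jump_at <- which(abs(diff(x)) > threshold)    # 4 and 8 for this series
bad <- (jump_at[1] + 1):jump_at[2]            # assumes one excursion: jump out, jump back
x_na <- replace(x, bad, NA)                   # blank out the suspect readings

# Linear interpolation across the gap, using only the observed points
ok <- !is.na(x_na)
x_fixed <- approx(which(ok), x_na[ok], xout = seq_along(x_na))$y
x_fixed
```

Because the readings on both sides of the gap are 255, the gap is filled
with a flat 255 here. With zoo installed, zoo::na.approx(x_na) should give
the same result for an interior gap like this one.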
HTH
Duncan
Duncan Mackay
Department of Agronomy and Soil Science
University of New England
Armidale NSW 2351
Email: home: mackay at northnet.com.au
At 09:13 5/03/2013, you wrote:
>Hi,
>I am attaching a plot where you can see there are a few "jumps" (plots 1, 4,
>5 and 6), due to incidents with the measuring sensors (basically someone
>touching the sensor). I need to revert those changes to get a plot without
>spurious measurements, i.e. make those fragments go back to their original
>pattern before the jump.
>
>I have used the function cpt.mean {changepoint} to identify the jumps
>and the mean of each segment. Now I don't know how to automatically revert
>the jumps, probably by subtracting from each higher fragment the difference
>between its mean and the mean of the previous one. Does that make sense?
>
>Example of data set
>
>              TIMESTAMP variable diameter
>38  2012-06-21 13:45:00     r4_3       NA
>86  2012-06-21 14:00:00     r4_3       NA
>134 2012-06-21 14:15:00     r4_3      246
>182 2012-06-21 14:30:00     r4_3      251
>230 2012-06-21 14:45:00     r4_3      250
>278 2012-06-21 15:00:00     r4_3      255
>326 2012-06-21 15:15:00     r4_3     5987
>374 2012-06-21 15:30:00     r4_3     5991
>422 2012-06-21 15:45:00     r4_3     5994
>470 2012-06-21 16:00:00     r4_3     5999
>
>As an example, this is the current diameter data:
>NA-NA-246-251-250-255-5987-5991-5994-5999
>
>I would need this series without the big jump, avoiding the jump and
>following the increase/decrease pattern, for example:
>NA-NA-246-251-250-255-255-259-262-267
>
>Any other idea is welcome.
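One way to get the series requested above is to work on the differences
rather than the segment means: treat any step larger than a cut-off as "no
change" and rebuild the series with cumsum(). This is a minimal sketch of
that differencing idea, not the cpt.mean approach; remove_jumps() and its
threshold argument are hypothetical names, and the default of 50 is an
assumption:

```r
# Rebuild a series after zeroing out implausibly large steps
remove_jumps <- function(x, threshold = 50) {
  ok <- which(!is.na(x))         # work only on the observed values
  v <- x[ok]
  d <- diff(v)
  d[abs(d) > threshold] <- 0     # a big step is treated as "no change"
  x[ok] <- cumsum(c(v[1], d))    # rebuild from the first observed value
  x
}

diameter <- c(NA, NA, 246, 251, 250, 255, 5987, 5991, 5994, 5999)
remove_jumps(diameter)
```

For the example data this returns NA NA 246 251 250 255 255 259 262 267,
i.e. the pattern asked for. Note that a genuine change that happens at the
same moment as a jump is lost, because the whole step is set to zero.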