[R] Subsetting problem data
Rui Barradas
ruipbarradas at sapo.pt
Thu Jul 19 18:56:57 CEST 2012
Hello,
Try the following.
d <- read.csv(text="
Patient, Cycle, Variable1, Variable2
A, 1, 4, 5
A, 2, 3, 3
A, 3, 4, NA
B, 1, 6, 6
B, 2, NA, 6
C, 1, 6, 5
C, 3, 2, 2
", header=TRUE)
d
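# Patients whose cycles are consecutive (no gaps); NULL for the others
compl <- lapply(split(d, d$Patient), function(x) if(all(diff(x$Cycle) == 1)) x)
# Patients with at least one gap in the cycle numbers; NULL for the others
holes <- lapply(split(d, d$Patient), function(x) if(any(diff(x$Cycle) != 1)) x)
# rbind() ignores the NULL elements, leaving one data frame each
do.call(rbind, compl)
do.call(rbind, holes)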
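The same split by gaps in Cycle can also be done without the NULL list elements. This is only a sketch: it computes one TRUE/FALSE flag per patient with tapply() and uses it to index the rows. The names gap, holes_df and compl_df are made up for the illustration, and it assumes the rows are already sorted by Cycle within each patient.

```r
# Same example data, rebuilt here so the snippet runs on its own
d <- read.csv(text = "Patient,Cycle,Variable1,Variable2
A,1,4,5
A,2,3,3
A,3,4,NA
B,1,6,6
B,2,NA,6
C,1,6,5
C,3,2,2", header = TRUE)

# One logical per patient: TRUE if two consecutive cycles are not 1 apart
gap <- tapply(d$Cycle, d$Patient, function(x) any(diff(x) != 1))

# Rows of the patients with a gap, and rows of the patients without one
holes_df <- d[d$Patient %in% names(gap)[gap], ]
compl_df <- d[d$Patient %in% names(gap)[!gap], ]

holes_df
compl_df
```

With the example data, holes_df holds the two rows of patient C and compl_df the five rows of patients A and B.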
In the meantime, you have posted another, similar question that seems
more complete. I'll look into it, but tell me: is this answer
completely off? If you only want to know whether there are holes, as
TRUE/FALSE answers, this other version might do it.
aggregate(Cycle ~ Patient, data=d, function(x) any(diff(x) != 1))
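In case by "holes" you mean missing values (NA) in the variables rather than gaps in Cycle, which is what your example with patients A and B suggests, something like the following might be closer. It is only a sketch: the names row_na, pat_na and with_na are made up here, and it assumes the columns to check are Variable1 and Variable2.

```r
# Same example data, rebuilt here so the snippet runs on its own
d <- read.csv(text = "Patient,Cycle,Variable1,Variable2
A,1,4,5
A,2,3,3
A,3,4,NA
B,1,6,6
B,2,NA,6
C,1,6,5
C,3,2,2", header = TRUE)

# TRUE for each row with at least one NA in the measured variables
row_na <- !complete.cases(d[, c("Variable1", "Variable2")])

# Patients that have at least one such row
pat_na <- unique(d$Patient[row_na])

# All rows of those patients, complete cycles included
with_na <- d[d$Patient %in% pat_na, ]
with_na
```

With the example data this keeps all rows of patients A and B and drops patient C.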
Hope this helps,
Rui Barradas
On 18-07-2012 20:30, Lib Gray wrote:
> Hello, I need to subset my data to only look at the parts that have "holes"
> in them. I already have a formula to get rid of inconsistencies, but now I
> need to look only at the problem data to reconfigure it. In my data set
> there are multiple "cycles" per "patient," and I want to highlight
> the patients who have a variable that was not measured every cycle.
>
> Here's a similar example of the data:
>
> Patient, Cycle, Variable1, Variable 2
> A, 1, 4, 5
> A, 2, 3, 3
> A, 3, 4, NA
> B, 1, 6, 6
> B, 2, NA, 6
> C, 1, 6, 5
> C, 3, 2, 2
>
> So in this case, I would want Patient A and Patient B, but not Patient C.
>
> Thanks!