[R] Subsetting problem data
arun
smartpink111 at yahoo.com
Wed Jul 18 22:54:43 CEST 2012
Hello,
I'm not sure I understand the question correctly.
If you want your output to include only Patients A and B,
this should work:
dat1<-read.table(text="
Patient Cycle Variable1 Variable2
A 1 4 5
A 2 3 3
A 3 4 NA
B 1 6 6
B 2 NA 6
C 1 6 5
C 3 2 2
",sep="",header=TRUE)
subset(dat1, Patient != "C")
#  Patient Cycle Variable1 Variable2
#1       A     1         4         5
#2       A     2         3         3
#3       A     3         4        NA
#4       B     1         6         6
#5       B     2        NA         6
But if you want only the rows that have NAs in either Variable1 or Variable2:
subset(dat1, is.na(Variable1) | is.na(Variable2))
#  Patient Cycle Variable1 Variable2
#3       A     3         4        NA
#5       B     2        NA         6
This helps locate the patients who have missing values.
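If the goal is every row for any patient with at least one NA (which gives A and B here without naming C by hand), one possible sketch combines the two steps. The names `vars`, `flagged` and `res` are mine, not from the original post, and I am assuming that "holes" means NA values in Variable1 or Variable2 only, not a skipped cycle number.

```r
# Example data from the original post
dat1 <- read.table(text = "
Patient Cycle Variable1 Variable2
A 1 4 5
A 2 3 3
A 3 4 NA
B 1 6 6
B 2 NA 6
C 1 6 5
C 3 2 2
", header = TRUE)

# Columns to check for missing values (an assumption; adjust to the real data)
vars <- c("Variable1", "Variable2")

# Patients with at least one NA in any of those columns
flagged <- unique(dat1$Patient[!complete.cases(dat1[, vars])])

# Keep every row belonging to a flagged patient
res <- dat1[dat1$Patient %in% flagged, ]
res
```

complete.cases() returns FALSE for any row containing an NA in the selected columns, so this should scale to more than two variables by extending `vars`.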
Hope it helps.
A.K.
----- Original Message -----
From: Lib Gray <libgray3827 at gmail.com>
To: r-help at r-project.org
Cc:
Sent: Wednesday, July 18, 2012 3:30 PM
Subject: [R] Subsetting problem data
Hello, I need to subset my data to look only at the parts that have "holes"
in them. I already have a formula to get rid of inconsistencies, but now I
need to look only at the problem data to reconfigure it. In my data set
there are multiple "cycles" per "patient," and I want to highlight
the patients who have a variable that was not measured every cycle.
Here's a similar example of the data:
Patient, Cycle, Variable1, Variable2
A, 1, 4, 5
A, 2, 3, 3
A, 3, 4, NA
B, 1, 6, 6
B, 2, NA, 6
C, 1, 6, 5
C, 3, 2, 2
So in this case, I would want Patient A and Patient B, but not Patient C.
Thanks!