[R] How to conditionally remove dataframe rows?
Marc Schwartz
marc_schwartz at me.com
Thu Mar 7 14:43:07 CET 2013
Just to add another option to what Arun has provided below. That approach generalizes well to data frames with more than 2 columns, where you want to filter based upon finding a maximum value (or perhaps other, more complex criteria) within one or more grouping columns and return all of the columns in the original data frame.
In this special case of a two-column data frame, you can use ?aggregate easily with a formula-based approach that might be easier to read. aggregate() essentially encapsulates what Arun has done below.
Thus:
> DF
  Point_counts Psi_Sp
1            A      0
2            A      1
3            B      1
4            B      2
5            B      0
6            C      1
7            D      1
8            D      2

> aggregate(Psi_Sp ~ Point_counts, data = DF, max)
  Point_counts Psi_Sp
1            A      1
2            B      2
3            C      1
4            D      2
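
If the data frame has additional columns that need to be carried along, one possible way to keep the aggregate() call above is to merge its result back onto the original data. This is only a sketch: the Visit column is a hypothetical extra column added for illustration and is not part of the original data. The ave() variant at the end is an alternative technique, not what aggregate() does internally.

```r
# Rebuild the example data. 'Visit' is a hypothetical third column,
# added only to show that the extra columns survive the filtering.
DF <- data.frame(
  Point_counts = c("A", "A", "B", "B", "B", "C", "D", "D"),
  Psi_Sp       = c(0, 1, 1, 2, 0, 1, 1, 2),
  Visit        = 1:8,
  stringsAsFactors = FALSE
)

# Per-group maximum of Psi_Sp, as in the aggregate() call above
mx <- aggregate(Psi_Sp ~ Point_counts, data = DF, FUN = max)

# Merge back on both columns to recover the full rows of DF
res <- merge(DF, mx, by = c("Point_counts", "Psi_Sp"))

# Alternative: flag the rows that equal their group maximum with ave()
keep <- DF$Psi_Sp == ave(DF$Psi_Sp, DF$Point_counts, FUN = max)
res2 <- DF[keep, ]
```

Note that both variants keep every row tied for the group maximum, whereas which.max() in Arun's code keeps only the first such row per group.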
Regards,
Marc Schwartz
On Mar 6, 2013, at 8:42 PM, arun <smartpink111 at yahoo.com> wrote:
> Hi,
>
> dfrm<- read.table(text="
> Point_counts Psi_Sp
>
> 1 A 0
> 2 A 1
> 3 B 1
> 4 B 2
> 5 B 0
> 6 C 1
> 7 D 1
> 8 D 2
> ",sep="",header=TRUE,stringsAsFactors=FALSE)
> res<-do.call(rbind,lapply(split(dfrm,dfrm$Point_counts),function(x) x[which.max(x$Psi_Sp),]))
> row.names(res)<-1:nrow(res)
> # Point_counts Psi_Sp
> #1 A 1
> #2 B 2
> #3 C 1 #your input data doesn't have 0
> #4 D 2
> A.K.
>
>
>
> ----- Original Message -----
> From: Francisco Carvalho Diniz <chicocdiniz at gmail.com>
> To: r-help at r-project.org
> Cc:
> Sent: Wednesday, March 6, 2013 6:21 PM
> Subject: [R] Fwd: How to conditionally remove dataframe rows?
>
> Hi,
>
> I have a data frame with two columns. I need to remove rows with duplicated
> values in the first column, but I need to do it conditionally on the values
> of the second column.
>
> Example:
>
> Point_counts Psi_Sp
>
> 1 A 0
> 2 A 1
> 3 B 1
> 4 B 2
> 5 B 0
> 6 C 1
> 7 D 1
> 8 D 2
>
>
> I need to turn this data frame into one without duplicated values in
> Point_counts (one visit per point), keeping the rows with the maximum value
> of Psi_Sp, e.g. remove row 1 and keep row 2, or remove rows 3 and 5 and
> keep row 4. At the end I want a data frame like the one below:
>
> Point_counts Psi_Sp
>
> 1 A 1
> 2 B 2
> 3 C 0
> 4 D 2
>
> How can I do it? I found several ways to edit data frames, but
> unfortunately I could not use any of them.
>
> I appreciate any help.
>
> Francisco