[R] subsetting a data.frame
Peter Dalgaard
P.Dalgaard at biostat.ku.dk
Wed Oct 10 16:56:01 CEST 2007
jim holtman wrote:
> Is this what you want?
>
> > x <- read.table(textConnection("Score Name
> + 88 000019_0070
> + 88 000019_0070
> + 87 000019_0070
> + 79 002127_0658
> + 79 002127_0658
> + 77 002127_0658"), header=TRUE)
> > # return best scores
> > best <- by(x, x$Name, function(.nam){
> + .nam[which(.nam$Score == max(.nam$Score)),]
> + })
> > do.call('rbind', best)
>               Score        Name
> 000019_0070.1    88 000019_0070
> 000019_0070.2    88 000019_0070
> 002127_0658.4    79 002127_0658
> 002127_0658.5    79 002127_0658
>
Or (same idea, really):
> do.call(rbind,lapply(split(d, d$Name), subset, Score==max(Score)))
              Score        Name
000019_0070.1    88 000019_0070
000019_0070.2    88 000019_0070
002127_0658.4    79 002127_0658
002127_0658.5    79 002127_0658
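
For anyone who wants to run this as it stands: `d` is not defined in this message, so the sketch below assumes it holds the same six rows as `x` in the quoted example. It also swaps `subset` for an explicit anonymous function, which does the same filtering without depending on how `subset` picks up its unevaluated condition through `lapply`. The names `pieces` and `best` are only illustrative.

```r
# Assumption: d has the same contents as x in the quoted example.
d <- read.table(text = "Score Name
88 000019_0070
88 000019_0070
87 000019_0070
79 002127_0658
79 002127_0658
77 002127_0658", header = TRUE, colClasses = c("integer", "character"))

# One data frame per Name.
pieces <- split(d, d$Name)

# Keep the top-scoring rows (ties included) of each piece,
# then stack the pieces back into a single data frame.
best <- do.call(rbind, lapply(pieces, function(g) g[g$Score == max(g$Score), ]))
best
```

Because the pieces are stacked group by group, the result comes back ordered by `Name` rather than in the original row order.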
Another idea, with the advantage of leaving data in the original order:
> ix <- d$Score == ave(d$Score, d$Name, FUN=max)
> d[ix,]
  Score        Name
1    88 000019_0070
2    88 000019_0070
4    79 002127_0658
5    79 002127_0658
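
The order is preserved because `ave` returns a vector as long as `d$Score`, holding each row's group maximum, so the comparison happens row by row and nothing is split or re-stacked. A small sketch of the intermediate step, assuming `d` has the same contents as `x` in the quoted example (the name `group_max` is only illustrative):

```r
# Assumption: d has the same contents as x in the quoted example.
d <- read.table(text = "Score Name
88 000019_0070
88 000019_0070
87 000019_0070
79 002127_0658
79 002127_0658
77 002127_0658", header = TRUE, colClasses = c("integer", "character"))

# ave() applies FUN within each Name group and returns the result
# aligned with the original rows (FUN must be passed by name).
group_max <- ave(d$Score, d$Name, FUN = max)
group_max

# Rows whose score equals their group's maximum, in original order.
top <- d[d$Score == group_max, ]
top
```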
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907