[R-sig-Geo] Counting points within polygons
Edzer Pebesma
edzer.pebesma at uni-muenster.de
Tue Oct 18 21:30:15 CEST 2011
On 10/18/2011 06:32 PM, Agustin Lobo wrote:
> I'm afraid I do not understand well the help page.
> It seems to me that what I need is
> "x = "SpatialPointsDataFrame", y = "SpatialPolygons"
> equal to the previous method, except that an argument fn=xxx is
> allowed, e.g. fn = mean which will then report a data.frame with the
> mean values of the x points falling in each polygon (set) of y"
No, you need the other way around -- this is what I got wrong with
overlay, long ago, where argument order did not matter. over(a,b) is the
opposite of over(b,a)
The text of the "over" documentation is hard to read, and I find the
concept hard to explain in words, without giving examples. That's why I
wrote the vignette.
> Thus:
>> delme4<- over(fireLC2[,5], BorneoLC,fn=sum)
> where
>> BorneoLC at data[1,]
> LC ID LC_2 LCname LCabrv AREA PERIMETER
> 0 4 4505 4 Lowland Forest LwF 7.7e-05 0.036385
> and
>> fireLC2 at data[1,]
> LC ID LCname LCabrv nbp
> 1 7 1 Plantation_Secondary Forest PltSF 1
>
> but I get
> Error in FUN(X[[1L]], ...) :
> only defined on a data frame with all numeric variables
> Calls: over ... do.call -> lapply -> Summary.data.frame -> lapply -> FUN
>
It would be helpful if you could provide a reproducible script that
leads to this error, or even the command that gave you this error.
> which I do not understand because fireLC2[,5] is numeric
> (suggestion: could the user define the field(s) for which he wants fn
> being applied? )
How about passing an object with numeric attributes (selected)?
> Beyond that, where the help page says:
> "...which will then report a data.frame with the mean values of the x
> points falling in each polygon (set) of y",
> what happens if there is no points falling for a given polygon? Would
> the record be eliminated or kept as NA ?
>> From my point of view, having the option of getting a data.frame with
> the same nb. of rows and same ordering as
> the input Sp Polygons object would be an advantage.
That is why over was written to replace overlay.
I modified the example in "over" to illustrate your two cases:
############ start script
require(sp)
r1 = cbind(c(180114, 180553, 181127, 181477, 181294, 181007, 180409,
180162, 180114), c(332349, 332057, 332342, 333250, 333558, 333676,
332618, 332413, 332349))
r2 = cbind(c(180042, 180545, 180553, 180314, 179955, 179142, 179437,
179524, 179979, 180042), c(332373, 332026, 331426, 330889, 330683,
331133, 331623, 332152, 332357, 332373))
r3 = cbind(c(179110, 179907, 180433, 180712, 180752, 180329, 179875,
179668, 179572, 179269, 178879, 178600, 178544, 179046, 179110),
c(331086, 330620, 330494, 330265, 330075, 330233, 330336, 330004,
329783, 329665, 329720, 329933, 330478, 331062, 331086))
r4 = cbind(c(180304, 180403,179632,179420,180304),
c(332791, 333204, 333635, 333058, 332791))
sr1=Polygons(list(Polygon(r1)),"r1")
sr2=Polygons(list(Polygon(r2)),"r2")
sr3=Polygons(list(Polygon(r3)),"r3")
sr4=Polygons(list(Polygon(r4)),"r4")
sr=SpatialPolygons(list(sr1,sr2,sr3,sr4))
srdf=SpatialPolygonsDataFrame(sr, data.frame(cbind(1:4,5:2),
row.names=c("r1","r2","r3","r4")))
data(meuse)
coordinates(meuse) = ~x+y
plot(meuse)
polygon(r1)
polygon(r2)
polygon(r3)
polygon(r4)
# retrieve mean heavy metal concentrations per polygon:
# attribute means over each polygon, NA for empty
over(sr, meuse[,1:4], fn = mean)
# return the number of points in each polygon:
sapply(over(sr, geometry(meuse), returnList = TRUE), length)
######## end script
Get some grip on "over". Forget overlay, it will get deprecated. "over"
doesn't stop with sp; you'll find further "over" methods in packages
rgeos and spacetime.
Hth,
--
Edzer Pebesma
Institute for Geoinformatics (ifgi), University of Münster
Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251
8333081, Fax: +49 251 8339763 http://ifgi.uni-muenster.de
http://www.52north.org/geostatistics e.pebesma at wwu.de
More information about the R-sig-Geo
mailing list