====== Neighbors ====== * http://www.esri.com/news/arcuser/0401/topo.html * may be as binarized weights matrix (page 6) http://www.s4.brown.edu/s4/training/modul2/GeoDa2.pdf * http://gis.stackexchange.com/questions/17457/efficiently-finding-the-1st-order-neighbors-of-200k-polygons * http://www.bias-project.org.uk/ASDARcourse/ ; http://www.bias-project.org.uk/ASDARcourse/unit6_slides.pdf The solution is copied from the slides from the "Introduction to R" workshop, Wednesday, Thursday, & Friday, January 5- 7, 2011, IBS, Colorado, from slides on "Autocorrelation and Spatial Regression" prepared by Elisabeth Root. ===== Elisabeth Root's Solution ===== Neighborhoods can be defined in a number of ways * Contiguity (common boundary). What is a “shared” boundary? * Distance (distance band, K-nearest neighbors). How many “neighbors” to include, what distance do we use? * General weights (social distance, distance decay) > install.packages("ctv") > library("ctv") > install.views("Spatial") > library(maptools) > library(rgdal) > library(spdep) > library(maptools) > getinfo.shape("C:/Users/ERoot/Desktop/R/sids2.shp") Shapefile type: Polygon, (5), # of Shapes: 100 > sids<-readShapePoly ("C:/Users/ERoot/Desktop/R/sids2.shp") > class(sids) [1] "SpatialPolygonsDataFrame" attr(,"package") > library(rgdal) > sids<-readOGR dsn="C:/Users/ERoot/Desktop/R",layer="sids2") OGR data source with driver: ESRI Shapefile Source: "C:/Users/Elisabeth Root/Desktop/Quant/R/shapefiles", layer: "sids2" with 100 features and 18 fields Feature type: wkbPolygon with 2 dimensions > class(sids) [1] "SpatialPolygonsDataFrame" attr(,"package") [1] "sp" If the shapefile has no .prj file associated with it, you need to assign a coordinate system > proj4string(sids)<-CRS("+proj=longlat ellps=WGS84") We can then transform the map into any projection > sids_NAD<-spTransform(sids, CRS("+init=epsg:3358")) > sids_SP<-spTransform(sids, CRS("+init=ESRI:102719")) For a list of applicable CRS codes: http://www.spatialreference.org/ref/ Stick with the epsg and esri codes Areas sharing any boundary point (QUEEN) are taken as neighbors, using the poly2nb function, which accepts a SpatialPolygonsDataFrame > library(spdep) > sids_nbq<-poly2nb(sids) If contiguity is defined as areas sharing more than one boundary point (ROOK), the queen= argument is set to FALSE > sids_nbr<-poly2nb(sids, queen=FALSE) > coords<-coordinates(sids) > plot(sids) > plot(sids_nbq, coords, add=T) Can also choose the k nearest points as neighbors > coords<-coordinates(sids_SP) > IDs<-row.names(as(sids_SP, "data.frame")) > sids_kn1<-knn2nb(knearneigh(coords, k=1), row.names=IDs) > sids_kn2<-knn2nb(knearneigh(coords, k=2), row.names=IDs) > sids_kn4<-knn2nb(knearneigh(coords, k=4), row.names=IDs) > plot(sids_SP) > plot(sids_kn2, coords, add=T) Can also assign neighbors based on a specified distance > dist<-unlist(nbdists(sids_kn1, coords)) > summary(dist) Min. 1st Qu. Median Mean 3rd Qu. Max. 40100 89770 97640 96290 107200 134600 > max_k1<-max(dist) > sids_kd1<-dnearneigh(coords, d1=0, d2=0.75*max_k1, row.names=IDs) > sids_kd2<-dnearneigh(coords, d1=0, d2=1*max_k1, row.names=IDs) > sids_kd3<-dnearneigh(coords, d1=0, d2=1.5*max_k1, row.names=IDs) OR by raw distance > sids_ran1<-dnearneigh(coords, d1=0, d2=134600, row.names=IDs) Binary weights > sids_nbq_wb<-nb2listw(sids_nbq, style="B") > sids_nbq_wb Characteristics of weights list: Neighbour list object: Number of regions: 100 Number of nonzero links: 490 Percentage nonzero weights: 4.9 Average number of links: 4.9 Weights style: B Weights constants summary: n nn S0 S1 S2 B 100 10000 490 980 10696 If you ever get the following error: Error in nb2listw(filename): Empty neighbor sets found You have some regions that have NO neighbors This is most likely an artifact of your GIS data (digitizing errors, slivers, etc), which you should fix in a GIS. Also could have “true” islands (e.g., Hawaii, San Juans in WA). May want to use k nearest neighbors. Or add ''zero.policy=T'' to the ''nb2listw'' call: > sids_nbq_w<-nb2listw(sids_nbq, zero.policy=T) ===== sp2Pajek ===== To produce a Pajek's network file of neighbors we need: names of nodes (''IDs'') and their coordinates (''coords'') and the lists of neighbors (''sids_nbq''): > coords<-coordinates(sids_SP) > names(sids_SP) > IDs<-as.character(sids_SP$NAME) > plot(sids_SP,col="lightyellow",border="red",bg="lightcyan",axes=TRUE) > points(coords,pch=16,col="blue",cex=0.2) > text(coords,IDs,cex=0.3) > net <- file("sosedi.net","w") > n <- length(sids_nbq); L <- card(sids_nbq) > cat("*vertices",n,"\n",file=net) > for(i in 1:n) cat(i,' "',IDs[i],'" ',coords[i,1],' ',coords[i,2],' 0.5\n',sep='',file=net) > cat("*edgeslist\n",file=net) > for(i in 1:n) if(L[i]>0) cat(i,sids_nbq[[i]],"\n",file=net) > close(net) We pack the solution into ''sp2Pajek'' procedure sp2Pajek <- function(sp,file="neighbors.net",name=0,queen=TRUE){ library(spdep) nbs <- poly2nb(sp,queen=queen) n <- length(nbs); L <- card(nbs) xy <- coordinates(sp) IDs <- as.character(if(name>0) sp[[name]] else 1:n) net <- file(file,"w") cat("% sp2Pajek:",date(),"\n*vertices",n,"\n",file=net) for(i in 1:n) cat(i,' "',IDs[i],'" ',xy[i,1],' ',xy[i,2],' 0.5\n',sep='',file=net) cat("*edgeslist\n",file=net) for(i in 1:n) if(L[i]>0) cat(i,nbs[[i]],"\n",file=net) close(net) } To read the shape file we can use either the library ''maptools'' > library(maptools) > setwd("D:/data/counties/shape/") > sids<-readShapePoly("D:/data/counties/shape/data/sids2.shp") > source("D:\\Data\\counties\\shape\\sp2Pajek.R") > sp2Pajek(sids,file="t.net",name=5) > sp2Pajek(sids,file="test.net",name="NAME") or the library ''rgdal'' > library(rgdal) > setwd("D:/data/counties/shape/") > USst <- readOGR(dsn="D:/data/counties/9nations/USA",layer="USA_adm1") > source("D:\\Data\\counties\\shape\\sp2Pajek.R") > cat(date(),"\n"); sp2Pajek(USst,name=5,file="USst.net"); cat(date(),"\n") Fri Dec 30 09:34:31 2011 Fri Dec 30 10:46:58 2011 For larger shape files, as we see, it can take some time to get the results. > setwd("C:/users/Batagelj/test/R/maps") > SIcom <- readShapeSpatial("OB.shp") > cat(date(),"\n"); sp2Pajek(SIcom,name="OB_UIME",file="OB11.net"); cat(date(),"\n") Fri Dec 30 14:16:10 2011 Fri Dec 30 14:16:16 2011 ===== shp2Pajek ===== We can also produce a two-relational network of neighbors with ''rook'' relation and with additional ''queen'' links. shp2Pajek <- function(pl,file="neighbors.net",name=0){ library(spdep); cat("shp2Pajek:",date(),"/"); flush.console() nbr<-poly2nb(pl,queen=FALSE); nbq<-poly2nb(pl,queen=TRUE) n <- length(nbr); L <- card(nbr) xy <- coordinates(pl) IDs <- as.character(if(name>0) pl[[name]] else 1:n) net <- file(file,"w") cat("% shp2Pajek:",date(),"\n*vertices",n,"\n",file=net) for(i in 1:n) cat(i,' "',IDs[i],'" ',xy[i,1],' ',xy[i,2],' 0.5\n',sep='',file=net) cat('*edgeslist :1 "rook"\n',file=net) for(i in 1:n) if(L[i]>0) cat(i,nbr[[i]],"\n",file=net) cat('*edgeslist :2 "queen"\n',file=net) L <- card(nbq)-card(nbr) for(i in 1:n) if(L[i]>0) cat(i,setdiff(nbq[[i]],nbr[[i]]),"\n",file=net) cat(date(),"\n"); close(net) } > library(maptools); setwd("C:/users/Batagelj/test/R/maps") > SIcom <- readShapeSpatial("OB.shp") > shp2Pajek(SIcom,name="OB_UIME",file="OB11.net") shp2Pajek: Fri Dec 30 23:53:57 2011 /Fri Dec 30 23:54:08 2011 ===== Selecting subregion ===== > P <- c("Koper","Izola","Piran","Hrpelje-Kozina","Divača","Sežana","Ilirska Bistrica", + "Pivka","Vipava","Postojna","Ajdovščina","Komen") > pri <- SIcom[SIcom$OB_UIME %in% P,] > plot(pri)