The current NUTS 2013 classification is valid from 1 January 2015 and lists 98 regions at NUTS 1, 276 regions at NUTS 2 and 1342 regions at NUTS 3 level.
je Vladimir Batagelj vladimir.batagelj@fmf.uni-lj.si napisal/-a:
Zgleda, da je na EUROSTATu mogoče dobiti piramide za NUTS 3.
Zgleda, da je mogoče dobiti tudi zemljevide
Morda bi lahko poskusili stvar združiti s simbolnim razvrščanjam z relacijsko omejitvijo. Naš pristop združimo z razvrščanjam z relacijsko omejitvijo.
Mislim, da ne bi smelo biti težav (razen morda nemonotonost/obrati) - le nekaj dela.
Mislim, da lahko prijavimo tudi ta prispevek. Vidim dve poti. V obeh primerih je potrebno:
Prva pot:
Druga pot (če bo več časa, ali za kasneje):
Morda bi za srečanje obljubili le prvo pot. Če bomo s časom zelo na tesnem, lahko vse skupaj naredimo le za US counties. Glede na to, da s Simono prijavljava druga prispevka, bi pri tem prispevku bila prvi avtor Nataša - v navadi je, da vsak udeleženec srečanja nastopi le enkrat kot prvi avtor.
Transforming EuroStat NUTS 3 population pyramids data into data frames.
In the file demo_r_pjangrp3.tsv
I first replaced all ”,” and “\t” with ”;”. I also corrected the header and saved the content to the file NUTSpopyr.csv
. Using a short program in R
> setwd("C:/Users/batagelj/data/popPyr") > T <- read.csv2("NUTSpopyr.csv",colClasses="character") > age <- c("Y_LT5", "Y5-9", "Y10-14", "Y15-19", "Y20-24", "Y25-29", "Y30-34", "Y35-39", + "Y40-44", "Y45-49", "Y50-54", "Y55-59", "Y60-64", "Y65-69", "Y70-74", "Y75-79", + "Y80-84", "Y85-89", "Y_GE85", "Y_GE90", "TOTAL", "UNK") > a <- factor(T$age,levels=age,ordered=TRUE) > g <- factor(T$geo); s <- factor(T$sex) > m <- length(levels(g)); k <- length(levels(a)); n <- nrow(T) > F <- matrix(0,nrow=m,ncol=k); M <- matrix(0,nrow=m,ncol=k) > for(i in 1:n){ + if(s[i]=="T") next + freq <- as.numeric(unlist(strsplit(T$y2016[i]," "))[1]) + j <- as.integer(g[i]); b <- as.integer(a[i]) + if(s[i]=="F") F[j,b] <- freq else M[j,b] <- freq + } > na <- levels(a); ng <- levels(g) > rownames(F) <- ng; colnames(F) <- na > rownames(M) <- ng; colnames(M) <- na > s <- nchar(ng)==5 > F3 <- F[s,]; M3 <- M[s,] > save(F3,M3,file="NUTSpopyr.Rdata") > F3df <- as.data.frame(F3); M3df <- as.data.frame(M3) > write.csv2(F3df,file="F3.csv"); write.csv2(M3df,file="M3.csv")
I produced two matrices F3 (female) and M3 (male) with pyramids. Rows represent NUTS3-units and columns correspond to age classes. They are saved on the file NUTSpopyr.Rdata
. Afterward I transformed both matrices into corresponding data frames F3df and M3df and saved them on files F3.csv
and M3.csv
.
The data are relatively complete. Only data for Albania (Excel lines 2-13) and MKXXX (Macedonia, Excel line 999) are missing. In the Excel lines 852-859 (IE - Ireland ?) the data in the column T (Y_GE85) are missing. It seems that this is not a problem because Y_GE85 = Y85-90 + Y_GE90 . Therefore the columns Y_GE85, TOTAL and UNK are needless.
To do: Transform the data into Clamix format.
> library(maptools) > setwd("C:/Users/batagelj/data/popPyr/NUTS_2013_60M_SH/data") > NUTS <- readShapeSpatial("NUTS_RG_60M_2013.shp") > plot(NUTS) > names(NUTS) [1] "NUTS_ID" "STAT_LEVL_" "SHAPE_AREA" "SHAPE_LEN" > head(NUTS$NUTS_ID) [1] AT AT1 AT11 AT111 AT112 AT113 1951 Levels: AT AT1 AT11 AT111 AT112 AT113 AT12 AT121 AT122 AT123 AT124 ... UKN05 > pdf("NUTS3pop.pdf",width=11.7,height=8.3,paper="a4r") > plot(NUTS,xlim=c(-24,44),ylim=c(28,70),lwd=0.05,bg="skyblue",col="wheat",main="NUTS 3") > dev.off() > SEPA <- readShapeSpatial("NUTS_SEPA_LI_2013.shp") > plot(SEPA) > LB <- readShapeSpatial("NUTS_LB_2013.shp") > plot(LB) > plot(LB,main="Labels",lwd=0.05) > BN <- readShapeSpatial("NUTS_BN_60M_2013.shp") > pdf("BNpop.pdf",width=11.7,height=8.3,paper="a4r") > plot(BN,xlim=c(-24,44),ylim=c(28,70),lwd=0.05,bg="skyblue",main="NUTS 3 / BN") > dev.off()
> proj4string(NUTS) [1] NA > proj4string(NUTS) <- "+proj=longlat +ellps=GRS80 +no_defs" > EU <- spTransform(NUTS, + CRS("+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 + +units=m +nadgrids=@null +no_defs")) > plot(EU) > bbox(EU) min max x -6880205 6214499 y -2433677 11445837 > bbox(NUTS) min max x -61.80593 55.8258 y -21.35013 71.1270 > proj4string(NUTS) [1] "+proj=longlat +ellps=GRS80 +no_defs" > proj4string(EU) [1] "+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs" > plot(NUTS,xlim=c(-24,44),ylim=c(28,70),lwd=0.05,bg="skyblue",col="white",main="NUTS3") > plot(EU,xlim=c(-2100000,4600000),ylim=c(3200000,11200000),lwd=0.05, + bg="skyblue",col="white",main="NUTS3 - Google")
> CountryBorder <- NUTS[NUTS@data$STAT_LEVL_ == 0, ] > pdf("EUcountries.pdf",width=11.7,height=8.3,paper="a4r") > plot(CountryBorder,xlim=c(-24,44),ylim=c(28,70),lwd=0.05, + bg="skyblue",col="white",main="NUTS0 - countries") > dev.off()
> sp2Pajek <- function(sp,file="neighbors.net",name=0,queen=TRUE){ + library(spdep) + nbs <- poly2nb(sp,queen=queen) + n <- length(nbs); L <- card(nbs) + xy <- coordinates(sp) + IDs <- as.character(if(name>0) sp[[name]] else 1:n) + net <- file(file,"w") + cat("% sp2Pajek:",date(),"\n*vertices",n,"\n",file=net) + for(i in 1:n) cat(i,' "',IDs[i],'" ',xy[i,1],' ',xy[i,2],' 0.5\n',sep='',file=net) + cat("*edgeslist\n",file=net) + for(i in 1:n) if(L[i]>0) cat(i,nbs[[i]],"\n",file=net) + close(net) + } > NUTS3 <- NUTS[NUTS@data$STAT_LEVL_ == 3, ] > plot(NUTS3,xlim=c(-24,44),ylim=c(28,70),lwd=0.05,bg="skyblue",col="white",main="NUTS3") > cat(date(),"\n"); sp2Pajek(NUTS3,name="NUTS_ID",file="NUTS.net"); cat(date(),"\n") Fri Apr 28 09:12:46 2017 Fri Apr 28 09:12:47 2017
We obtain o neighbors relation on the file NUTS.net
.
In its current version the neighbors network is not connected - there exist several connected components and isolated nodes.
These are more detailed shapes data. I repeated the procedures on them.
> setwd("C:/Users/batagelj/data/popPyr/NUTS_2013_01M_SH/data") > NUTS <- readShapeSpatial("NUTS_RG_01M_2013.shp") > plot(NUTS) > pdf("NUTS3popD.pdf",width=11.7,height=8.3,paper="a4r") > plot(NUTS,xlim=c(-24,44),ylim=c(28,70),lwd=0.05,bg="skyblue",col="white",main="NUTS3 - detailed") > dev.off()
Creating neighbors network from detailed map and drawing map with labels.
> setwd("C:/Users/batagelj/data/popPyr/NUTS_2013_01M_SH/data") > NUTSd <- readShapeSpatial("NUTS_RG_01M_2013.shp") > NUTSd3 <- NUTSd[NUTSd@data$STAT_LEVL_ == 3, ] > summary(NUTSd3) Object of class SpatialPolygonsDataFrame Coordinates: min max x -63.15346 55.83663 y -21.38731 71.18532 Is projected: NA proj4string : [NA] Data attributes: NUTS_ID STAT_LEVL_ SHAPE_AREA SHAPE_LEN AT111 : 1 Min. :3 Min. : 0.001292 Min. : 0.1503 AT112 : 1 1st Qu.:3 1st Qu.: 0.090017 1st Qu.: 2.0010 AT113 : 1 Median :3 Median : 0.222539 Median : 3.3631 AT121 : 1 Mean :3 Mean : 0.507958 Mean : 4.7413 AT122 : 1 3rd Qu.:3 3rd Qu.: 0.623426 3rd Qu.: 5.5373 AT123 : 1 Max. :3 Max. :21.726085 Max. :155.9323 (Other):1474 > plot(NUTSd3,xlim=c(-24,44),ylim=c(28,70),lwd=0.05,bg="skyblue",col="white",main="NUTS3 detailed") > cat(date(),"\n"); sp2Pajek(NUTSd3,name="NUTS_ID",file="NUTSd.net"); cat(date(),"\n") > P <- read.csv("weak.clu",head=FALSE,skip=1)$V1 > length(P) [1] 1480 > head(P) [1] 1 1 1 1 1 1 > table(P) P 0 1 2 3 4 5 6 7 8 9 10 40 1177 4 5 57 2 8 13 2 9 163 > > library(RColorBrewer) > pal <- brewer.pal(n = 12, name = "Set3") > pdf("NUTSd3lab.pdf",width=11.7,height=8.3,paper="a4r") > plot(NUTSd3,xlim=c(-24,44),ylim=c(28,70),lwd=0.05,bg="skyblue",col=pal[P+1]) > title("NUTS3 - detailed") > text(coordinates(NUTSd3),labels=as.character(NUTSd3$NUTS_ID),cex=0.1) > dev.off()
In the network NUTSd.net
every edge is listed twice. We remove duplicated edges using Pajek and save the network to the file NUTSdSimple.net
. On the file weak.clu
the partition of nontrivial weak components is saved. All isolated nodes are in the class 0. I manually constructed additional edges collected in a relation 3:“sea links” (see file sea.txt
) that make the network connected. The expanded network has name NUTSdSea.net
.
nutsd3lab.pdf, nutsdsimple.zip, nutsdsea.pdf
regular neighbors - blue; sea links - red.
> NUTSd0 <- NUTSd[NUTSd@data$STAT_LEVL_ == 0, ] > labs <- as.character(NUTSd0$NUTS_ID) > labs [1] "AT" "BE" "BG" "CY" "CZ" "CH" "DE" "DK" "ES" "EL" "EE" "FR" "FI" "IS" "IT" [16] "HU" "IE" "HR" "NO" "ME" "MT" "NL" "MK" "LU" "LI" "LT" "LV" "PL" "SE" "RO" [31] "PT" "TR" "SK" "SI" "UK" > colors <- c("green","blue","blue","red","yellow","blue","red","green", + "blue","green","green","green","blue","green","yellow","blue","red", + "green","red","blue","red","yellow","red","yellow","yellow","green", + "red","blue","yellow","yellow","red","yellow","red","red","yellow") > pdf("NUTS0.pdf",width=11.7,height=8.3,paper="a4r") > plot(NUTSd0,xlim=c(-24,44),ylim=c(28,70),lwd=0.1,bg="skyblue",col=colors) > title("NUTS0 / Europe - detailed") > text(coordinates(NUTSd0),labels=labs,cex=0.3) > dev.off()