====== NUTS ====== The current NUTS 2013 classification is valid from 1 January 2015 and lists 98 regions at NUTS 1, 276 regions at NUTS 2 and 1342 regions at NUTS 3 level. * [[http://ec.europa.eu/eurostat/web/nuts/overview|NUTS - overview]] * [[http://ec.europa.eu/eurostat/web/nuts/nuts-maps-.pdf-|NUTS - Nomenclature of Territorial Units for Statistics]] * [[http://vladowiki.fmf.uni-lj.si/doku.php?id=notes:rmap|Maps in R]] ===== 17. april 2017 ob 13:00 ===== je Vladimir Batagelj napisal/-a: Zgleda, da je na EUROSTATu mogoče dobiti piramide za NUTS 3. * http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing * http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?sort=1&file=BulkDownload_Guidelines.pdf * http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?sort=1&file=table_of_contents_en.pdf * http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?file=data/demo_r_pjangrp3.tsv.gz * http://ec.europa.eu/eurostat/statistics-explained/index.php/Population_statistics_at_regional_level Zgleda, da je mogoče dobiti tudi zemljevide * http://ec.europa.eu/eurostat/web/gisco/geodata/reference-data/administrative-units-statistical-units/nuts Morda bi lahko poskusili stvar združiti s simbolnim razvrščanjam z relacijsko omejitvijo. Naš pristop združimo z razvrščanjam z relacijsko omejitvijo. * http://vlado.fmf.uni-lj.si/vlado/papers/CluRelCon.pdf * http://vlado.fmf.uni-lj.si/vlado/papers/stCluRC.pdf * http://journals.aps.org/pre/abstract/10.1103/PhysRevE.91.062814 Mislim, da ne bi smelo biti težav (razen morda nemonotonost/obrati) - le nekaj dela. ===== 21. april 2017 ob 12:05 ===== Mislim, da lahko prijavimo tudi ta prispevek. Vidim dve poti. V obeh primerih je potrebno: - pretvoriti podatke v simbolno obliko (za US counties že imamo, za Evropo še ne) - ustvariti relacijo sosednosti (za US counties jo že imam, za Evropo jo dobimo iz datoteke shape (opis zemljevida) [[notes:net:shape|net from shape]] Prva pot: - relacijo sosednosti predelamo v uteženo Pajkovo omrežje, kjer so uteži simbolne različnosti med krajišči povezav. Napisati je potrebno ustrezno rutino v R ali Pythonu. - nad omrežjem uporabimo razvrščanje z omejitvami iz Pajka - opisano v naši knjigi Understanding ... Če je čas, lahko stvar sprogramiramo tudi v R-ju (po mojem ne bi smelo biti več kot 50 vrstic programa). - predstavnike lahko naračunamo naknadno iz dobljene (hierarhične) razvrstitve - dodatna funkcija v R-ju. Druga pot (če bo več časa, ali za kasneje): - razvijemo postopek, ki uporablja našo kriterijsko funkcijo tudi za razvrščanje in sproti določa predstavnike. Morda bi za srečanje obljubili le prvo pot. Če bomo s časom zelo na tesnem, lahko vse skupaj naredimo le za US counties. Glede na to, da s Simono prijavljava druga prispevka, bi pri tem prispevku bila prvi avtor Nataša - v navadi je, da vsak udeleženec srečanja nastopi le enkrat kot prvi avtor. ===== 26-28. april 2017 ===== ==== Transforming EuroStat NUTS 3 ==== Transforming EuroStat NUTS 3 population pyramids data into data frames. In the file ''demo_r_pjangrp3.tsv'' I first replaced all "," and "\t" with ";". I also corrected the header and saved the content to the file ''NUTSpopyr.csv''. Using a short program in R > setwd("C:/Users/batagelj/data/popPyr") > T <- read.csv2("NUTSpopyr.csv",colClasses="character") > age <- c("Y_LT5", "Y5-9", "Y10-14", "Y15-19", "Y20-24", "Y25-29", "Y30-34", "Y35-39", + "Y40-44", "Y45-49", "Y50-54", "Y55-59", "Y60-64", "Y65-69", "Y70-74", "Y75-79", + "Y80-84", "Y85-89", "Y_GE85", "Y_GE90", "TOTAL", "UNK") > a <- factor(T$age,levels=age,ordered=TRUE) > g <- factor(T$geo); s <- factor(T$sex) > m <- length(levels(g)); k <- length(levels(a)); n <- nrow(T) > F <- matrix(0,nrow=m,ncol=k); M <- matrix(0,nrow=m,ncol=k) > for(i in 1:n){ + if(s[i]=="T") next + freq <- as.numeric(unlist(strsplit(T$y2016[i]," "))[1]) + j <- as.integer(g[i]); b <- as.integer(a[i]) + if(s[i]=="F") F[j,b] <- freq else M[j,b] <- freq + } > na <- levels(a); ng <- levels(g) > rownames(F) <- ng; colnames(F) <- na > rownames(M) <- ng; colnames(M) <- na > s <- nchar(ng)==5 > F3 <- F[s,]; M3 <- M[s,] > save(F3,M3,file="NUTSpopyr.Rdata") > F3df <- as.data.frame(F3); M3df <- as.data.frame(M3) > write.csv2(F3df,file="F3.csv"); write.csv2(M3df,file="M3.csv") I produced two matrices F3 (female) and M3 (male) with pyramids. Rows represent NUTS3-units and columns correspond to age classes. They are saved on the file ''NUTSpopyr.Rdata''. Afterward I transformed both matrices into corresponding data frames F3df and M3df and saved them on files ''F3.csv'' and ''M3.csv''. The data are relatively complete. Only data for Albania (Excel lines 2-13) and MKXXX (Macedonia, Excel line 999) are missing. In the Excel lines 852-859 (IE - Ireland ?) the data in the column T (Y_GE85) are missing. It seems that this is not a problem because Y_GE85 = Y85-90 + Y_GE90 . Therefore the columns Y_GE85, TOTAL and UNK are needless. {{notes:da:data:nutspopyr.zip}} **To do:** Transform the data into Clamix format. ==== Drawing maps and creating neigbours relation ==== See also http://vladowiki.fmf.uni-lj.si/doku.php?id=notes:net:shape === NUTS_2013_60M_SH === > library(maptools) > setwd("C:/Users/batagelj/data/popPyr/NUTS_2013_60M_SH/data") > NUTS <- readShapeSpatial("NUTS_RG_60M_2013.shp") > plot(NUTS) > names(NUTS) [1] "NUTS_ID" "STAT_LEVL_" "SHAPE_AREA" "SHAPE_LEN" > head(NUTS$NUTS_ID) [1] AT AT1 AT11 AT111 AT112 AT113 1951 Levels: AT AT1 AT11 AT111 AT112 AT113 AT12 AT121 AT122 AT123 AT124 ... UKN05 > pdf("NUTS3pop.pdf",width=11.7,height=8.3,paper="a4r") > plot(NUTS,xlim=c(-24,44),ylim=c(28,70),lwd=0.05,bg="skyblue",col="wheat",main="NUTS 3") > dev.off() > SEPA <- readShapeSpatial("NUTS_SEPA_LI_2013.shp") > plot(SEPA) > LB <- readShapeSpatial("NUTS_LB_2013.shp") > plot(LB) > plot(LB,main="Labels",lwd=0.05) > BN <- readShapeSpatial("NUTS_BN_60M_2013.shp") > pdf("BNpop.pdf",width=11.7,height=8.3,paper="a4r") > plot(BN,xlim=c(-24,44),ylim=c(28,70),lwd=0.05,bg="skyblue",main="NUTS 3 / BN") > dev.off() {{notes:da:pdf:nuts3pop.pdf}}, {{notes:da:pdf:labels.pdf}}, {{notes:da:pdf:bnpop.pdf}} == Changing projection == > proj4string(NUTS) [1] NA > proj4string(NUTS) <- "+proj=longlat +ellps=GRS80 +no_defs" > EU <- spTransform(NUTS, + CRS("+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 + +units=m +nadgrids=@null +no_defs")) > plot(EU) > bbox(EU) min max x -6880205 6214499 y -2433677 11445837 > bbox(NUTS) min max x -61.80593 55.8258 y -21.35013 71.1270 > proj4string(NUTS) [1] "+proj=longlat +ellps=GRS80 +no_defs" > proj4string(EU) [1] "+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs" > plot(NUTS,xlim=c(-24,44),ylim=c(28,70),lwd=0.05,bg="skyblue",col="white",main="NUTS3") > plot(EU,xlim=c(-2100000,4600000),ylim=c(3200000,11200000),lwd=0.05, + bg="skyblue",col="white",main="NUTS3 - Google") == Selecting levels == > CountryBorder <- NUTS[NUTS@data$STAT_LEVL_ == 0, ] > pdf("EUcountries.pdf",width=11.7,height=8.3,paper="a4r") > plot(CountryBorder,xlim=c(-24,44),ylim=c(28,70),lwd=0.05, + bg="skyblue",col="white",main="NUTS0 - countries") > dev.off() {{notes:da:pdf:eucountries.pdf}} == Creating neighbors network == > sp2Pajek <- function(sp,file="neighbors.net",name=0,queen=TRUE){ + library(spdep) + nbs <- poly2nb(sp,queen=queen) + n <- length(nbs); L <- card(nbs) + xy <- coordinates(sp) + IDs <- as.character(if(name>0) sp[[name]] else 1:n) + net <- file(file,"w") + cat("% sp2Pajek:",date(),"\n*vertices",n,"\n",file=net) + for(i in 1:n) cat(i,' "',IDs[i],'" ',xy[i,1],' ',xy[i,2],' 0.5\n',sep='',file=net) + cat("*edgeslist\n",file=net) + for(i in 1:n) if(L[i]>0) cat(i,nbs[[i]],"\n",file=net) + close(net) + } > NUTS3 <- NUTS[NUTS@data$STAT_LEVL_ == 3, ] > plot(NUTS3,xlim=c(-24,44),ylim=c(28,70),lwd=0.05,bg="skyblue",col="white",main="NUTS3") > cat(date(),"\n"); sp2Pajek(NUTS3,name="NUTS_ID",file="NUTS.net"); cat(date(),"\n") Fri Apr 28 09:12:46 2017 Fri Apr 28 09:12:47 2017 We obtain o neighbors relation on the file ''NUTS.net''. {{notes:da:data:nutsnet.zip}} In its current version the neighbors network is not connected - there exist several connected components and isolated nodes. === NUTS_2013_01M_SH === These are more detailed shapes data. I repeated the procedures on them. > setwd("C:/Users/batagelj/data/popPyr/NUTS_2013_01M_SH/data") > NUTS <- readShapeSpatial("NUTS_RG_01M_2013.shp") > plot(NUTS) > pdf("NUTS3popD.pdf",width=11.7,height=8.3,paper="a4r") > plot(NUTS,xlim=c(-24,44),ylim=c(28,70),lwd=0.05,bg="skyblue",col="white",main="NUTS3 - detailed") > dev.off() {{notes:da:pdf:nuts3popd.pdf}} Creating neighbors network from detailed map and drawing map with labels. > setwd("C:/Users/batagelj/data/popPyr/NUTS_2013_01M_SH/data") > NUTSd <- readShapeSpatial("NUTS_RG_01M_2013.shp") > NUTSd3 <- NUTSd[NUTSd@data$STAT_LEVL_ == 3, ] > summary(NUTSd3) Object of class SpatialPolygonsDataFrame Coordinates: min max x -63.15346 55.83663 y -21.38731 71.18532 Is projected: NA proj4string : [NA] Data attributes: NUTS_ID STAT_LEVL_ SHAPE_AREA SHAPE_LEN AT111 : 1 Min. :3 Min. : 0.001292 Min. : 0.1503 AT112 : 1 1st Qu.:3 1st Qu.: 0.090017 1st Qu.: 2.0010 AT113 : 1 Median :3 Median : 0.222539 Median : 3.3631 AT121 : 1 Mean :3 Mean : 0.507958 Mean : 4.7413 AT122 : 1 3rd Qu.:3 3rd Qu.: 0.623426 3rd Qu.: 5.5373 AT123 : 1 Max. :3 Max. :21.726085 Max. :155.9323 (Other):1474 > plot(NUTSd3,xlim=c(-24,44),ylim=c(28,70),lwd=0.05,bg="skyblue",col="white",main="NUTS3 detailed") > cat(date(),"\n"); sp2Pajek(NUTSd3,name="NUTS_ID",file="NUTSd.net"); cat(date(),"\n") > P <- read.csv("weak.clu",head=FALSE,skip=1)$V1 > length(P) [1] 1480 > head(P) [1] 1 1 1 1 1 1 > table(P) P 0 1 2 3 4 5 6 7 8 9 10 40 1177 4 5 57 2 8 13 2 9 163 > > library(RColorBrewer) > pal <- brewer.pal(n = 12, name = "Set3") > pdf("NUTSd3lab.pdf",width=11.7,height=8.3,paper="a4r") > plot(NUTSd3,xlim=c(-24,44),ylim=c(28,70),lwd=0.05,bg="skyblue",col=pal[P+1]) > title("NUTS3 - detailed") > text(coordinates(NUTSd3),labels=as.character(NUTSd3$NUTS_ID),cex=0.1) > dev.off() In the network ''NUTSd.net'' every edge is listed twice. We remove duplicated edges using Pajek and save the network to the file ''NUTSdSimple.net''. On the file ''weak.clu'' the partition of nontrivial weak components is saved. All isolated nodes are in the class 0. I manually constructed additional edges collected in a relation 3:"sea links" (see file'' sea.txt'') that make the network connected. The expanded network has name'' NUTSdSea.net''. {{notes:da:pdf:nutsd3lab.pdf}}, {{notes:da:data:nutsdsimple.zip}}, {{notes:da:pdf:nutsdsea.pdf}} regular neighbors - blue; sea links - red. === Coloring countries === > NUTSd0 <- NUTSd[NUTSd@data$STAT_LEVL_ == 0, ] > labs <- as.character(NUTSd0$NUTS_ID) > labs [1] "AT" "BE" "BG" "CY" "CZ" "CH" "DE" "DK" "ES" "EL" "EE" "FR" "FI" "IS" "IT" [16] "HU" "IE" "HR" "NO" "ME" "MT" "NL" "MK" "LU" "LI" "LT" "LV" "PL" "SE" "RO" [31] "PT" "TR" "SK" "SI" "UK" > colors <- c("green","blue","blue","red","yellow","blue","red","green", + "blue","green","green","green","blue","green","yellow","blue","red", + "green","red","blue","red","yellow","red","yellow","yellow","green", + "red","blue","yellow","yellow","red","yellow","red","red","yellow") > pdf("NUTS0.pdf",width=11.7,height=8.3,paper="a4r") > plot(NUTSd0,xlim=c(-24,44),ylim=c(28,70),lwd=0.1,bg="skyblue",col=colors) > title("NUTS0 / Europe - detailed") > text(coordinates(NUTSd0),labels=labs,cex=0.3) > dev.off() ===== URLs ===== * [[http://www.milanor.net/blog/maps-in-r-choropleth-maps/|Maps in R: choropleth-maps]] * [[https://rpubs.com/adam_dennett/8955|Mapping Data from Eurostat using R]] * http://r.789695.n4.nabble.com/mapping-data-to-a-geographic-map-of-Europe-td4679287.html * http://www.uni-regensburg.de/wirtschaftswissenschaften/vwl-tschernig/medien/programmieren-mit-r/pir_tut_geomapping.r * https://github.com/davidbrae/swmap/blob/master/how%20to%20map%20the%20european%20social%20survey.R * http://ec.europa.eu/eurostat/statistics-explained/index.php/Research_and_innovation_statistics_at_regional_level * https://stat.ethz.ch/pipermail/r-sig-geo/2010-August/009151.html * http://r.789695.n4.nabble.com/Map-of-Europe-at-NUTS-2-Level-td4718431.html * https://www.r-bloggers.com/unemployment-of-europe-in-2014-by-nuts-2-region/ * http://ropengov.github.io/r/2015/05/01/eurostat-package-examples/ * http://www.ucd.ie/happy/index_files/Technical_description_for_Data_Merging.pdf * https://www.reddit.com/r/MapPorn/duplicates/4uint0/female_to_male_ratio_by_nuts_level_3_region_1600/ * https://github.com/rOpenGov/eurostat/blob/master/vignettes/eurostat_tutorial.md * http://www.bnosac.be/index.php/blog/55-belgiummaps-statbel-r-package-with-administrative-boundaries-of-belgium * http://www.joyofdata.de/blog/tag/choropleth-map/ * http://www.europeansocialsurvey.org/data/multilevel/guide/about.html * https://www.espon.eu/export/sites/default/Documents/ToolsandMaps/ESPON2013Database/2.7a_TR_Mapping_guide__internal.pdf * http://stat405.had.co.nz/ggmap.pdf * http://worldgrids.org/doku.php/wiki:admin_data * https://cran.r-project.org/web/packages/eurostat/vignettes/eurostat_tutorial.pdf * http://eclr.humanities.manchester.ac.uk/index.php/Maps_in_R * https://gis.stackexchange.com/questions/182400/what-is-the-epsg-code-of-eurostat-maps * https://stat.ethz.ch/pipermail/r-sig-geo/2010-August/009141.html * http://wukan.ums-riate.fr/FOSS4G-fr-2016/FOSS4G-fr-2016.html * http://www.europeansocialsurvey.org/data/multilevel/guide/about.html * http://geoexamples.blogspot.si/2013_10_01_archive.html * https://www.reddit.com/r/europe/comments/5mywwd/all_63_map_charts_in_eurostats_regional_yearbook/ * http://ftp.jrc.es/EURdoc/JRC85163.pdf * https://www.datacamp.com/community/tutorials/15-questions-about-r-plots#gs.6I9GKgQ * https://artax.karlin.mff.cuni.cz/r-help/library/R2BayesX/html/plotmap.html * http://www.molecularecologist.com/2012/09/making-maps-with-r/ * http://www.geog.uoregon.edu/GeogR/examples/maps_examples02.htm * http://www.jameskeirstead.ca/blog/changing-the-fonts-in-r-plots/ * http://stackoverflow.com/questions/7325180/how-to-specify-font-size-in-a-plot-for-pdf-output * http://blog.revolutionanalytics.com/2009/01/10-tips-for-making-your-r-graphics-look-their-best.html * https://rpubs.com/lmullen/46313