====== NBER world trade ======

At [[http://cid.econ.ucdavis.edu/data/undata/undata.html|NBER]] we can find the file ''wtf_bilat.zip'' with World Trade data 1962-2000. I transformed the data into Pajek's network files.

While editing the temporal data about the countries, I noticed that something seems to be wrong with the data about Yemen:

  row   icode  importer    ecode  exporter value62 value63 value64 value65
  12620 447200 Fm Yemen Ar 100000 World         NA      NA      NA      NA
  12621 447200 Fm Yemen Ar 218400 USA           NA      NA      NA      NA
  12644 447200 Fm Yemen Dm 218400 USA         3360    5871    5201    5571
  13292 448860 Fm Yemen Dm 218400 USA           NA      NA      NA      NA
  13179 448860 Fm Yemen AR 100000 World        778   19684    2661    4219

Only two rows contain "Fm Yemen Ar", and "Fm Yemen Dm" has two ''icode''s, "447200" and "448860" (the latter in only 2 rows). I looked at the problematic rows:

  row icode importer ecode exporter value62 value63 value64 value65 value66 value67 value68 value69 value70 value71 value72 value73 value74 value75 value76 value77 value78 value79 value80 value81 value82 value83 value84 value85 value86 value87 value88 value89 value90 value91 value92 value93 value94 value95 value96 value97 value98 value99 value00
  12620 447200 Fm Yemen Ar 100000 World NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 3250 NA NA NA NA NA NA NA NA NA NA
  13179 448860 Fm Yemen AR 100000 World 778 19684 2661 4219 8651 14035 8581 15089 34506 43310 68287 103640 184953 240838 411477 575559 579746 1497011 1853362 1319369 1262913 917408 992265 995505 834513 707096 894567 850158 243769 NA NA NA NA NA NA NA NA NA NA
  12622 447200 Fm Yemen Dm 100000 World 107749 147213 188182 206424 184437 156211 139636 96013 117979 86015 89326 80201 231689 145121 223394 404759 326421 777965 1091928 728527 552712 437086 443575 347539 249078 275054 279867 236631 980000 NA NA NA NA NA NA NA NA NA NA
  13291 448860 Fm Yemen Dm 100000 World NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 103951 NA NA NA NA NA NA NA NA NA NA
  13293 448870 Yemen 100000 World NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 1491931 2024425 1957762 1536104 1587878 1780324 1807679 1781852 1638044 1501013
  12621 447200 Fm Yemen Ar 218400 USA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 3250 NA NA NA NA NA NA NA NA NA NA
  13207 448860 Fm Yemen AR 218400 USA NA NA NA NA NA NA NA NA NA NA 2216 9567 10432 7326 15464 18392 19845 28376 52196 32094 33005 97305 65432 38145 80177 112527 75052 67741 NA NA NA NA NA NA NA NA NA NA NA
  12644 447200 Fm Yemen Dm 218400 USA 3360 5871 5201 5571 4806 3023 3053 2500 2737 834 920 2613 12316 2711 3827 13807 24457 11915 5226 4413 7254 6067 60533 7766 16398 13502 5495 6217 NA NA NA NA NA NA NA NA NA NA NA
  13292 448860 Fm Yemen Dm 218400 USA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 103951 NA NA NA NA NA NA NA NA NA NA
  13298 448870 Yemen 218400 USA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 185726 316322 312260 172644 178558 251537 149328 172904 151582 NA

I applied the following "correction":

  > library(foreign)
  > setwd("E:/work/trade")
  > t <- read.dta("wtf_bilat.dta")
  > dim(t)
  [1] 23949 43
  > t$value90[13207] <- 103951
  > t$value90[12644] <- 3250
  > t <- t[-c(12620,12621,13291,13292),]
  > dim(t)
  [1] 23945 43

and rebuilt the Pajek's files.

The following procedure transforms the corrected World Trade data, stored in the data frame ''t'', into Pajek's temporal network.

  # transforming World Trade data 1962-2000 into Pajek's temporal network
  # http://cid.econ.ucdavis.edu/data/undata/undata.html - wtf_bilat.zip
  # Vladimir Batagelj, 11. jun 2013
  # vertex labels: all names that occur as importer or exporter
  f <- factor(c(t$importer,t$exporter))
  L <- levels(f); n <- length(L); m <- nrow(t)
  imf <- factor(t$importer,L); exf <- factor(t$exporter,L)
  net <- file("WorldTrade.net","w")
  cat("NBER World Trade -> Pajek",date(),'\n')
  cat("% NBER World Trade -> Pajek",date(),'\n',file=net)
  cat("% http://cid.econ.ucdavis.edu/data/undata/undata.html\n",file=net)
  cat("*vertices", n,"\n",file=net)
  for(v in 1:n) cat(v,' "',L[v],'" [1962-2000]\n',sep='',file=net)
  cat("*arcs\n",file=net)
  for(r in 1:m){
    # cat() writes a factor as its integer code, i.e. the vertex number
    v <- imf[r]; u <- exf[r]
    # columns 5:43 hold value62..value00, so column c is the year c+1957
    for(c in 5:43) if(!is.na(t[r,c]))
      cat(v,' ',u,' ',t[r,c],' [',c+1957,']\n',sep='',file=net)
  }
  cat("Finished",date(),'\n')
  close(net)
  # NBER country codes
  cu <- unique(t[,c(1,2)])
  p <- match(L,cu$importer)
  C <- cu$icode[p]
  # the code of vertex 40 is set by hand
  C[40] <- "481561"
  nam <- file("WorldTrade.nam","w")
  cat("% NBER country codes\n",file=nam)
  cat("*vertices", n,"\n",file=nam)
  for(v in 1:n) cat(v,' "',C[v],'"\n',sep='',file=nam)
  close(nam)
  # clustering by continents: the first digit of the country code
  clu <- file("continent.clu","w")
  cat("% country clustering by continents\n",file=clu)
  cat("*vertices", n,"\n",file=clu)
  for(v in 1:n) cat(substr(C[v],1,1),'\n',file=clu)
  close(clu)

The years of vertex activity were corrected manually according to the table on pages 52-57 of [[http://cid.econ.ucdavis.edu/data/undata/NBER-UN_Data_Documentation_w11040.pdf|NBER-UN_Data_Documentation]].

The Pajek's network is available in {{pub:net:nberwt.zip}}.

In the 2000-slice India has missing data. The 1999-slice is more complete and has a clear center-periphery structure. For recoding I used the quartile values 1633, 11830 and 98090. The 1999-slice and related files are available in {{pub:net:nberwt99.zip}}.
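
The recoding of the 1999-slice can be sketched in R as follows. This is only a minimal sketch under assumptions: it supposes that recoding means mapping each arc weight to a class 1-4 delimited by the three threshold values quoted above, and that a weight equal to a threshold falls into the upper class. The helper name ''recode_quartile'' and the sample weights are hypothetical and not taken from the data set.

```r
# Threshold values quoted in the text for the 1999 slice
q <- c(1633, 11830, 98090)

# Hypothetical helper: map trade values to classes 1..4.
# Assumption: a value equal to a threshold goes to the upper class.
recode_quartile <- function(w, breaks = q) {
  # findInterval() counts the breaks that are <= w (0..3); shift to 1..4
  findInterval(w, breaks) + 1L
}

# Illustrative weights (made up, not from the NBER data)
w <- c(500, 1633, 5000, 11830, 50000, 98090, 250000)
classes <- recode_quartile(w)
print(classes)
```

Missing weights stay ''NA'' under ''findInterval'', so they would still have to be dropped or handled separately before the classes are written to a Pajek file.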