November 27, 2013. There are interesting data on the GDELT site: backfiles and dailyupdates. To download them with R, I used the following short program.
November 10, 2017. The structure of the raw data files has changed.
setwd("C:/Users/batagelj/test/R/DL")
# To fetch the backfiles instead, use these two lines:
# pa <- "http://gdelt.utdallas.edu/data/backfiles/"
# L <- as.vector(read.csv("files1.dir",header=FALSE)$V1)
pa <- "http://gdelt.utdallas.edu/data/dailyupdates/"
# L is the list of filenames to download, one per line in files2.dir
L <- as.vector(read.csv("files2.dir",header=FALSE)$V1)
length(L)
for(fn in L){
  fname <- paste(pa,fn,sep=""); cat("---",fn,date(),"\n")
  # a failed download is caught so that the loop continues with the next file
  test <- tryCatch(download.file(fname,fn,method="auto"),error=function(e) e)
}
date()
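The loop above stores the result of tryCatch in test but never inspects it, so failed downloads pass silently. The following is a minimal sketch of one way to collect the failures and to skip files that are already present. The function name download_all and its arguments destdir and fetch are hypothetical, not part of the original program. The fetch argument exists only so that the function can be tried without network access.

```r
# Hypothetical helper (not from the original program): download every file
# named in `files` from the base URL `pa` and return the names that failed.
download_all <- function(pa, files, destdir = ".", fetch = download.file) {
  failed <- character(0)
  for (fn in files) {
    dest <- file.path(destdir, fn)
    if (file.exists(dest)) next            # already downloaded, skip it
    cat("---", fn, date(), "\n")
    res <- tryCatch(fetch(paste0(pa, fn), dest, mode = "wb"),
                    error = function(e) e)
    # download.file returns 0 on success; anything else counts as a failure
    if (inherits(res, "error") || !identical(as.integer(res), 0L)) {
      failed <- c(failed, fn)
    }
  }
  failed
}
```

With pa and L defined as in the program above, failed <- download_all(pa, L) would return the filenames worth retrying.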
The files files1.dir and files2.dir contain the lists of filenames to be downloaded. For example, files1.dir:
201303.zip
201302.zip
201301.zip
...
1980.zip
1979.zip
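To illustrate the list format, here is a small sketch that writes such a file and reads it back the same way the download program does. It assumes one filename per line and uses only the names visible in the example above. The temporary path dirfile is an illustration, not the author's setup.

```r
# Write a short list file, one filename per line, to a temporary location.
# Only the names visible in the example are used; the real list is longer.
dirfile <- file.path(tempdir(), "files1.dir")
writeLines(c("201303.zip", "201302.zip", "201301.zip",
             "1980.zip", "1979.zip"), dirfile)

# Read it back as in the download program: the single column V1 holds the
# filenames, and as.vector turns it into a plain character vector.
L <- as.vector(read.csv(dirfile, header = FALSE)$V1)
length(L)
```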
See also downloading genealogies.
August 10, 2017. When downloading from a URL starting with https: we have to put the following command at the beginning of the program:
setInternet2(TRUE)
setwd("C:/Users/batagelj/data/graphBook")
pa <- "https://bitbucket.org/mvngu/graphbook-supplement/downloads/"
...
November 10, 2017. The function setInternet2 is now defunct.
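Without setInternet2, one possible approach is to choose the download method explicitly. The following is only a sketch, assuming an R version in which capabilities("libcurl") is available. The helper name pick_method is hypothetical.

```r
# Hypothetical helper: prefer the libcurl method when this R build supports
# it, otherwise fall back to the default "auto" method.
pick_method <- function() {
  if (isTRUE(capabilities("libcurl")[["libcurl"]])) "libcurl" else "auto"
}

pick_method()

# Possible use inside the download loop (fn is a filename from the list L):
# download.file(paste(pa, fn, sep = ""), fn, method = pick_method(), mode = "wb")
```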
14. February 2018