Downloads

Downloading files from a directory

November 27, 2013. There are interesting data on the GDELT site, in the directories backfiles and dailyupdates. To download them with R I used the following short program.

November 10, 2017. The structure of raw data files changed.

setwd("C:/Users/batagelj/test/R/DL")
# for the backfiles use these two lines instead:
# pa <- "http://gdelt.utdallas.edu/data/backfiles/"
# L <- as.vector(read.csv("files1.dir",header=FALSE)$V1)
pa <- "http://gdelt.utdallas.edu/data/dailyupdates/"
L <- as.vector(read.csv("files2.dir",header=FALSE)$V1)
length(L)
for(fn in L){
  fname <- paste0(pa,fn); cat("---",fn,date(),"\n")
  # an error in one download is caught, so the loop continues with the next file
  test <- tryCatch(download.file(fname,fn,method="auto"),error=function(e) e)
  if(inherits(test,"error")) cat("*** failed:",conditionMessage(test),"\n")
}
date()
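
For longer file lists it can help to skip files that are already on disk and to keep a status for each file. The following is only a minimal sketch of that idea, not the program I used: the function download_list, its arguments, and the fetch parameter (which makes it possible to swap in another downloader) are hypothetical names introduced for illustration.

```r
# Hypothetical helper (not part of the original program): download every file
# named in `files` from `base_url` into `dest_dir`, skipping files that already
# exist there. Returns a data frame with one status per file.
download_list <- function(base_url, files, dest_dir = ".",
                          fetch = utils::download.file) {
  status <- character(length(files))
  for (i in seq_along(files)) {
    dest <- file.path(dest_dir, files[i])
    if (file.exists(dest)) {          # already downloaded earlier
      status[i] <- "skipped"
      next
    }
    res <- tryCatch(
      fetch(paste0(base_url, files[i]), dest, mode = "wb", quiet = TRUE),
      error = function(e) e
    )
    if (inherits(res, "error")) {
      unlink(dest)                    # remove a possibly incomplete file
      status[i] <- "failed"
    } else {
      # download.file returns 0 on success
      status[i] <- if (identical(as.integer(res), 0L)) "ok" else "failed"
    }
  }
  data.frame(file = files, status = status, stringsAsFactors = FALSE)
}
```

With the variables from the program above, a call such as download_list(pa, L) would return the status table, and running it again would retry only the files that are still missing.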

The files files1.dir and files2.dir contain the lists of filenames to be downloaded, one filename per line. For example, files1.dir:

201303.zip
201302.zip
201301.zip
...
1980.zip
1979.zip
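
Since such a list has one filename per line, it can also be read with readLines instead of read.csv. A small sketch, assuming that blank lines and surrounding spaces should be ignored; the helper name read_file_list is hypothetical, and the demo writes a temporary file so that it does not depend on files1.dir being present.

```r
# Hypothetical helper: read a list with one filename per line,
# trimming surrounding whitespace and dropping empty lines.
read_file_list <- function(path) {
  x <- trimws(readLines(path, warn = FALSE))
  x[nzchar(x)]
}

# Demo on a temporary file that uses a few of the names shown above.
demo <- tempfile(fileext = ".dir")
writeLines(c("201303.zip", "201302.zip", "", " 201301.zip "), demo)
L <- read_file_list(demo)
print(L)
```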

See also downloading genealogies.

August 10, 2017. When downloading from an address that starts with https: the following command has to be placed before the other commands:

setInternet2(TRUE)
setwd("C:/Users/batagelj/data/graphBook")
pa <- "https://bitbucket.org/mvngu/graphbook-supplement/downloads/"
...

November 10, 2017. The function setInternet2 is now defunct, so this call has to be omitted.
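
As far as I know, in recent versions of R download.file handles https: addresses with its default method, so no extra setup is needed. A minimal sketch under that assumption; the wrapper name fetch_url is hypothetical, and mode = "wb" is given explicitly so that binary files such as zip archives are written unchanged on Windows.

```r
# Hypothetical wrapper around download.file for use without setInternet2.
# Assumption: method = "auto" lets R pick a method that supports https:// URLs.
fetch_url <- function(url, dest) {
  status <- download.file(url, dest, method = "auto", mode = "wb", quiet = TRUE)
  invisible(status == 0)   # TRUE when download.file reports success
}
```

It would be called with the complete address of a file, for example pa pasted together with one filename from the download page, and the name under which the file is to be saved.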

Happy DB

Links

notes/data/dl.txt · Last modified: 2018/03/24 02:44 by vlado
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 3.0 Unported