====== Downloads ======

===== Downloading files from a directory =====

November 27, 2013. The [[https://www.gdeltproject.org/|GDELT]] site offers interesting data: [[http://gdelt.utdallas.edu/data/backfiles/?O=D|backfiles]] and [[http://gdelt.utdallas.edu/data/dailyupdates/?O=D|dailyupdates]]. To download them with R I used the following short program.

November 10, 2017. The structure of the [[https://www.gdeltproject.org/data.html#rawdatafiles|raw data files]] has changed.


setwd("C:/Users/batagelj/test/R/DL")
# To fetch the backfiles instead, use these two lines:
# pa <- "http://gdelt.utdallas.edu/data/backfiles/"
# L <- as.vector(read.csv("files1.dir",header=FALSE)$V1)
pa <- "http://gdelt.utdallas.edu/data/dailyupdates/"
L <- as.vector(read.csv("files2.dir",header=FALSE)$V1)
length(L)
for(fn in L){
  fname <- paste(pa,fn,sep="")
  cat("---",fn,date(),"\n")
  # keep going if a single download fails; the error object ends up in test
  test <- tryCatch(download.file(fname,fn,method="auto"),error=function(e) e)
}
date()

The files ''files1.dir'' and ''files2.dir'' list the names of the files to be downloaded, one per line. For example, ''files1.dir'':


201303.zip
201302.zip
201301.zip
...
1980.zip
1979.zip
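
The way such a list file is turned into download addresses can be sketched as follows. This is only a minimal illustration, assuming one filename per line; the helper name ''plan_downloads'', its arguments, and the ''missing'' column are hypothetical and not part of the original program. It reads the list, builds the full URL for each file, and marks which files are not yet in the destination directory, without downloading anything.

```r
# Hypothetical helper (not part of the original program): pair each listed
# filename with its full URL and flag files absent from the destination.
plan_downloads <- function(base_url, list_file, dest_dir = ".") {
  # One filename per line; force character so names are never parsed as numbers.
  files <- as.vector(read.csv(list_file, header = FALSE,
                              colClasses = "character")$V1)
  data.frame(
    file    = files,
    url     = paste0(base_url, files),
    missing = !file.exists(file.path(dest_dir, files)),
    stringsAsFactors = FALSE
  )
}

# Self-contained demonstration with a temporary list file.
list_file <- tempfile(fileext = ".dir")
writeLines(c("201303.zip", "201302.zip", "1979.zip"), list_file)
plan <- plan_downloads("http://gdelt.utdallas.edu/data/backfiles/",
                       list_file, dest_dir = tempdir())
print(plan)
```

Looping only over ''plan$url[plan$missing]'' would make it possible to restart an interrupted run without fetching the same files again.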

See also [[notes:gendl|downloading genealogies]].

August 10, 2017. When downloading from a page whose address starts with ''https:'', the program has to begin with the command


setInternet2(TRUE)
setwd("C:/Users/batagelj/data/graphBook")
pa <- "https://bitbucket.org/mvngu/graphbook-supplement/downloads/"
...
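
On R versions where ''setInternet2'' is no longer available, one possible replacement is to ask ''download.file'' for the ''libcurl'' method when the R build supports it. The sketch below is an assumption on my part, not the method used originally; the wrapper name ''fetch_https'', its ''downloader'' argument, and the ''example.org'' address are hypothetical. The ''downloader'' argument exists only so that the function can be tried without network access.

```r
# Hypothetical wrapper around download.file for https addresses.
fetch_https <- function(url, destfile, downloader = download.file) {
  # Prefer libcurl when this R build supports it, otherwise let R choose.
  method <- if (isTRUE(unname(capabilities("libcurl")))) "libcurl" else "auto"
  # Return the error object instead of stopping, as in the loop above.
  tryCatch(
    downloader(url, destfile, method = method, mode = "wb"),
    error = function(e) e
  )
}

# Offline demonstration: a stub stands in for download.file.
stub <- function(url, destfile, method, mode) {
  cat("would fetch", url, "using method", method, "\n")
  0L  # download.file returns 0 on success
}
status <- fetch_https("https://example.org/example.zip", "example.zip",
                      downloader = stub)
```

Calling ''fetch_https(url, destfile)'' without the ''downloader'' argument performs a real download; ''mode = "wb"'' keeps binary files such as zip archives intact on Windows.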

November 10, 2017. The function ''setInternet2'' is defunct.

===== Happy DB =====

February 14, 2018

  * https://rit-public.github.io/HappyDB/
  * https://github.com/rit-public/HappyDB
  * https://www.technologyreview.com/s/610159/100000-happy-moments/
  * https://arxiv.org/abs/1801.07746
  * C:\Users\batagelj\Downloads\data\happy

===== Links =====

  * https://www.dataquest.io/blog/10-data-science-projects-join/?imm_mid=0fb29a&cmp=em-data-na-na-newsltr_20180214
  * http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005968
  * https://datadryad.org//resource/doi:10.5061/dryad.73r6j?show=full