====== Network Data Sources ======

===== Huge networks =====

  * [[https://commoncrawl.atlassian.net/wiki/pages/viewpage.action?pageId=4292610|Common Crawl]]
  * [[http://webdatacommons.org/hyperlinkgraph/index.html|Web Data Commons - Hyperlink Graph]]
  * [[http://www.gdeltproject.org/|GDELT project]], [[data:pajek:gdelt|GDELT]]

===== Collections of Network Data =====

  * [[http://vlado.fmf.uni-lj.si/pub/networks/data/|Old Pajek Datasets]]
  * [[http://networkdata.ics.uci.edu/index.html|The UCI Network Data Repository]]
  * [[http://socialcomputing.asu.edu/pages/datasets|Social Computing Data Repository at Arizona State University]]
  * [[http://www-personal.umich.edu/~mejn/netdata/|Mark Newman's Network data]]
  * [[http://snap.stanford.edu/data/|SNAP - Stanford Large Network Dataset Collection]]
  * [[http://networkrepository.com/|Network repository]]
  * [[http://law.di.unimi.it/datasets.php|Laboratory for Web Algorithmics]]
  * [[http://webscope.sandbox.yahoo.com/catalog.php?datatype=g|Yahoo labs]]
  * [[http://www.casos.cs.cmu.edu/computational_tools/datasets/|CASOS - datasets]]; [[http://www.casos.cs.cmu.edu/tools/data.php|ORA]]; [[http://www.casos.cs.cmu.edu/computational_tools/data2.php|external]]
  * [[http://netwiki.amath.unc.edu/SharedData/SharedData|Netwiki datasets]]
  * [[http://www.stats.ox.ac.uk/~snijders/siena/siena_datasets.htm|Siena datasets]]
  * [[http://konect.uni-koblenz.de/networks|KONECT - Koblenz network collection]]; [[http://www.rene-pickhardt.de/download-network-graph-data-sets-from-konect-the-koblenz-network-colection/|about]]
  * [[https://nwb.slis.indiana.edu/community/?n=Datasets.HomePage|NetworkWorkbench]]
  * [[http://www.cise.ufl.edu/research/sparse/matrices/|University of Florida Sparse Matrix Collection]]; see also [[http://www.research.att.com/~yifanhu/GALLERY/GRAPHS/index49.html|Yifan's gallery]]
  * [[http://toreopsahl.com/datasets/|Tore Opsahl's Networks]]
  * [[http://wiki.gephi.org/index.php?title=Datasets|Gephi Datasets]]; [[http://wiki.gephi.org/index.php/Datasets|alt]]
  * [[http://www.trustlet.org/wiki/Trust_network_datasets|Trust network datasets]]; [[http://www.trustlet.org/wiki/Extended_Epinions_dataset|Massa]]
  * [[http://www.bgu.ac.il/~bargera/tntp/|Transport networks]]
  * [[http://www.dis.uniroma1.it/~challenge9/papers.shtml|Shortest paths Challenge]] / [[http://www.dis.uniroma1.it/~challenge9/download.shtml|data sets]]
  * Eric D. Kolaczyk: [[http://math.bu.edu/people/kolaczyk/SAND.html|Statistical Analysis of Network Data]]; [[http://math.bu.edu/people/kolaczyk/datasets.html|data sets]]
  * [[http://psfaculty.ucdavis.edu/zmaoz/datasets.htm|Zeev Maoz]] - conflict networks
  * RDF: [[http://www4.wiwiss.fu-berlin.de/lodcloud/|Datasets in the next LOD Cloud]]
  * [[http://www.eelkeheemskerk.nl/index.php?/datasets/|Dutch Corporate Network Datasets]]
  * [[http://gking.harvard.edu/data]]
  * [[http://www1.cs.columbia.edu/~coms6998/datasets.htm]]
  * [[http://clair.si.umich.edu/clair/anthology/index.cgi|ACL networks]]; [[http://clair.si.umich.edu/clair/aan/DatasetContents.html|Corpus]]; [[http://www.aclweb.org/anthology-new/|Anthology]]
  * [[http://www.kde.cs.uni-kassel.de/datasets|Kassel / BibSonomy]]
  * [[http://atlas.gregas.eu/|Gregas Graphs]]
  * [[http://www.boardsandgender.com/data.php|Norwegian gender representation]]
  * [[http://lsnaworkshop.netii.net/ebsn/index.php#datasets|Event-based Social Networks]]
  * [[http://hog.grinvin.org/|The House of Graphs]]
  * [[http://math.nist.gov/~RPozo/complex_datasets.html|NIST]]; [[http://trec.nist.gov/data/tweets/|tweets]]
  * [[http://www.google.com/googlebooks/uspto.html|USPTO]]/Google
  * http://arnetminer.org/download ; http://arnetminer.org/citation
  * http://www.correlatesofwar.org/
  * NodeXL [[http://www.nodexlgraphgallery.org/Pages/Default.aspx|gallery]] - download the data
  * https://wiki.gephi.org/index.php?title=Datasets
  * http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/07767
  * http://crawdad.cs.dartmouth.edu/
  * Airplanes: http://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=310
  * US counties - mobility: http://www.census.gov/hhes/migration/data/acs/county_to_county_mig_2006_to_2010.html
  * http://www8.gsb.columbia.edu/leadership/research/smallworlds/datadl#Select%20Country
  * http://www.cc.gatech.edu/dimacs10/downloads.shtml
  * http://www.dis.uniroma1.it/challenge9/download.shtml
  * http://staffweb.cms.gre.ac.uk/~c.walshaw/partition/
  * [[https://bitbucket.org/mvngu/graphbook-supplement|Graphbook graphs]]
  * http://mypersonality.org/wiki/doku.php?id=download_databases
  * http://dl.ucd.ie/?page_id=846

==== Genealogies ====

  * [[http://intersci.ss.uci.edu/wiki/index.php/Kinship,_Class,_and_Community|Kinship, Class, and Community]]
  * [[https://www.kinsources.net/|KinSources: Kinship Data Repository]]
  * [[http://projet-simpa.net/index.php?option=com_docman&Itemid=5&lang=en|Simpa]]
  * [[http://eclectic.ss.uci.edu/~drwhite/linkages/datasets/student.html|D.R. White]]
  * [[http://www.genealogyforum.com/gedcom/|Genealogy Forum GEDCOMs]]

===== Single Networks =====

  * [[http://www.pfeffer.at/sunbelt2013/|Sunbelt 2013]]
  * [[http://www.org-soz.uni-wuppertal.de/fileadmin/soziologie/org-soz/files/Heidler_et_al_2013_Relationship_patterns_in_the_19th_century.gephi|Delitsch class / Gephi]]
  * [[http://www.stats.ox.ac.uk/~snijders/siena/Glasgow_data.htm|Teenage Friends and Lifestyle Study]]
  * http://gdelt.utdallas.edu/data.html#gkg
  * [[http://www.isi.edu/~adibi/Enron/Enron.htm|Enron]], [[http://www.isi.edu/~adibi/Enron/Enron_Dataset_Report.pdf|Report]]; [[http://www.cs.cmu.edu/~enron/|CMU]]
  * [[http://webdocs.cs.ualberta.ca/~lindek/downloads.htm|Thesaurus of similar words]]
  * Friendster Social Network: [[http://www.archive.org/details/friendster-dataset-201107|Dataset: Friends]]; [[http://www.archive.org/details/friendster-groups-201107|Groups]]
  * [[http://diseasome.eu/poster.html|The Human Disease Network]]
  * [[http://mgto.org/marketing-scholars-social-network-analysis-free-dataset/|Marketing Scholars Social Network Analysis Free Dataset]]
  * [[http://www.boardsandgender.com/|Boards and gender]]
  * [[http://www.cs.cornell.edu/projects/kddcup/|KDD cup 2003]]
  * [[http://www.informatik.uni-freiburg.de/~cziegler/BX/|Book-Crossing Dataset]]
  * [[http://jhfowler.ucsd.edu/cosponsorship.htm|Cosponsorships in the U.S. Senate and U.S. House of Representatives for the 93rd to 108th Congresses]]
  * [[http://www.cs.umd.edu/hcil/VASTchallenge08/|VAST challenge 08]]
  * [[http://www.cs.umd.edu/hcil/VASTchallenge09|VAST challenge 09]]
  * [[https://www.cia.gov/library/publications/the-world-factbook/appendix/appendix-b.html|CIA appendix B :: international organizations and groups]]
  * [[http://www.prosper.com/tools/DataExport.aspx|daily snapshot in the Prosper Marketplace]]
  * [[http://iv.slis.indiana.edu/ref/iv04contest/index.html|Infovis 2004 contest]]; [[http://ella.slis.indiana.edu/~lviswana/iv04-contest.mdb|Cleaned]]
  * http://www.sis.pitt.edu/~dparra/datasets.html
  * http://www.freebase.com/ ; http://wiki.freebase.com/wiki/Graphd
  * http://en.wikipedia.org/wiki/DBpedia
  * http://www.freebase.com/ ; http://wiki.freebase.com/wiki/Data_dumps
  * [[http://gking.harvard.edu/data?dvn_subpage=/faces/study/StudyPage.xhtml?studyId=505|10 Million International Dyadic Events]]
  * http://cfinder.org/wiki/?n=Main.Data
  * http://dvn.iq.harvard.edu/dvn/dv/patent/faces/study/StudyPage.xhtml?globalId=hdl:1902.1/15705&versionNumber=3
  * http://networkdata.ics.uci.edu/index.php
  * https://nodexlgraphgallery.org/Pages/GraphML.ashx?graphID=1027
  * http://www.barabasilab.com/pubs/CCNR-ALB_Publications/200705-14_PNAS-HumanDisease/Suppl/index.htm ; http://diseasome.eu/index.html

===== Lists of Data Sources =====

  * [[http://datamob.org/datasets/|Data mob]]: [[http://datamob.org/datasets/tag/social-networks|Social networks]], [[http://datamob.org/datasets/show/facebook-dataset|Facebook]]
  * [[http://www.datawrangling.com/some-datasets-available-on-the-web|DataWrangling]] ~ 400 URLs
  * [[http://math.nist.gov/~RPozo/complex_datasets.html|Complex Network Resources]]
  * [[http://kevinchai.net/datasets/|Kevin Chai's datasets]]
  * [[http://networkdata.ics.uci.edu/resources.php|UCI links]]
  * [[http://cid.econ.ucdavis.edu/data/undata/undata.html|Trade: NBER / Feenstra]]
  * [[http://comtrade.un.org/|Global trade]]
  * bitly hmason: [[https://bitly.com/bundles/hmason/1|research-quality data sets]]
  * [[http://netwiki.amath.unc.edu/SharedData/SharedData|NetWiki Data]]
  * Visual Analytics Benchmark Repository (Catherine Plaisant): [[http://hcil.cs.umd.edu/localphp/hcil/vast/archive/viewbm.php|Benchmarks]]; [[http://hcil.cs.umd.edu/localphp/hcil/vast/archive/other.php|Other Datasets]]

===== Other Network Data Sources =====

  * [[http://dvn.iq.harvard.edu/dvn/|The Institute for Quantitative Social Science at Harvard Dataverse Network]]; [[http://dvn.iq.harvard.edu/dvn/dv/king|Gary King Dataverse]]; [[http://thedata.org/|The Dataverse Network Project]]
  * [[http://infochimps.com/|infochimps]]: [[http://infochimps.com/tags/4613|4613]], [[http://infochimps.com/tags/4488|4488]], [[http://infochimps.com/tags/socialnetwork|Social network]], [[http://infochimps.com/tags/Social|Social]], [[http://www.infochimps.com/search?query=network|network]], [[http://www.infochimps.com/tags/network?page=5|network:5]]; [[http://www.infochimps.com/tags/networking|networking]], [[http://blog.infochimps.com/2008/12/29/massive-scrape-of-twitters-friend-graph/|Twitter]]
  * [[http://www.trustlet.org/wiki/|Trustlet Wiki]]: [[http://www.trustlet.org/wiki/Trust_network_datasets|Trust nets]], [[http://www.trustlet.org/wiki/Repositories_of_datasets|Repositories]]
  * [[http://services.alphaworks.ibm.com/manyeyes/browse/data|Many Eyes]]
  * [[http://www.public.asu.edu/~mdechoud/datasets.html|YouTube, Flickr, ...]]
  * [[http://www.causality.inf.ethz.ch/repository.php|ETHZ ChaLearn]]
  * [[http://www.cs.umd.edu/projects/linqs/projects/lbc/index.html|Linqs - Link-based Classification]]
  * [[http://www.caida.org/data/data-usage-faq.xml|CAIDA]]
  * [[http://archive.ics.uci.edu/ml/index.html|UCI ML repository]]
  * [[http://www.kdnuggets.com/datasets/|KDnuggets - Datasets for Data Mining]]
  * [[http://www.kddcup-orange.com/data.php|KDD Cup 2009 - Orange]]
  * [[http://developer.amazonwebservices.com/connect/kbcategory.jspa?categoryID=243|Amazon: Public Data Sets]]
  * Gene Ontology: [[http://www.geneontology.org/GO.tools.microarray.shtml|Tools]]; [[http://www.geneontology.org/GO.downloads.annotations.shtml|Data]]
  * [[http://www.grouplens.org/node/12|GroupLens; recommender data]]
  * Munmun De Choudhury: [[http://www.public.asu.edu/~mdechoud/datasets.html|Data sets]]
  * [[http://www.malawi.pop.upenn.edu/Level 3/Survey data/level3_quantitative_main.html|Social Network Projects - Kenya, Malawi]]
  * [[http://www.sscnet.ucla.edu/soc/faculty/wimmer/Datasets.html|Andreas Wimmer's Data sets]]
  * [[http://www.economicsnetwork.ac.uk/links/data_free|Economic Data freely available online]]
  * [[http://www.cs.cmu.edu/~awm/10701/project/data.html|CMU data for projects]]
  * [[http://delicious.com/pskomoroch/dataset|pskomoroch's data sets]]
  * [[http://www.diggingintodata.org/Repositories/tabid/167/Default.aspx|Repositories]]
  * [[http://sociology.rutgers.edu/ucds/ucds.htm|The Urban Communes Data Set]]
  * [[http://isites.harvard.edu/icb/icb.do?keyword=k16229&pageid=icb.page499251|China Biographical Database (CBDB)]]: social networks for Chinese historical elites from 7th to 19th centuries.
  * [[http://stackoverflow.com/questions/3340810/twitter-social-networking-dataset|Twitter Info]]
  * [[http://www.race.u-tokyo.ac.jp/~uchida/blogdata/|Weblog Dataset Archive and Visualization]]
  * [[http://www.jazzdiscography.com/|Jazzdiscography]]; [[http://www.sciencedirect.com/science/article/pii/S1751157712000326|paper]]
  * [[http://chianti.ucsd.edu/svn/cytoscapeweb/tags/cytoscapeweb-0.4/html-template/fixtures/|Cytoscape]]
  * http://cybermetrics.wlv.ac.uk/database/
  * [[http://www.cis.hut.fi/research/som-bibl/|Bibliography of SOM papers]]
  * http://www.kalevleetaru.com/

===== Mixed =====

  * [[http://pewinternet.org/Data-Tools/Download-Data.aspx|Internet & American Life Project]]
  * The Sociograph: [[http://sociograph.blogspot.com/2011/02/visualizing-large-facebook-friendship.html|Visualizing Large Facebook Friendship Networks]]; [[http://sociograph.blogspot.com/2011/03/facebook100-data-and-parser-for-it.html|Data]]
  * Economics Web Institute: [[http://www.economicswebinstitute.org/ecdata.htm|Data]]
  * Ontario: [[http://www.ene.gov.on.ca/environment/en/resources/collection/data_downloads/index.htm|environment]]
  * STRING database: [[http://rsat.ulb.ac.be/string_dataset_form.php|protein-protein interactions]]
  * HSR General Resources: [[http://www.nlm.nih.gov/hsrinfo/datasites.html|Data, Tools, and Statistics]]
  * [[http://www.ceh.ac.uk/data/DataSetsandFacilities.html|CEH data sets]]
  * [[http://www.rene-pickhardt.de/download-trec-text-retrieval-conference-data-set/|TREC (= Text Retrieval Conference) Data Set]]
  * [[http://cdiac.ornl.gov/epubs/ndp/ndp041/ndp041.html|The Global Historical Climatology Network]]
  * [[http://www.icwsm.org/2009/data/|ICWSM 2009 Data Challenge]]
  * [[http://www.kaggle.com/|kaggle - data prediction competitions]]
  * World City: [[http://www.lboro.ac.uk/gawc/data.html|data]]; [[http://www.lboro.ac.uk/gawc/datasets/da11.html|11]]
  * [[http://www.hostip.info/dl/index.html|IP Address Locations + GeoTargeting]]
  * [[http://netkit-srl.sourceforge.net/data.html|Public data sets that NetKit has been used on]]
  * http://networks.cs.ucr.edu/ucrchive/measurement.htm
  * [[http://www.inf.ed.ac.uk/teaching/courses/dme/html/datasets0405.html|datasets selected for the projects for Data Mining and Exploration]]
  * [[http://www.sociosite.net/databases.php|Social Science Data Archives]]
  * [[http://perform.wpi.edu/downloads/|Performance of networks]]
  * https://aws.amazon.com/datasets?_encoding=UTF8&jiveRedirect=1
  * http://www.cs.umd.edu/projects/linqs/projects/lbc/index.html
  * http://www.graph-archive.org/doku.php
  * http://warsteiner.db.cs.cmu.edu/db-site/Datasets/graphData/
  * http://www.datawrangling.com/some-datasets-available-on-the-web
  * http://mldata.org/
  * http://www.kaggle.com/
  * http://webscope.sandbox.yahoo.com/catalog.php?datatype=c
  * http://www.sciencedirect.com/science/article/pii/S1751157711000988
  * http://www.medstartr.com/projects/93-phase-ii-next-level-doctor-social-graph
  * Airports: https://www.msu.edu/~zpneal/research.html
  * [[http://eschome.net/|Eurosong]]; [[http://en.wikipedia.org/wiki/Eurovision_Song_Contest_2013|2013]] [[pajek:data:2pajek|To transform to Pajek]]
  * [[test]]

===== APIs =====

  * https://github.com/simplegeo/python-oauth2
  * http://givealink.org/api_doc

===== Secondary data =====

  * http://federalgovernmentzipcodes.us/