====== 1199. Wednesday seminar (Sredin seminar): materials ======

  * [[http://www.cloudera.com/what-is-hadoop/|Hadoop]]; [[http://hadoop.apache.org/mapreduce/|MapReduce]]; [[http://en.wikipedia.org/wiki/MapReduce]]; Ullman: [[http://infolab.stanford.edu/~ullman/mmds.html|Mining of Massive Datasets]]; [[http://infolab.stanford.edu/~ullman/pub/join-mr.pdf|paper]]
  * [[http://en.wikipedia.org/wiki/Apache_Hadoop]]; [[http://en.wikipedia.org/wiki/MapReduce]]
  * [[http://en.wikipedia.org/wiki/Functional_programming]]; [[http://pguides.net/python/functional-programming|map,reduce,filter]]; [[http://www.ibm.com/developerworks/linux/library/l-prog/index.html|py1]]; [[http://docs.python.org/release/3.1.3/howto/functional.html|fp in python]]; [[http://blog.dhananjaynene.com/2010/02/functional-programming-with-python-part-1/|1]]; [[http://blog.dhananjaynene.com/2010/03/functional-programming-with-python-%E2%80%93-part-2-useful-python-constructs/|2]]
  * [[http://news.cnet.com/8301-10784_3-9955184-7.html|Google spotlights data center inner workings]]
  * Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung: [[http://www.cs.brown.edu/courses/cs295-11/2006/gfs.pdf|The Google File System]]; [[http://osnet.inm.nchu.edu.tw/powpoint/seminar/2008/The_Google_File_System.pdf|slides]]
  * Jeffrey Dean, Sanjay Ghemawat: [[http://usenix.org/events/osdi04/tech/full_papers/dean/dean.pdf|MapReduce: Simplified Data Processing on Large Clusters]]
  * Ralf Lämmel: [[http://logic.csci.unt.edu/tarau/teaching/SoftDev/docs/mapReduce.pdf|Google’s MapReduce programming model—Revisited]]
  * Jeff Ullman: [[http://infolab.stanford.edu/~ullman/mmds.html|Mining of Massive Datasets]]; [[http://infolab.stanford.edu/~ullman/mining/2009/mapreduce.pdf|slides 1]]; [[http://infolab.stanford.edu/~ullman/mining/2009/map-reduce2.pdf|slides 2]]
  * Jimmy Lin, Chris Dyer: [[http://www.umiacs.umd.edu/~jimmylin/MapReduce-book-final.pdf|Data-Intensive Text Processing with MapReduce]]
  * [[http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/36249.pdf|MapReduce, The Programming Model and Practice]], tutorial
  * [[http://www.dataspora.com/2011/04/pigs-bees-and-elephants-a-comparison-of-eight-mapreduce-languages/|Pigs, Bees, and Elephants: A Comparison of Eight MapReduce Languages]]

===== mix =====

  * J. Ullman: [[http://infolab.stanford.edu/~ullman/pub/edbt11.ppt|Map-Reduce and Its Children]], EDBT'11
  * http://www.cs.ucsb.edu/~gilbert/cs140Win2008/slides/mapReduceLecture.ppt
  * http://code.google.com/edu/parallel/mapreduce-tutorial.html
  * http://openmymind.net/2011/1/20/Understanding-Map-Reduce/
  * http://ayende.com/blog/4435/map-reduce-a-visual-explanation
  * http://web.cs.wpi.edu/~cs4513/d08/OtherStuff/MapReduce-TeamA.ppt
  * http://horicky.blogspot.com/2010/08/designing-algorithmis-for-map-reduce.html
  * http://www.eurecom.fr/~michiard/teaching/webtech/tutorial.pdf
  * http://www.unixer.de/publications/img/hoefler-map-reduce-mpi.pdf
  * http://www.yourdailygeekery.com/2011/05/16/top-k-with-mapreduce.html
  * http://atbrox.com/2011/05/16/mapreduce-hadoop-algorithms-in-academic-papers-4th-update-may-2011/
  * http://www.edbt.org/Proceedings/2011-Uppsala/papers/edbt/a1-afrati.pdf
  * http://www.sidsuri.com/About_Sid_files/SidSuriRS.pdf
  * http://www.ece.rutgers.edu/~parashar/Classes/07-08/ece572/readings/mapred_tutorial.pdf
  * http://code.google.com/p/hadoop-map-reduce-examples/wiki/Anagram_Example
  * http://www.cs.brown.edu/courses/csci2950-u/f11/schedule.html

===== python =====

  * http://docs.python.org/release/3.1.3/howto/functional.html
  * http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/
  * http://atbrox.com/2010/02/08/parallel-machine-learning-for-hadoopmapreduce-a-python-example/
  * http://code.activestate.com/recipes/577676-dirt-simple-mapreduce/
  * http://mikecvet.wordpress.com/2010/07/02/parallel-mapreduce-in-python/
  * http://ebiquity.umbc.edu/blogger/2009/01/02/octopy-quick-and-easy-mapreduce-for-python/
  * http://remembersaurus.com/mincemeatpy/
  * http://discoproject.org/
  * http://clouddbs.blogspot.com/2010/10/googles-mapreduce-in-98-lines-of-python.html
  * http://blog.doughellmann.com/2009/04/implementing-mapreduce-with.html
  * http://brandynwhite.com/hadoopy-cython-based-mapreduce-library-for-py
  * http://code.google.com/p/octopy/

===== R =====

  * CRAN task view: [[http://cran.r-project.org/web/views/HighPerformanceComputing.html|High Performance Computing]]
  * R: [[http://www.ats.ucla.edu/stat/r/library/advanced_function_r.htm|*apply]]; [[http://stackoverflow.com/questions/3505701/r-grouping-functions-sapply-vs-lapply-vs-apply-vs-tapply-vs-by-vs-aggrega|2]]
  * Package: [[http://cran.r-project.org/web/packages/mapReduce/mapReduce.pdf|mapReduce]]; [[http://cran.r-project.org/web/packages/mapReduce/index.html|page]]
  * [[http://blog.revolutionanalytics.com/2011/09/mapreduce-hadoop-r.html|How to program MapReduce jobs in Hadoop with R]]
  * [[https://github.com/RevolutionAnalytics/RHadoop/wiki|RHadoop]]; [[https://github.com/RevolutionAnalytics/RHadoop/wiki/rmr|rmr]]; [[https://github.com/RevolutionAnalytics/RHadoop/wiki/Tutorial|Map Reduce in R]]; [[https://github.com/RevolutionAnalytics/RHadoop/wiki/Comparison-of-high-level-languages-for-mapreduce%3A-k-means|k-means]]; [[https://github.com/RevolutionAnalytics/RHadoop/wiki/Fast-k-means|fast]]
  * RHIPE: [[http://ml.stat.purdue.edu/rhipe/|R and Hadoop Integrated Programming Environment]]
  * [[https://www.rmetrics.org/files/Meielisalp2009/Presentations/Lewis.pdf|for each]]

===== networks =====

  * http://scienceblogs.com/goodmath/2007/10/making_graph_algorithms_fast_u.php
  * http://www.cse.usf.edu/~anda/CIS6930-S11/papers/graph-processing-w-mapreduce.pdf
  * http://dimacs.rutgers.edu/Workshops/Parallel/slides/suri.pdf
  * http://horicky.blogspot.com/2010/07/graph-processing-in-map-reduce.html
  * http://horicky.blogspot.com/2010/07/google-pregel-graph-processing.html
  * http://www.umiacs.umd.edu/~jimmylin/publications/Lin_Schatz_MLG2010.pdf
  * http://www.johnandcailin.com/blog/cailin/breadth-first-graph-search-using-iterative-map-reduce-algorithm
  * http://felix-halim.net/research/maxflow/index.php
  * http://www.quora.com/Social-Network-Analysis/How-do-I-order-the-nodes-of-a-social-network-to-get-the-best-locality-when-running-map-reduce-graph-algorithms
  * http://www.ntu.edu.sg/home/bshe/sigmod10_demo.pdf
  * http://kowshik.github.com/JPregel/pregel_paper.pdf
  * http://www.ml.cmu.edu/research/dap-papers/tsourakakisdap.pdf
  * http://theory.stanford.edu/~sergei/papers/spaa11-matchings.pdf
  * http://www.cs.purdue.edu/homes/kkambatl/papers/tr-graph-analysis.pdf
  * http://www.cloudera.com/blog/2010/11/do-the-schimmy-efficient-large-scale-graph-analysis-with-hadoop-part-2/
  * http://www.jofcis.com/publishedpapers/2011_7_7_2267_2276.pdf
  * http://www.cc.gatech.edu/~bader/papers/HybridMapReduce-MTAAP2010.pdf
  * http://www.cis.upenn.edu/~mkse212/slides/11-GraphAlgorithms.pptx
  * http://cs.ua.edu/691Vrbsky/2011/Slides/MRAlgs.pptx
  * http://www.sandia.gov/~sjplimp/papers/pc11.pdf

[[vlado:pub:sreda|Sreda]]: [[vlado:pub:sreda:1199]]