Exploratory data analysis

Detailed program

  1. Introduction: descriptive, normative, predictive; measurement; sources
  2. Univariate; distributions, basic statistics; visualization, fitting
    Zipf's law, B??? law, tf-idf
  3. Data formats: fixed, delimited (CSV), structured (RIS, GED, XML (SGML), JSON, HTML, SVG); downloading files, collecting data from the WWW
    Data frame; R
    Data collection, Archiving, Privacy, Quality; metadata, missing data, not available
    The task has its own name, Feature Engineering, and it is a hellishly laborious, manual and painful process. Feature Engineering has far more impact on predictive accuracy than anything you can do in the modelling phase, which makes data preparation the much more important process.
    Predictive modelling; anomaly detection techniques; transformations
  4. Multivariate; normalizations; visualization; Rajski, Fischer
  5. Dissimilarities, clustering; matrix representation
  6. D3.js, ggplot2
  7. Symbolic data analysis; aggregation
  8. Interactive graphics; what else
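
As an illustration of the tf-idf entry under item 2, here is a minimal Python sketch using only the standard library. It assumes the common textbook weighting tf = count / document length and idf = log(N / df); other variants (smoothed idf, log-scaled tf) exist, and the three sample sentences and the function name `tf_idf` are made up for the example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting (one of several variants in use):
      tf  = term count / number of tokens in the document
      idf = log(N / df), df = number of documents containing the term
    """
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: count each term once per document.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Made-up sample documents, for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat chased the dog",
]
weights = tf_idf(docs)

# List the terms of the first document from most to least distinctive.
for term, w in sorted(weights[0].items(), key=lambda kv: -kv[1]):
    print(f"{term:8s} {w:.3f}")
```

A term that occurs in every document gets weight zero under this variant, while a term unique to one document gets the largest idf factor.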

With digitalization, much of data collection is moving from surveys to linking data from different databases in which the current state is tracked. Likert scale?
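
Linking records from different databases can be sketched as a join on a shared key. The following Python example is only an illustration: the two tables, the field names and the key `person_id` are hypothetical, and real record linkage usually also has to deal with inexact or missing keys. Records without a match keep `None` for the missing fields, which corresponds to the "not available" case mentioned in item 3.

```python
def link_records(left, right, key):
    """Left join of two lists of dicts on `key`.

    Every record of `left` is kept; fields from the matching `right`
    record are added, or set to None when there is no match.
    Assumes `key` is unique within `right`.
    """
    index = {row[key]: row for row in right}
    # All non-key fields that occur anywhere in `right`.
    right_fields = sorted({f for row in right for f in row if f != key})

    linked = []
    for row in left:
        match = index.get(row[key], {})
        merged = dict(row)
        for field in right_fields:
            merged[field] = match.get(field)  # None marks "not available"
        linked.append(merged)
    return linked


# Hypothetical extracts from two separate databases.
population = [
    {"person_id": 1, "birth_year": 1980},
    {"person_id": 2, "birth_year": 1975},
    {"person_id": 3, "birth_year": 1990},
]
employment = [
    {"person_id": 1, "employer": "A"},
    {"person_id": 2, "employer": "B"},
]

linked = link_records(population, employment, "person_id")
for row in linked:
    print(row)
```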

Resources

vlado/notes/eda.txt · Last modified: 2018/09/30 23:27 by vlado
 