====== Exploratory data analysis ======

[[ru:hse:eda|HSE Exploratory data analysis]]; [[ru:hse:eda:arc|EDA archive]]

===== Detailed program =====

  - Introduction: descriptive, normative, predictive; measurement; sources
  - Univariate; distributions, basic statistics; visualization, fitting\\ Zipf's law, B??? law, tf-idf
  - Data formats: fixed, delimited (CSV), structural (RIS, GED, XML (SGML), JSON, HTML, SVG), download files, collect data from WWW\\ Data frame; R\\ Data collection, archiving, privacy, quality; metadata, missing data, not available\\ The task has its own name, feature engineering, and it is a hellishly laborious, manual and painful process. Feature engineering has far more impact on predictive accuracy than anything you can do in the modelling phase. Data preparation is the much more important process.\\ Predictive modelling; anomaly detection techniques; transformations
  - Multivariate; normalizations; visualization; Rajski, Fischer
  - Dissimilarities, clustering; matrix representation
  - D3.js, ggplot2
  - Symbolic data analysis; aggregation
  - Interactive graphics; what else

With digitalization, much data collection is moving from surveys to linking data from different databases in which the current state is tracked.

Likert scale?

===== Resources =====

  * [[notes:da:eda:ref|Bibliography]]
  * [[notes:da:eda:vid|Video]]
  * [[notes:da:eda:url|URLs]]

[[notes:datana]], [[notes:da:ref|Bibliography]]