Project 3

Explore a large data set

Many data sets are available: planes (1, 2), bikes, taxi, Kaggle, Data world, European Social Survey, V-dem, Pew research, Food data, …, or your own source. Explore the selected data set: select variables and explore them (distribution, extreme values, …); explore relations among variables (pairs, clustering, regression, derived quantities, interesting observations); and propose ideas for more detailed analyses.
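As a starting point for the first exploration steps (distribution, extreme values, relation between a pair of variables), a minimal sketch in Python using only the standard library might look as follows. The function names `summarize` and `pearson`, the 1.5 × IQR rule for flagging extreme values, and the toy numbers are assumptions made for illustration; they are not prescribed by the assignment, and any other tool or rule may be used.

```python
import statistics


def summarize(values):
    """Five-number summary plus values flagged by the 1.5 * IQR rule."""
    # Quartiles by linear interpolation over the sorted data.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "n": len(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        # Candidates for closer inspection, not automatically errors.
        "extremes": [v for v in values if v < low or v > high],
    }


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


if __name__ == "__main__":
    # Toy values invented for illustration only.
    x = [1, 2, 3, 4, 5, 6, 7, 100]
    y = [2, 4, 6, 8, 10, 12, 14, 16]
    print(summarize(x))
    print(pearson(x, y))
```

A flagged value is only a hint to look at the unit more closely; whether it is a data error or a genuinely interesting observation has to be decided from the context of the data set.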

For example, what is the impact of vaccination on the COVID situation in different countries?

The selected data set must have at least 10000 units; for a temporal data set, the product (number of units) × (number of time points) must be at least 10000.
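The size requirement can be checked with a one-line computation. The sketch below is only an illustration: the function name `meets_size_requirement` and the example counts are hypothetical, and a non-temporal data set is treated as having a single time point.

```python
MIN_SIZE = 10000  # threshold stated in the assignment


def meets_size_requirement(n_units, n_time_points=1):
    """Return True if units * time points reaches the required size."""
    return n_units * n_time_points >= MIN_SIZE


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(meets_size_requirement(12000))    # cross-sectional data set
    print(meets_size_requirement(200, 60))  # 200 units observed at 60 time points
    print(meets_size_requirement(150, 50))  # 7500 observations, too small
```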

Before starting the analysis, send me a note about your selection for confirmation.

n student dataset
1 Михаил Родченков Big Five Personality Test / Kaggle
2 Ципес Лев Video Game Sales / Kaggle
3 Крутоголов Дмитрий Андреевич Diamonds / Kaggle
4 Roman Pavlyutin Google Play Store Apps / Kaggle
5 Байрамов Емил Ровшан оглы FilmTV movies dataset / Kaggle
6 Чо Денис Сокмунович Netflix / Kaggle
7 Акинде Мэри Айобами Song Popularity Dataset / Kaggle
8
9
10
11
12
13
14
15
16
17
18
19



ru/hse/eda21/stu/p3.txt · Last modified: 2022/01/13 16:19 by vlado
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 3.0 Unported