Project 1

Make a data frame

From The World Factbook, construct a data frame in which the units (rows) are the world's countries, named as in the “book”, and the variables (columns) are:

  1. the first variable, V1, contains the two-character ISO code of the country, used for labels in visualizations;
  2. the second, third, fourth, and fifth variables are as given in the table (select a row from the table and send me a note for confirmation, or select your own variables V2, V3, V4, and V5 that you expect to be related in some way, and send me a note for confirmation);
  3. if you find it useful for your exploration, you can add further variables from the “book”.

For visual inspection, an additional variable region (North America, Central America, South America, Europe, Africa, Middle East, Central Asia, South Asia, East & Southeast Asia, Australia & Oceania, Antarctica) would be very useful. It can be constructed from the map on the entry page and the lists of each region's countries on the next level.
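One possible way to attach the region variable is a named lookup vector indexed by the ISO code. This is only a sketch: the three countries, the column name region, and the hand-typed lookup region_of are assumptions for illustration; the real assignment of countries to regions has to be taken from the Factbook's regional lists.

```r
# Toy data frame: row names are country names, V1 is the two-character ISO code.
df <- data.frame(V1 = c("AL", "BR", "JP"),
                 row.names = c("Albania", "Brazil", "Japan"),
                 stringsAsFactors = FALSE)

# Hypothetical lookup: ISO code -> region (typed by hand for this sketch).
region_of <- c(AL = "Europe",
               BR = "South America",
               JP = "East & Southeast Asia")

# Index the named vector by the codes; a code missing from the lookup gives NA.
df$region <- factor(unname(region_of[df$V1]))

print(df)
# Count countries per region, showing NA if any code was not matched.
print(table(df$region, useNA = "ifany"))
```

Keeping the lookup separate from the data frame makes it easy to check for unmatched codes before any plotting.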

n | student | V1 | V2 | V3 | V4 | V5
1 | Лев Ципес | ISO | Total area | Railways total | Roadways total | # of airports
2 |  | ISO | Total of population | Annual air passengers | # of airports | Urban population
3 | Роман Павлютин | ISO | Total of population | Telephones - mobile cellular | Internet users | Broadband - fixed subscriptions
4 |  | ISO | GDP per capita | Population growth rate | Net migration rate | Urban population
5 | Михаил Родченков | ISO | GDP real_growth_rate | unemployment_rate | reserves_of_foreign_exchange_and_gold | inflation_rate + natural_gas consumption
6 |  | ISO |  |  |  |
7 |  | ISO |  |  |  |
8 |  | ISO |  |  |  |
9 |  | ISO |  |  |  |
10 |  | ISO |  |  |  |
11 |  | ISO |  |  |  |
12 |  | ISO |  |  |  |
13 |  | ISO |  |  |  |
14 |  | ISO |  |  |  |
15 |  | ISO |  |  |  |
16 |  | ISO |  |  |  |
17 |  | ISO |  |  |  |
18 |  | ISO |  |  |  |
19 |  | ISO |  |  |  |

Save the created data frame as a CSV file.
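A minimal sketch of the saving step, assuming the country names are kept as row names. The file name project1.csv and the toy values are placeholders, not Factbook data; tempdir() is used only so the sketch does not write into the working directory.

```r
# Toy data frame standing in for the real one (placeholder values).
df <- data.frame(V1 = c("AA", "BB"),
                 V2 = c(1.5, NA),
                 row.names = c("Country A", "Country B"),
                 stringsAsFactors = FALSE)

# Hypothetical file name; replace tempdir() with your project folder.
csv_file <- file.path(tempdir(), "project1.csv")

# Row names hold the country names, so they are written as the first column.
write.csv(df, csv_file, row.names = TRUE, fileEncoding = "UTF-8")

# Read the file back to check the round trip; column 1 restores the row names.
back <- read.csv(csv_file, row.names = 1, stringsAsFactors = FALSE,
                 fileEncoding = "UTF-8")
print(back)
```

Reading the file back right after writing it is a cheap check that names, codes, and missing values survived.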

Explore the collected data. For visualization on a map see Maps or rworldmap.
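For the map, a sketch with rworldmap could look as follows. It assumes that V1 holds two-character ISO codes; the values in V2 and the map title are placeholders. The plotting part runs only if the rworldmap package is installed.

```r
# Toy data: V1 holds two-character ISO codes; V2 values are placeholders.
df <- data.frame(V1 = c("AL", "BR", "JP"),
                 V2 = c(1, 2, 3),
                 stringsAsFactors = FALSE)

# Draw only when rworldmap is available, so the sketch runs either way.
if (requireNamespace("rworldmap", quietly = TRUE)) {
  library(rworldmap)
  # Match the rows to map polygons through the ISO2 code stored in V1.
  world <- joinCountryData2Map(df, joinCode = "ISO2", nameJoinColumn = "V1")
  # Colour each matched country by V2.
  mapCountryData(world, nameColumnToPlot = "V2", mapTitle = "V2 by country")
}
```

joinCountryData2Map reports how many codes were matched, which is a quick way to spot codes that differ between the Factbook and the map data.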

Write a report and save it as a PDF file. Put the report and CSV file into a ZIP file and send it to me.

Hint: The factbook data are available as a JSON file at GitHub / Download

> library(jsonlite)
> J <- fromJSON(readLines("factbook.json"))
> str(J,max.level=2)
> J$countries[[4]]$data$name
[1] "Albania"
> J$countries$albania$data$name
[1] "Albania"
> names(J$countries)
> names(J$countries$albania$data)


Example extracting a selected variable from the Factbook.
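A minimal sketch of such an extraction in base R. The toy list J only imitates the shape seen in the session above (J$countries$<key>$data$name); the country keys, the inner path geography$area$total$value, the helper get_path, and all numbers are assumptions for illustration, so check the real paths with names() and str() first.

```r
# Toy stand-in for the parsed factbook.json; only $countries$<key>$data$name
# follows the session above, everything deeper is a hypothetical path.
J <- list(countries = list(
  country_a = list(data = list(name = "Country A",
    geography = list(area = list(total = list(value = 100))))),
  country_b = list(data = list(name = "Country B",
    geography = list(area = list(total = list(value = 250))))),
  country_c = list(data = list(name = "Country C",
    geography = list()))  # the variable is missing for this country
))

# Hypothetical helper: follow a path of names, return NA if a level is missing.
get_path <- function(x, path) {
  for (key in path) {
    if (!is.list(x) || is.null(x[[key]])) return(NA)
    x <- x[[key]]
  }
  x
}

# One value per country; vapply enforces a single numeric result each time.
path <- c("geography", "area", "total", "value")
area <- vapply(J$countries,
               function(cn) as.numeric(get_path(cn$data, path)),
               numeric(1))

# Country names become the row names of the data frame.
cnames <- vapply(J$countries, function(cn) cn$data$name, character(1))
df <- data.frame(V2 = unname(area), row.names = unname(cnames))
print(df)
```

Returning NA for a missing level keeps one row per country, so columns extracted for different variables stay aligned.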




ru/hse/eda21/stu/p1.txt · Last modified: 2021/12/08 13:47 by vlado
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 3.0 Unported