Project 1

Make a data frame

From The World Factbook, construct a data frame in which the units (rows) are the world's countries, named as in the “book”, and the variables (columns) are:

  1. the first variable, V1, contains the country's two-character ISO code, to be used for labels in visualizations;
  2. the second, third, fourth, and fifth variables are as given in the table (select a row from the table and send me a note for confirmation, or choose your own variables V2, V3, V4, and V5 that you expect to be related in some way, and send me a note for confirmation);
  3. if you find it useful for your exploration, you can add additional variables from the “book”.
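The layout described in the list can be sketched in base R as follows. This is only a skeleton under assumptions: the three countries are an arbitrary sample, and V2 to V5 are left as missing-value placeholders to be filled from the Factbook.

```r
# Skeleton of the requested data frame: one row per country, V1..V5 as columns.
# The three countries are an arbitrary sample; V2..V5 are placeholders only.
countries <- c("Albania", "Austria", "Brazil")

df <- data.frame(
  V1 = c("AL", "AT", "BR"),  # two-character ISO codes, used as plot labels
  V2 = NA_real_,             # to be filled with the chosen Factbook variables
  V3 = NA_real_,
  V4 = NA_real_,
  V5 = NA_real_,
  row.names = countries,     # units are identified by the Factbook country name
  stringsAsFactors = FALSE
)

str(df)
```

Keeping the country names as row names and the ISO code as an ordinary column means the codes remain available as short labels in plots.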

To make variables comparable across countries, you can also use derived variables such as (Labor force / Population), (Annual passenger traffic on registered air carriers / Population), or (Total airports / Land area).
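A minimal sketch of such derived variables, using made-up numbers for two fictional units "A" and "B"; the column names are hypothetical and not taken from the Factbook.

```r
# Toy values for two fictional units; none of these numbers are real data.
df <- data.frame(
  labor_force = c(50, 200),
  population  = c(100, 1000),
  airports    = c(4, 10),
  land_area   = c(200, 1000),
  row.names   = c("A", "B")
)

# Ratios remove the effect of country size and make units comparable.
df$labor_share     <- df$labor_force / df$population  # Labor force / Population
df$airport_density <- df$airports / df$land_area      # Total airports / Land area

print(df[, c("labor_share", "airport_density")])
```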

For visual inspection, an additional variable region (North America, Central America, South America, Europe, Africa, Middle East, Central Asia, South Asia, East & Southeast Asia, Australia & Oceania, Antarctica) would be very useful. It can be constructed from the map on the entry page and the lists of countries for each region on the next level.
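One possible way to build the region variable is to type in the country list of each region and invert it into a country-to-region lookup. The sketch below assumes only a few regions and countries as a sample; the object names are hypothetical.

```r
# Country lists per region, as they would be copied from the region pages.
# Only a small sample is shown here.
region_lists <- list(
  "Europe"        = c("Albania", "Austria"),
  "South America" = c("Brazil"),
  "Africa"        = c("Kenya")
)

# Invert the lists into a named vector: country name -> region label.
lookup <- setNames(
  rep(names(region_lists), lengths(region_lists)),
  unlist(region_lists, use.names = FALSE)
)

# Map the countries of the data frame (in their row order) to regions.
countries <- c("Albania", "Brazil", "Kenya", "Austria")
region <- factor(unname(lookup[countries]), levels = names(region_lists))

print(table(region))
```

Countries missing from the lists get NA as their region, which makes omissions easy to spot with `countries[is.na(region)]`.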

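n  | student              | V1  | V2                 | V3                       | V4                            | V5
1  | Jiaxuan Wang         | ISO | Physicians density | Hospital bed density     | Total fertility rate          | Total population life expectancy at birth
2  | Alisa Ignatova       | ISO | Birth rate         | Death rate               | Net migration rate            | Urban population %
3  | Ekaterina Kibalchich | ISO | urban population   | carbon dioxide emissions | energy consumption per capita | real GDP
4  |                      | ISO |                    |                          |                               |
5  |                      | ISO |                    |                          |                               |
6  |                      | ISO |                    |                          |                               |
7  |                      | ISO |                    |                          |                               |
8  |                      | ISO |                    |                          |                               |
9  |                      | ISO |                    |                          |                               |
10 |                      | ISO |                    |                          |                               |
11 |                      | ISO |                    |                          |                               |
12 |                      | ISO |                    |                          |                               |
13 |                      | ISO |                    |                          |                               |
14 |                      | ISO |                    |                          |                               |
15 |                      | ISO |                    |                          |                               |
16 |                      | ISO |                    |                          |                               |
17 |                      | ISO |                    |                          |                               |
18 |                      | ISO |                    |                          |                               |
19 |                      | ISO |                    |                          |                               |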

Save the created data frame as a CSV file.
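Saving and re-reading the data frame might look like the sketch below. The file name is hypothetical and the numeric values are placeholders. One point worth checking is the ISO code of Namibia, "NA", which `read.csv` would otherwise interpret as a missing value; writing missing values as empty fields and reading with `na.strings = ""` is one way around this.

```r
# Small example frame; the V2 values are placeholders, not Factbook data.
df <- data.frame(
  V1 = c("AL", "NA"),   # "NA" is the ISO code of Namibia, not a missing value
  V2 = c(1.5, NA),      # a genuinely missing numeric value
  row.names = c("Albania", "Namibia"),
  stringsAsFactors = FALSE
)

# Hypothetical file name; tempdir() is used only to keep the sketch harmless.
csv_file <- file.path(tempdir(), "factbook.csv")

# Write missing values as empty fields so they differ from the string "NA".
write.csv(df, csv_file, row.names = TRUE, na = "", fileEncoding = "UTF-8")

# Read back: only empty fields count as missing, so "NA" stays a string.
back <- read.csv(csv_file, row.names = 1, na.strings = "",
                 stringsAsFactors = FALSE, fileEncoding = "UTF-8")
print(back)
```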

Explore the collected data. For visualization on a map see Maps or rworldmap.
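A first look at the data could start with summaries, correlations, and a scatterplot labelled by V1. The sketch below uses made-up values and placeholder labels and relies on base R only; the output file name is hypothetical.

```r
# Made-up values and placeholder labels, for illustration only.
df <- data.frame(
  V1 = c("AA", "BB", "CC", "DD"),  # placeholder labels, not real ISO codes
  V2 = c(1.0, 2.0, 3.0, 4.0),
  V3 = c(2.1, 3.9, 6.2, 7.8)
)

# Numeric summaries and pairwise correlations of the selected variables.
print(summary(df[, c("V2", "V3")]))
cm <- cor(df[, c("V2", "V3")], use = "pairwise.complete.obs")
print(cm)

# Scatterplot with the codes from V1 drawn in place of points.
plot_file <- file.path(tempdir(), "v2_v3.pdf")  # hypothetical file name
pdf(plot_file)
plot(df$V2, df$V3, type = "n", xlab = "V2", ylab = "V3")
text(df$V2, df$V3, labels = df$V1)
invisible(dev.off())
```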

Write a report and save it as a PDF file. Put the report and CSV file into a ZIP file and send it to me.

Hint: The factbook data are available as a JSON file at GitHub / Download

> library(jsonlite)
> J <- fromJSON(readLines("factbook.json"))
> str(J,max.level=2)
> J$countries[[4]]$data$name
[1] "Albania"
> J$countries$albania$data$name
[1] "Albania"
> names(J$countries)
> names(J$countries$albania$data)


Example extracting a selected variable from the Factbook.
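Extracting one variable for all countries can be sketched as follows. To stay self-contained, the code builds a small mock of the list returned by `fromJSON()`. Only the `countries`, `data`, and `name` levels appear in the hint above; the path `people$population$total`, the helper `get_field`, and the value 100 are assumptions made for illustration and should be checked against `names(J$countries$albania$data)`.

```r
# Mock of the nested list; with the real file use J <- fromJSON("factbook.json").
# The people/population/total path and the value 100 are assumptions.
J <- list(countries = list(
  albania = list(data = list(
    name = "Albania",
    people = list(population = list(total = 100))
  )),
  andorra = list(data = list(
    name = "Andorra",
    people = list()               # field missing for this country
  ))
))

# Hypothetical helper: walk a path of keys below $data, NA if a level is missing.
get_field <- function(country, path) {
  x <- country$data
  for (key in path) {
    if (!is.list(x) || is.null(x[[key]])) return(NA_real_)
    x <- x[[key]]
  }
  x
}

# One value per country; vapply fails loudly if a value is not a single number.
name <- vapply(J$countries, function(cn) cn$data$name, character(1))
pop  <- vapply(J$countries, get_field, numeric(1),
               path = c("people", "population", "total"))

df <- data.frame(population = pop, row.names = name)
print(df)
```

Returning NA for missing levels keeps the result aligned with the country list, since not every country has every field.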




ru/hse/eda22/stu/p1.txt · Last modified: 2022/12/15 02:25 by vlado
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 3.0 Unported