====== Data frames ======
===== Basics about data frames =====
Data tables are represented in R as **data frames** (class ''data.frame''). A data frame is essentially a special ''list'' in which all items are vectors (columns, variables) of the same length. A data frame can be created with an assignment of the form:
D <- data.frame(e1, e2, ... , ek, row.names=units)
where ''units'' is a vector containing the names of the units (rows), and each ei has either the form ni=vi or just vi (ni is the name of the i-th column and vi is its vector of values).
> units <- c('a','b','c','d','e')
> u <- c(1,2,3,5,5)
> v <- c(1,3,2,3,5)
> D <- data.frame(U=u,V=v,row.names=units)
> D
  U V
a 1 1
b 2 3
c 3 2
d 5 3
e 5 5
> plot(D)
> D$V
[1] 1 3 2 3 5
> D[[2]]
[1] 1 3 2 3 5
We can get the i-th variable with the expression D$ni or D[[i]].
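Since a data frame is a list of columns, the usual list operations apply to it, and the columns may be of different types. The following is a minimal sketch that rebuilds D from the example above; the column ''W'' is a hypothetical addition used only for illustration.

```r
# Rebuild the data frame D from the example above
units <- c('a','b','c','d','e')
u <- c(1,2,3,5,5)
v <- c(1,3,2,3,5)
D <- data.frame(U=u, V=v, row.names=units)

# A data frame is a list of equally long columns
is.list(D)                  # TRUE
nrow(D)                     # number of units (rows): 5
ncol(D)                     # number of variables (columns): 2

# Equivalent ways to get the second variable
identical(D$V, D[[2]])      # TRUE
identical(D$V, D[["V"]])    # TRUE

# A single value selected by row name and column name
D["d", "U"]                 # 5

# Columns can have different types; W is a hypothetical character column
D$W <- c("x","y","x","y","x")
str(D)                      # shows the type of every column
```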
A data frame D can also be created interactively using
D <- edit(data.frame())
We can also prepare it as an Excel spreadsheet, save it as a CSV file, and read it into R - see the next chapter.
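As a rough preview of the CSV route, the sketch below writes a data frame to a CSV file and reads it back. It assumes the standard functions ''write.csv'' and ''read.csv''; a temporary file is used so that nothing in the working directory is overwritten, and the options described in the next chapter may differ.

```r
# A small data frame with named rows
D <- data.frame(U=c(1,2,3,5,5), V=c(1,3,2,3,5),
                row.names=c('a','b','c','d','e'))

# Write it to a CSV file (a temporary file, for illustration only)
f <- tempfile(fileext=".csv")
write.csv(D, f)                 # row names are written as the first column

# Read it back; row.names=1 takes the row names from the first column
E <- read.csv(f, row.names=1)
E

unlink(f)                       # remove the temporary file
```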
==== Test ====
Represent the following data as a data frame
{{ru:7iss:labs:data1.jpg?400}}
and
{{ru:7iss:labs:data2.jpg?400}}
[[ru:7iss:labs:s:s2|Solution]]
===== Working directory =====
We get the current working directory using the function ''getwd()''. We change it with the function ''setwd(''//path//'')'', for example
> getwd()
[1] "C:/Users/batagelj/test/ru/R"
> setwd("C:/Users/batagelj/work/R/Cluster")
> getwd()
[1] "C:/Users/batagelj/work/R/Cluster"
It can also be changed using the option ''Change dir ...'' in the ''File'' menu.
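In a script it is often convenient to remember the current directory and return to it afterwards. A minimal sketch, using R's per-session temporary directory as a stand-in for a real project directory:

```r
old <- getwd()          # remember the current working directory
setwd(tempdir())        # tempdir() stands in for a real project directory
getwd()                 # relative file names are now resolved here
setwd(old)              # return to the original directory
```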
We can save the values of selected variables to the file //save//''.Rdata'' (in the working directory) using the command
dump(c("v1","v2",...,"vk"),"save.Rdata")
The function ''dump'' writes the variables as R source text. When we need them again we can load them using
source("save.Rdata")
See also the functions ''save'' and ''load'', which store objects in a binary format.
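The two approaches can be compared in a small sketch. The variables ''v1'' and ''v2'' and the temporary file names are made up for illustration; the saved objects are restored into separate environments so that the originals stay available for comparison.

```r
v1 <- c(1, 2, 3)
v2 <- data.frame(U=c(1,2), V=c(3,4))

# dump() writes R source text that recreates the objects; source() runs it
f <- tempfile(fileext=".R")
dump(c("v1","v2"), f)
e1 <- new.env()
source(f, local=e1)         # recreate the objects inside e1
identical(e1$v1, v1)

# save() writes a binary file; load() restores the objects from it
g <- tempfile(fileext=".RData")
save(v1, v2, file=g)
e2 <- new.env()
load(g, envir=e2)           # restore the objects inside e2
identical(e2$v1, v1)

unlink(c(f, g))             # remove the temporary files
```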
===== Data from R packages =====
> data(package=.packages(all.available=TRUE))
> library(carData)   # Pottery comes from the package carData (older versions: car)
> data(Pottery)
> help(Pottery)
> summary(Pottery)
          Site          Al              Fe              Mg              Ca               Na
 AshleyRails: 5   Min.   :10.10   Min.   :0.920   Min.   :0.530   Min.   :0.0100   Min.   :0.0300
 Caldicot   : 2   1st Qu.:11.95   1st Qu.:1.700   1st Qu.:0.670   1st Qu.:0.0600   1st Qu.:0.0500
 IsleThorns : 5   Median :13.80   Median :5.465   Median :3.825   Median :0.1550   Median :0.1500
 Llanedyrn  :14   Mean   :14.49   Mean   :4.468   Mean   :3.142   Mean   :0.1465   Mean   :0.1585
                  3rd Qu.:17.45   3rd Qu.:6.590   3rd Qu.:4.503   3rd Qu.:0.2150   3rd Qu.:0.2150
                  Max.   :20.80   Max.   :7.090   Max.   :7.230   Max.   :0.3100   Max.   :0.5400
> dim(Pottery)
[1] 26 6
> library(cluster)
> help(ruspini)
> data(ruspini)
> head(ruspini)
> summary(ruspini)
> plot(ruspini)
See also the data sets flower, plantTraits, and animals (package ''cluster''), and iris and mtcars (package ''datasets'').
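To see which data sets a single package provides, and to load one of them without attaching the package, something like the following should work. The sketch uses the standard package ''datasets'', which is always installed; the same calls can be tried with ''cluster''.

```r
# List the data sets shipped with one package
info <- data(package="datasets")$results
head(info[, c("Item", "Title")])

# Load a data set from a package without calling library()
data(mtcars, package="datasets")
dim(mtcars)        # 32 rows, 11 columns
```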
> data(iris)
> help(iris)
> dim(iris)
[1] 150 5
> summary(iris)
  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width          Species
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100   setosa    :50
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300   versicolor:50
 Median :5.800   Median :3.000   Median :4.350   Median :1.300   virginica :50
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500
> head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
> tail(iris)
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
145          6.7         3.3          5.7         2.5 virginica
146          6.7         3.0          5.2         2.3 virginica
147          6.3         2.5          5.0         1.9 virginica
148          6.5         3.0          5.2         2.0 virginica
149          6.2         3.4          5.4         2.3 virginica
150          5.9         3.0          5.1         1.8 virginica
> pairs(iris)
{{ru:7iss:labs:iris.png}}
{{ru:7iss:labs:iris.pdf}}
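The iris data also illustrate that a data frame can mix numeric columns with a factor. As a possible next step after the overview above, the sketch below inspects the column types and computes the mean of every measurement for each species; the variable name ''m'' is made up for illustration.

```r
sapply(iris, class)       # four numeric columns and one factor (Species)
levels(iris$Species)      # the three species names
table(iris$Species)       # number of units per species

# Mean of every numeric variable within each species
m <- aggregate(. ~ Species, data=iris, FUN=mean)
m
```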
\\ \\
[[ru:7iss#labs|Back to 7ISS Labs]]