====== Data frames ======

===== Basics about data frames =====

Data tables are represented in R as **data frames** (''data.frame''). A data frame is essentially a special ''list'' in which all items are vectors (columns, variables) of the same length.

A data frame can be created with an assignment of the form:

  D <- data.frame(e1, e2, ... , ek, row.names=units)

where ''units'' is a vector containing the names of the units, and each ''ei'' has either the form ''ni=vi'' or ''vi'' (''ni'' is the name of the i-th column and ''vi'' is the vector of its values).

  > units <- c('a','b','c','d','e')
  > u <- c(1,2,3,5,5)
  > v <- c(1,3,2,3,5)
  > D <- data.frame(U=u,V=v,row.names=units)
  > D
    U V
  a 1 1
  b 2 3
  c 3 2
  d 5 3
  e 5 5
  > plot(D)
  > D$V
  [1] 1 3 2 3 5
  > D[[2]]
  [1] 1 3 2 3 5

We can get the i-th variable with the expression ''D$ni'' or ''D[[i]]''.

A data frame D can also be created interactively using

  D <- edit(data.frame())

We can also prepare it as an Excel spreadsheet, save it as a CSV file, and read it into R - see the next chapter.

==== Test ====

Represent the following data as a data frame

{{ru:7iss:labs:data1.jpg?400}}

and

{{ru:7iss:labs:data2.jpg?400}}

[[ru:7iss:labs:s:s2|Solution]]

===== Working directory =====

We get the current working directory using the function ''getwd()''. We change it with the function ''setwd(''//path//'')'', for example

  > getwd()
  [1] "C:/Users/batagelj/test/ru/R"
  > setwd("C:/Users/batagelj/work/R/Cluster")
  > getwd()
  [1] "C:/Users/batagelj/work/R/Cluster"

It can also be changed using the option ''Change dir ...'' in the menu ''File''.

We can save the values of selected variables to the file //save//''.Rdata'' (in the working directory) using the command

  dump(c("v1","v2",...,"vk"),"save.Rdata")

Note that ''dump'' writes the variables as R source text. When we need them again we can load them using

  source("save.Rdata")

See also the functions ''save'' and ''load'', which store and restore variables in R's binary format.

===== Data from R packages =====

The first command lists the data sets of all installed packages. A data set can be loaded with ''data'' only if its package is attached (or named in the ''package'' argument); the ''Pottery'' data set used below appears to be the one distributed with the package carData (formerly car).

  > data(package=.packages(all.available=TRUE))
  > data(Pottery)
  > help(Pottery)
  > summary(Pottery)
            Site          Al              Fe              Mg              Ca               Na
   AshleyRails: 5   Min.   :10.10   Min.   :0.920   Min.   :0.530   Min.   :0.0100   Min.   :0.0300
   Caldicot   : 2   1st Qu.:11.95   1st Qu.:1.700   1st Qu.:0.670   1st Qu.:0.0600   1st Qu.:0.0500
   IsleThorns : 5   Median :13.80   Median :5.465   Median :3.825   Median :0.1550   Median :0.1500
   Llanedyrn  :14   Mean   :14.49   Mean   :4.468   Mean   :3.142   Mean   :0.1465   Mean   :0.1585
                    3rd Qu.:17.45   3rd Qu.:6.590   3rd Qu.:4.503   3rd Qu.:0.2150   3rd Qu.:0.2150
                    Max.   :20.80   Max.   :7.090   Max.   :7.230   Max.   :0.3100   Max.   :0.5400
  > dim(Pottery)
  [1] 26  6

  > library(cluster)
  > help(ruspini)
  > data(ruspini)
  > head(ruspini)
  > summary(ruspini)
  > plot(ruspini)

See also the data sets: flower, plantTraits, animals, iris, mtcars, etc.

  > data(iris)
  > help(iris)
  > dim(iris)
  [1] 150   5
  > summary(iris)
    Sepal.Length    Sepal.Width     Petal.Length    Petal.Width          Species
   Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100   setosa    :50
   1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300   versicolor:50
   Median :5.800   Median :3.000   Median :4.350   Median :1.300   virginica :50
   Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
   3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
   Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500
  > head(iris)
    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
  1          5.1         3.5          1.4         0.2  setosa
  2          4.9         3.0          1.4         0.2  setosa
  3          4.7         3.2          1.3         0.2  setosa
  4          4.6         3.1          1.5         0.2  setosa
  5          5.0         3.6          1.4         0.2  setosa
  6          5.4         3.9          1.7         0.4  setosa
  > tail(iris)
      Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
  145          6.7         3.3          5.7         2.5 virginica
  146          6.7         3.0          5.2         2.3 virginica
  147          6.3         2.5          5.0         1.9 virginica
  148          6.5         3.0          5.2         2.0 virginica
  149          6.2         3.4          5.4         2.3 virginica
  150          5.9         3.0          5.1         1.8 virginica
  > pairs(iris)

{{ru:7iss:labs:iris.png}}

{{ru:7iss:labs:iris.pdf}}
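
The section on the working directory mentions two ways of storing variables: ''dump''/''source'' and ''save''/''load''. The following is a minimal sketch of both round trips, applied to the small data frame from the first section. The file names and the use of a temporary directory are assumptions made for this illustration only; they are not part of the original lab.

```r
# Small data frame from the section "Basics about data frames"
units <- c('a','b','c','d','e')
D <- data.frame(U = c(1,2,3,5,5), V = c(1,3,2,3,5), row.names = units)
D_orig <- D   # keep a copy for comparison

# Hypothetical file names in a temporary directory (assumption for this sketch)
txt_file <- file.path(tempdir(), "save.Rdata")
bin_file <- file.path(tempdir(), "save_binary.RData")

# dump() writes the object as R source text; source() evaluates that text again
dump(c("D"), txt_file)
rm(D)
source(txt_file, local = TRUE)   # re-creates D in the current environment
print(all.equal(D, D_orig))

# save() writes a binary file; load() restores the object under its original name
save(D, file = bin_file)
rm(D)
load(bin_file)
print(all.equal(D, D_orig))
```

A file written by ''dump'' can be opened and edited in a text editor, while a file written by ''save'' is usually more compact; which of the two is preferable depends on how the data will be used later.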