Data tables are represented in R as data frames (class data.frame). A data frame is essentially a special list in which all items are vectors (columns, variables) of the same length. A data frame can be created with an assignment of the form:
D <- data.frame(e1, e2, ... , ek, row.names=units)
where units is a vector containing the names of the units, and each ei has either the form ni=vi or vi (ni is the name of the i-th column and vi is its vector of values).
> units <- c('a','b','c','d','e')
> u <- c(1,2,3,5,5)
> v <- c(1,3,2,3,5)
> D <- data.frame(U=u,V=v,row.names=units)
> D
  U V
a 1 1
b 2 3
c 3 2
d 5 3
e 5 5
> plot(D)
> D$V
[1] 1 3 2 3 5
> D[[2]]
[1] 1 3 2 3 5
We can get the i-th variable with the expression D$ni or D[[i]].
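The list nature of a data frame and the two access forms can be checked directly. The following is a minimal sketch that rebuilds the small data frame from the session above; the extra checks (is.list, single-bracket indexing, lookup by row name) are additions for illustration and are not part of the original session.

```r
# Rebuild the small example data frame with named rows.
units <- c("a", "b", "c", "d", "e")
D <- data.frame(U = c(1, 2, 3, 5, 5), V = c(1, 3, 2, 3, 5), row.names = units)

is.list(D)                 # a data frame is a list of columns
sapply(D, length)          # every column has the same length

identical(D$V, D[[2]])     # access by name and by position give the same vector
identical(D[["V"]], D$V)   # double brackets also accept the column name

class(D["V"])              # single brackets keep the data frame structure
D["c", "U"]                # one cell: row selected by unit name, column by name
```

Double brackets and $ return the column as a plain vector, while single brackets return a (smaller) data frame.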
A data frame D can also be created interactively using
D <- edit(data.frame())
We can also prepare it as an Excel spreadsheet, save it as a CSV file, and read it into R - see the next chapter.
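As a small preview of the CSV route, here is a sketch of a round trip through a CSV file. It assumes the example data frame with columns U and V; the temporary file created with tempfile() is only a stand-in for a file exported from a spreadsheet.

```r
# Example data frame with unit names as row names.
D <- data.frame(U = c(1, 2, 3, 5, 5), V = c(1, 3, 2, 3, 5),
                row.names = c("a", "b", "c", "d", "e"))

csv_file <- tempfile(fileext = ".csv")    # stand-in for a real CSV file
write.csv(D, csv_file)                    # row names are written as the first column
D2 <- read.csv(csv_file, row.names = 1)   # use the first column as row names again

identical(rownames(D), rownames(D2))      # unit names survive the round trip
all(D == D2)                              # and so do the values
unlink(csv_file)                          # remove the temporary file
```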
We get the current working directory using the function getwd(). We change it with the function setwd(path), for example
> getwd()
[1] "C:/Users/batagelj/test/ru/R"
> setwd("C:/Users/batagelj/work/R/Cluster")
> getwd()
[1] "C:/Users/batagelj/work/R/Cluster"
It can also be changed using the option Change dir … in the File menu.
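In a script it is often convenient to change the working directory only temporarily. The sketch below relies on the fact that setwd() invisibly returns the previous directory; tempdir() is used here only as a stand-in for a real project path.

```r
old <- setwd(tempdir())       # change directory; setwd() returns the previous one
getwd()                       # now the session's temporary directory

# Relative file names are resolved against the working directory.
writeLines("test", "note.txt")
file.exists(file.path(getwd(), "note.txt"))
unlink("note.txt")

setwd(old)                    # restore the original working directory
```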
We can save the values of selected variables to the file save.Rdata (in the working directory) using the command
dump(c("v1","v2",...,"vk"),"save.Rdata")
When we need them again, we can load them using
source("save.Rdata")
See also the functions save and load.
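The two pairs of functions differ in file format: dump writes plain R source text that source re-evaluates (whatever extension the file name has), while save writes a binary image that load reads back. A minimal sketch of both, using temporary files instead of the working directory; the variable names u and v are taken from the earlier example.

```r
u <- c(1, 2, 3, 5, 5)
v <- c(1, 3, 2, 3, 5)

# Text route: dump() writes assignments as R code, source() runs them again.
txt_file <- tempfile(fileext = ".R")
dump(c("u", "v"), txt_file)
readLines(txt_file)            # the file contains ordinary R assignments
rm(u, v)
source(txt_file)               # u and v are defined again

# Binary route: save() stores the objects, load() restores them by name.
bin_file <- tempfile(fileext = ".RData")
save(u, v, file = bin_file)
rm(u, v)
load(bin_file)                 # u and v are defined again

unlink(c(txt_file, bin_file))  # remove the temporary files
```

The text file can be inspected and edited by hand; the binary file is usually the safer choice for large or complex objects.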
> data(package=.packages(all.available=TRUE))
> data(Pottery)
> help(Pottery)
> summary(Pottery)
          Site          Al              Fe              Mg              Ca               Na
 AshleyRails: 5   Min.   :10.10   Min.   :0.920   Min.   :0.530   Min.   :0.0100   Min.   :0.0300
 Caldicot   : 2   1st Qu.:11.95   1st Qu.:1.700   1st Qu.:0.670   1st Qu.:0.0600   1st Qu.:0.0500
 IsleThorns : 5   Median :13.80   Median :5.465   Median :3.825   Median :0.1550   Median :0.1500
 Llanedyrn  :14   Mean   :14.49   Mean   :4.468   Mean   :3.142   Mean   :0.1465   Mean   :0.1585
                  3rd Qu.:17.45   3rd Qu.:6.590   3rd Qu.:4.503   3rd Qu.:0.2150   3rd Qu.:0.2150
                  Max.   :20.80   Max.   :7.090   Max.   :7.230   Max.   :0.3100   Max.   :0.5400
> dim(Pottery)
[1] 26  6
> library(cluster)
> help(ruspini)
> data(ruspini)
> head(ruspini)
> summary(ruspini)
> plot(ruspini)
See also data sets: flower, plantTraits, animals, iris, mtcars, etc.
> data(iris)
> help(iris)
> dim(iris)
[1] 150   5
> summary(iris)
  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width          Species
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100   setosa    :50
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300   versicolor:50
 Median :5.800   Median :3.000   Median :4.350   Median :1.300   virginica :50
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500
> head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
> tail(iris)
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
145          6.7         3.3          5.7         2.5 virginica
146          6.7         3.0          5.2         2.3 virginica
147          6.3         2.5          5.0         1.9 virginica
148          6.5         3.0          5.2         2.0 virginica
149          6.2         3.4          5.4         2.3 virginica
150          5.9         3.0          5.1         1.8 virginica
> pairs(iris)
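Besides summary, head and tail, a few more base R calls help when first looking at a data set. The following sketch uses the iris data from the session above; the choice of functions (str, sapply, table, colMeans) is only one possible way to inspect a data frame.

```r
data(iris)

str(iris)                    # compact overview: column types and first values
sapply(iris, class)          # class of each column (four numeric, one factor)
table(iris$Species)          # number of units in each species

# Select only the numeric columns and compute their means.
num <- iris[sapply(iris, is.numeric)]
colMeans(num)
```

The column means computed this way agree with the Mean row printed by summary(iris).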