There are five basic classes of objects in R: character, numeric, integer, complex, and logical. Here are some examples:
c1 <- "Hello World" # this is a character object
c2 <- "CS 133" # this is also a character object
c12 <- paste(c1, c2, sep = "") # combine two character objects; sep is optional and defaults to a single space
print(c1)
## [1] "Hello World"
print(c2)
## [1] "CS 133"
print(c12)
## [1] "Hello WorldCS 133"
class(c1) # get the class type of c1
## [1] "character"
class(c2)
## [1] "character"
class(c12)
## [1] "character"
subc <- substr(c12, start = 2, stop = 5) # get substring from a character object
print(subc) # print the value of subc object
## [1] "ello"
print("Now we are going to show examples of numbers")
## [1] "Now we are going to show examples of numbers"
x <- 10 # assign a decimal value; this is a numeric object
print(x)
## [1] 10
class(x)
## [1] "numeric"
x <- 10L # assign an integer value; this is an integer object
print(x)
## [1] 10
class(x)
## [1] "integer"
x <- 1/0 # example of Inf for infinity
print(x)
## [1] Inf
x <- 0/0 # example of undefined value
print(x)
## [1] NaN
1/Inf # example of using infinity
## [1] 0
y <- 1+2i # example of a complex object
y
## [1] 1+2i
class(y) # get the class information
## [1] "complex"
a1 <- 10
a2 <- 20
a3 <- 30
z1 <- a1>a2 # z1 is now a logical object
z1
## [1] FALSE
class(z1)
## [1] "logical"
z2 <- a1<a3
z <- z1 & z2 # logical and operation
z
## [1] FALSE
z <- z1 | z2 # logical or operation
z
## [1] TRUE
z <- !z1 # logical not operation
z
## [1] TRUE
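The difference between numeric and integer objects shown above is easy to miss, because both print the same way. Here is a minimal sketch of how one might check it; the variable names x_num and x_int are illustrative and not part of the examples above.

```r
x_num <- 10   # no suffix: stored as a double, so the class is "numeric"
x_int <- 10L  # L suffix: stored as an integer

typeof(x_num)
## [1] "double"
typeof(x_int)
## [1] "integer"

is.numeric(x_int) # integers also count as numeric for is.numeric()
## [1] TRUE
x_num == x_int # the two values compare equal
## [1] TRUE
identical(x_num, x_int) # but the objects are not identical, because the types differ
## [1] FALSE
class(x_int + 1.5) # mixing integer and numeric in arithmetic gives numeric
## [1] "numeric"
```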
The c() function can be used to create vectors of objects by concatenating things together:
x <- c(0.5, 0.6) ## numeric vector
x
## [1] 0.5 0.6
x <- c(1L,3L) ## integer vector
x
## [1] 1 3
x <- c(TRUE, FALSE) ## logical vector
x
## [1] TRUE FALSE
x <- c(T, F) ## logical vector
x
## [1] TRUE FALSE
x <- c("a", "b", "c") ## character vector
x
## [1] "a" "b" "c"
x <- 2:13 ## integer vector
x
## [1] 2 3 4 5 6 7 8 9 10 11 12 13
x <- c(1+0i, 2+4i) ## complex vector
x[0] # indexing with 0 returns an empty vector of the same type as x
## complex(0)
x
## [1] 1+0i 2+4i
class(x) # show the class of variable x
## [1] "complex"
x[1] # print the first element of x
## [1] 1+0i
y <- c(1.7, "a") ## mixing values in a vector; in this case both become character
y
## [1] "1.7" "a"
y <- c(TRUE, 2) ## numeric vector
y
## [1] 1 2
y <- c("a", TRUE) ## character vector
y
## [1] "a" "TRUE"
x <- c(1,3,5,6,7)
y <- c(2,5,7,3,6)
x - y # vector operations
## [1] -1 -2 -2 3 1
x + y
## [1] 3 8 12 9 13
x[2:4] # get subset of vector
## [1] 3 5 6
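The mixed-type vectors above suggest a fixed order of coercion: logical, then integer, then numeric, then complex, then character. Here is a short sketch that checks each step; the names flags and n_true are only for illustration.

```r
class(c(TRUE, 1L)) # logical mixed with integer
## [1] "integer"
class(c(1L, 2.5)) # integer mixed with numeric
## [1] "numeric"
class(c(2.5, 1+0i)) # numeric mixed with complex
## [1] "complex"
class(c(1+0i, "a")) # anything mixed with character
## [1] "character"

flags <- c(TRUE, FALSE, TRUE)
n_true <- sum(flags) # TRUE counts as 1 and FALSE as 0
n_true
## [1] 2
```

Summing a logical vector, as in the last lines, is a common way to count how many elements are TRUE.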
Sometimes you may want to convert objects from one class to another in order to do calculations. Here is an example:
x <- c("abc",10)
x
## [1] "abc" "10"
#x[2] + 3 # this would give an error: "10" is stored as a character in the vector, so it cannot be used in addition
xn <- as.numeric(x) # convert x to numeric class
## Warning: NAs introduced by coercion
xn
## [1] NA 10
xi <- as.integer(x) # convert x to integer class
## Warning: NAs introduced by coercion
xi
## [1] NA 10
xl <- as.logical(x) # convert x to logical class
xl
## [1] NA NA
xc <- as.character(x) # convert x to character class
xc
## [1] "abc" "10"
as.numeric(x[2]) + 3 # convert to numeric class and do the calculation
## [1] 13
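When a conversion produces NA values, it is often useful to find out which inputs failed. Here is one possible way to do that, as a sketch; the vector input and the names nums, failed, and total are made up for this example.

```r
input <- c("abc", "10", "3.5")
nums <- suppressWarnings(as.numeric(input)) # silence the coercion warning on purpose
failed <- input[is.na(nums)] # entries that could not be converted
failed
## [1] "abc"
total <- sum(nums, na.rm = TRUE) # add up only the values that converted
total
## [1] 13.5
```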
Lists are a special type of vector that can contain elements of different classes. Here is an example:
x <- list("abc", 10) # create a list; here 10 stays a numeric object. If you used c() instead, you would get a vector and 10 would be converted to character
x
## [[1]]
## [1] "abc"
##
## [[2]]
## [1] 10
x[[2]] + 3 # use [[2]] to get the second element in a list
## [1] 13
x <- c("abc",10) # if you use a vector instead, 10 is stored as the character "10"
as.numeric(x[2]) + 3 # after converting it, the result is the same as in the previous statement
## [1] 13
help(print) # get help on a function if you are not sure how to use it
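The list example above uses double brackets to get an element. Here is a small sketch of how single and double brackets differ on a list; the names items, item_classes, and outcome are illustrative.

```r
items <- list("abc", 10)
class(items[2]) # single brackets return a list of length 1
## [1] "list"
class(items[[2]]) # double brackets return the element itself
## [1] "numeric"
item_classes <- sapply(items, class) # class of every element in the list
item_classes
## [1] "character" "numeric"
# items[2] + 3 fails, because a list cannot be used in arithmetic;
# tryCatch() is used here only to show the failure without stopping the script
outcome <- tryCatch(items[2] + 3, error = function(e) "error")
outcome
## [1] "error"
```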
Matrices are vectors with a dimension attribute. The dimension attribute is itself an integer vector of length 2 (number of rows, number of columns).
m <- matrix(nrow = 2, ncol = 3) # create a matrix with 2 rows and 3 columns
m
##      [,1] [,2] [,3]
## [1,]   NA   NA   NA
## [2,]   NA   NA   NA
dim(m) # get dimensions
## [1] 2 3
attributes(m) # get the attributes of m
## $dim
## [1] 2 3
m <- matrix(1:6, nrow = 2, ncol = 3) # create matrix with values
m
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
m[1,2] # get the element at first row and second column
## [1] 3
m <- 1:10 # create an integer vector
m
## [1] 1 2 3 4 5 6 7 8 9 10
dim(m) <- c(2,5) # create a matrix directly from a vector by adding a dimension attribute
m
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    3    5    7    9
## [2,]    2    4    6    8   10
x <- 1:3
y <- 10:12
cbind(x,y) # create matrix by column binding
##      x  y
## [1,] 1 10
## [2,] 2 11
## [3,] 3 12
rbind(x,y) # create matrix by row binding
##   [,1] [,2] [,3]
## x    1    2    3
## y   10   11   12
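The matrices above are filled column by column, which is the default for matrix(). Here is a short sketch of filling by row instead, using the byrow argument of matrix(); the names by_col and by_row are illustrative.

```r
by_col <- matrix(1:6, nrow = 2, ncol = 3) # default: fill down the columns
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE) # fill across the rows
by_row
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
by_row[2, ] # the whole second row
## [1] 4 5 6
by_col[, 2] # the whole second column
## [1] 3 4
```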
Factors are used to represent categorical data and can be unordered or ordered. One can think of a factor as an integer vector where each integer has a label. Factors are important in statistical modeling and are treated specially by modeling functions like lm() and glm(). Using factors with labels is better than using integers because factors are self-describing. Having a variable that has values “Male” and “Female” is better than a variable that has values 1 and 2.
x <- factor(c("yes", "yes", "no", "yes", "no"))
x
## [1] yes yes no yes no
## Levels: no yes
table(x)
## x
##  no yes
##   2   3
unclass(x) # See the underlying representation of factor
## [1] 2 2 1 2 1
## attr(,"levels")
## [1] "no" "yes"
x <- factor(c("yes", "yes", "no", "yes", "no"), levels = c("yes", "no")) # the order of the levels of a factor can be set using the levels argument to factor()
x
## [1] yes yes no yes no
## Levels: yes no
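The text above mentions that factors can also be ordered. Here is a minimal sketch of an ordered factor, using the ordered argument of factor(); the data and the name sizes are made up for illustration.

```r
sizes <- factor(c("small", "large", "medium", "small"),
                levels = c("small", "medium", "large"),
                ordered = TRUE)
levels(sizes) # the levels in the order that was given
## [1] "small"  "medium" "large"
as.integer(sizes) # the underlying integer codes
## [1] 1 3 2 1
sizes < "large" # comparisons follow the level order, not the alphabet
## [1]  TRUE FALSE  TRUE  TRUE
```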
Missing values are denoted by NA, or by NaN for undefined mathematical operations. is.na() is used to test objects for NA, and is.nan() is used to test for NaN. A NaN value is also NA, but the converse is not true.
x <- c(1, 2, NA, 10, 3) ## Create a vector with NAs in it
is.na(x)
## [1] FALSE FALSE TRUE FALSE FALSE
is.nan(x)
## [1] FALSE FALSE FALSE FALSE FALSE
x <- c(1, 2, NaN, NA, 4) ## Now create a vector with both NA and NaN values
is.na(x)
## [1] FALSE FALSE TRUE TRUE FALSE
is.nan(x)
## [1] FALSE FALSE TRUE FALSE FALSE
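Missing values also affect calculations. Here is a small sketch of how one might handle them, using the na.rm argument of mean(); the name v is illustrative.

```r
v <- c(1, 2, NA, 10, 3)
mean(v) # any NA in the input makes the result NA
## [1] NA
mean(v, na.rm = TRUE) # na.rm = TRUE drops missing values first
## [1] 4
sum(is.na(v)) # count the missing values
## [1] 1
v[!is.na(v)] # keep only the values that are not missing
## [1]  1  2 10  3
```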
Data frames are used to store tabular data in R. They are represented as a special type of list in which every element has to have the same length. Each element of the list can be thought of as a column, and the length of each element is the number of rows. Data frames have a special attribute called row.names, which holds information about each row of the data frame.
x <- data.frame(foo = 1:4, bar = c(T, T, F, F)) # create a data frame
x
##   foo   bar
## 1   1  TRUE
## 2   2  TRUE
## 3   3 FALSE
## 4   4 FALSE
nrow(x)
## [1] 4
ncol(x)
## [1] 2
data.matrix(x) # convert data frame to a matrix
##      foo bar
## [1,]   1   1
## [2,]   2   1
## [3,]   3   0
## [4,]   4   0
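Because a data frame is a special type of list, the usual list tools work on it. Here is a short sketch that checks this on the same data as above; the name tab is illustrative.

```r
tab <- data.frame(foo = 1:4, bar = c(TRUE, TRUE, FALSE, FALSE))
is.list(tab) # a data frame is a list
## [1] TRUE
length(tab) # the number of list elements, that is, columns
## [1] 2
sapply(tab, class) # each column keeps its own class
##       foo       bar
## "integer" "logical"
tab$foo # get one column by name
## [1] 1 2 3 4
row.names(tab) # the default row names are "1", "2", ...
## [1] "1" "2" "3" "4"
```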
R objects can have names, which is very useful for writing readable code and self-describing objects.
x <- 1:3
names(x)
## NULL
names(x) <- c("New York", "Seattle", "Los Angeles") # set the names for vector x
x
##    New York     Seattle Los Angeles
##           1           2           3
x <- list("Los Angeles" = 1, Boston = 2, London = 3) # list can also have names
x
## $`Los Angeles`
## [1] 1
##
## $Boston
## [1] 2
##
## $London
## [1] 3
m <- matrix(1:4, nrow = 2, ncol = 2)
dimnames(m) <- list(c("a", "b"), c("c", "d")) # matrices can have both row and column names
m
##   c d
## a 1 3
## b 2 4
colnames(m) <- c("h", "f") # set column names
rownames(m) <- c("x", "z") # set row names
m
##   h f
## x 1 3
## z 2 4
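Once objects have names, the names can be used to pick out elements instead of positions. Here is a minimal sketch of this; the names pop, cities, and named_m are illustrative.

```r
pop <- c("New York" = 1, Seattle = 2, "Los Angeles" = 3)
pop["Seattle"] # index a vector by name
## Seattle
##       2
cities <- list("Los Angeles" = 1, Boston = 2, London = 3)
cities$Boston # $ works for list names
## [1] 2
cities[["Los Angeles"]] # use [[ ]] when the name contains a space
## [1] 1
named_m <- matrix(1:4, nrow = 2, ncol = 2,
                  dimnames = list(c("a", "b"), c("c", "d")))
named_m["a", "d"] # row "a", column "d"
## [1] 3
```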