R basic data types

The five basic classes of objects in R

R has five basic classes of objects: character, numeric, integer, complex, and logical. Here are some examples:

c1 <- "Hello World"         # a character object
c2 <- "CS 133"              # also a character object
c12 <- paste(c1,c2,sep="")  # join two character objects; sep is optional and defaults to a single space
print(c1)

## [1] "Hello World"

print(c2)

## [1] "CS 133"

print(c12)

## [1] "Hello WorldCS 133"

class(c1)                   # get the class type of c1

## [1] "character"

class(c2)

## [1] "character"

class(c12)

## [1] "character"

subc <- substr(c12, start = 2, stop = 5)   # get substring from a character object
print(subc)                 # print the value of subc object

## [1] "ello"

print("Now we are going to show examples of numbers")

## [1] "Now we are going to show examples of numbers"

x <- 10                   # a plain number is a numeric (double) object by default
print(x)

## [1] 10

class(x)

## [1] "numeric"

x <- 10L                  # the L suffix creates an integer object
print(x)

## [1] 10

class(x)

## [1] "integer"
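
The difference between numeric and integer objects is easier to see with typeof(), which reports the underlying storage type. The following is a small sketch; the names n and i are illustrative and not part of the original examples.

```r
n <- 10            # no suffix: stored as a double
i <- 10L           # L suffix: stored as an integer
typeof(n)          # "double"
typeof(i)          # "integer"
is.numeric(i)      # TRUE: integers also count as numeric
n == i             # TRUE: the values compare equal
identical(n, i)    # FALSE: the storage types differ
```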

x <- 1/0                  # example of Inf for infinity
print(x)

## [1] Inf

x <- 0/0                  # example of undefined value
print(x)

## [1] NaN

1/Inf                     # example of using infinity

## [1] 0

y <- 1+2i                 # example of a complex object
y

## [1] 1+2i

class(y)                  # get the class information

## [1] "complex"

a1 <- 10
a2 <- 20
a3 <- 30
z1 <- a1>a2                # z1 is now a logical object
z1

## [1] FALSE

class(z1)

## [1] "logical"

z2 <- a1<a3 
z <- z1 & z2               # logical and operation
z

## [1] FALSE

z <- z1 | z2               # logical or operation
z

## [1] TRUE

z <- !z1                   # logical not operation
z

## [1] TRUE
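
The &, | and ! operators also work element by element on longer logical vectors. A brief sketch, using the illustrative names p and q (not from the original), with any() and all() to summarize the result:

```r
p <- c(TRUE, FALSE, TRUE)
q <- c(TRUE, TRUE, FALSE)
p & q         # TRUE FALSE FALSE: element-wise and
p | q         # TRUE TRUE TRUE: element-wise or
!p            # FALSE TRUE FALSE: element-wise not
any(p & q)    # TRUE: at least one element is TRUE
all(p | q)    # TRUE: every element is TRUE
```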

Examples of vectors

The c() function creates vectors by concatenating objects together.

x <- c(0.5, 0.6)        ## numeric vector
x

## [1] 0.5 0.6

x <- c(1L,3L)           ## integer vector
x

## [1] 1 3

x <- c(TRUE, FALSE)     ## logical vector 
x

## [1]  TRUE FALSE

x <- c(T, F)            ## logical vector; T and F are shorthand, but they can be reassigned, so TRUE and FALSE are safer
x

## [1]  TRUE FALSE

x <- c("a", "b", "c")   ## character vector 
x

## [1] "a" "b" "c"

x <- 2:13                 ## integer vector
x

##  [1]  2  3  4  5  6  7  8  9 10 11 12 13

x <- c(1+0i, 2+4i)        ## complex vector
x[0]                      # R indexes from 1, so index 0 selects nothing and returns an empty complex vector

## complex(0)

x                         # print the whole vector

## [1] 1+0i 2+4i

class(x)                 # show the class of variable x

## [1] "complex"

x[1]                   # print the first element of x

## [1] 1+0i

y <- c(1.7, "a")         ## mixing types coerces every element to one class; here the result is a character vector
y

## [1] "1.7" "a"

y <- c(TRUE, 2)         ## numeric vector: TRUE is coerced to 1
y

## [1] 1 2

y <- c("a", TRUE)       ## character vector
y

## [1] "a"    "TRUE"
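
The mixed-type examples above follow a fixed ordering: c() coerces everything to the most general type present, in the order logical, integer, numeric, complex, character. A minimal sketch of that ordering, using the illustrative names v1 to v4:

```r
v1 <- c(TRUE, 2L)      # logical + integer
v2 <- c(2L, 3.5)       # integer + double
v3 <- c(3.5, 1+2i)     # double + complex
v4 <- c(1+2i, "a")     # complex + character
sapply(list(v1, v2, v3, v4), class)
# "integer" "numeric" "complex" "character"
```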

x <- c(1,3,5,6,7)
y <- c(2,5,7,3,6)
x - y                   # vector operations

## [1] -1 -2 -2  3  1

x + y

## [1]  3  8 12  9 13

x[2:4]                  # get subset of vector

## [1] 3 5 6

Examples of object conversion

Sometimes you need to convert objects from one class to another before doing calculations. Here is an example:

x <- c("abc",10)
x

## [1] "abc" "10"

# x[2] + 3               # this gives an error: x[2] is the character string "10", so it cannot be added to a number

xn <- as.numeric(x)    # convert x to numeric class

## Warning: NAs introduced by coercion

xn

## [1] NA 10

xi <- as.integer(x)     # convert x to integer class

## Warning: NAs introduced by coercion

xi

## [1] NA 10

xl <- as.logical(x)      # convert x to logical class
xl

## [1] NA NA

xc <- as.character(x)  # convert x to character class
xc

## [1] "abc" "10"

as.numeric(x[2]) + 3 # convert to numeric class and do the calculation

## [1] 13
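
When a character vector may contain values that cannot be converted, one option is to convert everything and then keep only the elements that are not NA. This is a sketch under that assumption; the names raw, num and ok are illustrative. Note that suppressWarnings() hides every warning raised by the call, not only the coercion warning.

```r
raw <- c("abc", "10", "2.5")
num <- suppressWarnings(as.numeric(raw))   # "abc" becomes NA, without printing the warning
ok  <- !is.na(num)                         # TRUE where the conversion succeeded
raw[!ok]                                   # "abc": the values that failed to convert
num[ok] + 3                                # 13.0 and 5.5
```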

Examples of lists

Lists are a special type of vector that can contain elements of different classes. Here is an example:

x <- list("abc",10)         # create a list; 10 stays a numeric object. With c() instead, 10 would be coerced to character
x

## [[1]]
## [1] "abc"
## 
## [[2]]
## [1] 10

x[[2]] + 3                  # use [[2]] to get the second element in a list

## [1] 13

x <- c("abc",10)            # if you use a vector instead, 10 is stored as the character string "10"
as.numeric(x[2]) + 3        # so it must be converted back to numeric to get the same result

## [1] 13
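
Single and double brackets behave differently on lists: [ ] returns a smaller list, while [[ ]] and $ return the element itself. A short sketch, using an illustrative named list lst that is not part of the original examples:

```r
lst <- list(name = "abc", value = 10)
class(lst["value"])      # "list": single brackets keep the list wrapper
class(lst[["value"]])    # "numeric": double brackets extract the element
lst$value + 3            # 13: $ extracts an element by name
length(lst)              # 2
```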

help(print)                 # open the help page for a function if you are not sure how to use it

Examples of Matrices

Matrices are vectors with a dimension attribute. The dimension attribute is itself an integer vector of length 2 (number of rows, number of columns).

m <- matrix(nrow = 2, ncol = 3)          # create an empty matrix with 2 rows and 3 columns
m

##      [,1] [,2] [,3]
## [1,]   NA   NA   NA
## [2,]   NA   NA   NA

dim(m)               # get the dimensions

## [1] 2 3

attributes(m)        # get the attributes of m

## $dim
## [1] 2 3

m <- matrix(1:6, nrow = 2, ncol = 3)    # create matrix with values
m

##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6

m[1,2]              # get the element at first row and second column

## [1] 3

m <- 1:10           # start with a plain integer vector
m

##  [1]  1  2  3  4  5  6  7  8  9 10

dim(m) <- c(2,5)    # turn the vector into a matrix by setting its dimension attribute
m

##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    3    5    7    9
## [2,]    2    4    6    8   10

x <- 1:3
y <- 10:12
cbind(x,y)         # create matrix by column binding

##      x  y
## [1,] 1 10
## [2,] 2 11
## [3,] 3 12

rbind(x,y)         # create matrix by row binding

##   [,1] [,2] [,3]
## x    1    2    3
## y   10   11   12
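
All of the matrices above are filled column by column, which is the default for matrix(). The byrow argument changes this. A small sketch comparing the two fill orders; the names m_col and m_row are illustrative.

```r
m_col <- matrix(1:6, nrow = 2, ncol = 3)                 # default: fill column by column
m_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)   # fill row by row
m_col[1, ]    # first row: 1 3 5
m_row[1, ]    # first row: 1 2 3
dim(t(m_row)) # 3 2: t() transposes rows and columns
```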

Examples of Factors

Factors are used to represent categorical data and can be unordered or ordered. One can think of a factor as an integer vector where each integer has a label. Factors are important in statistical modeling and are treated specially by modeling functions like lm() and glm(). Using factors with labels is better than using integers because factors are self-describing: a variable with the values “Male” and “Female” is easier to read than one with the values 1 and 2.

x <- factor(c("yes", "yes", "no", "yes", "no"))   
x

## [1] yes yes no  yes no 
## Levels: no yes

table(x)

## x
##  no yes 
##   2   3

unclass(x)          # See the underlying representation of factor

## [1] 2 2 1 2 1
## attr(,"levels")
## [1] "no"  "yes"

x <- factor(c("yes", "yes", "no", "yes", "no"), levels = c("yes", "no"))     # The order of the levels of a factor can be set using the levels argument to factor()
x

## [1] yes yes no  yes no 
## Levels: yes no
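
The text notes that factors can also be ordered. A minimal sketch of an ordered factor, assuming an illustrative variable sizes with three levels (not from the original); ordered = TRUE makes comparisons between levels meaningful.

```r
sizes <- factor(c("low", "high", "medium", "low"),
                levels = c("low", "medium", "high"),
                ordered = TRUE)
is.ordered(sizes)         # TRUE
as.integer(sizes)         # 1 3 2 1: integer codes follow the level order
sizes < "high"            # TRUE FALSE TRUE TRUE
as.character(max(sizes))  # "high"
```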

Examples of Missing values

Missing values are denoted by NA, or by NaN for undefined mathematical operations. is.na() tests whether the elements of an object are NA, and is.nan() tests for NaN. A NaN value is also NA, but the converse is not true.

x <- c(1, 2, NA, 10, 3)    ## Create a vector with NAs in it
is.na(x)

## [1] FALSE FALSE  TRUE FALSE FALSE

is.nan(x)

## [1] FALSE FALSE FALSE FALSE FALSE

x <- c(1, 2, NaN, NA, 4)   ## Now create a vector with both NA and NaN values
is.na(x)

## [1] FALSE FALSE  TRUE  TRUE FALSE

is.nan(x)

## [1] FALSE FALSE  TRUE FALSE FALSE
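
Missing values propagate through most calculations, so a common next step is to drop or skip them. A short sketch using an illustrative vector v (not from the original); na.rm = TRUE is an argument of mean() and sum() that removes missing values before computing.

```r
v <- c(1, 2, NA, 10, 3)
mean(v)                  # NA: one missing value makes the result missing
mean(v, na.rm = TRUE)    # 4: NA is removed before averaging
sum(is.na(v))            # 1: count the missing values
v[!is.na(v)]             # 1 2 10 3: keep only the observed values
```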

Examples of Data Frames

Data frames are used to store tabular data in R. They are represented as a special type of list in which every element has the same length. Each element of the list can be thought of as a column, and the length of each element is the number of rows. Data frames have a special attribute called row.names, which holds information about each row of the data frame.

x <- data.frame(foo = 1:4, bar = c(T, T, F, F))       # create a data frame
x

##   foo   bar
## 1   1  TRUE
## 2   2  TRUE
## 3   3 FALSE
## 4   4 FALSE

nrow(x)

## [1] 4

ncol(x)

## [1] 2

data.matrix(x)                # convert data frame to a matrix

##      foo bar
## [1,]   1   1
## [2,]   2   1
## [3,]   3   0
## [4,]   4   0
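
Because a data frame is a list of equal-length columns, the list tools shown earlier apply to it as well. A brief sketch, using an illustrative data frame dat with the same contents as above:

```r
dat <- data.frame(foo = 1:4, bar = c(TRUE, TRUE, FALSE, FALSE))
is.list(dat)          # TRUE: a data frame is a list underneath
dat$foo               # 1 2 3 4: extract a column by name
dat[["bar"]]          # TRUE TRUE FALSE FALSE: same idea with double brackets
sapply(dat, class)    # foo is "integer", bar is "logical"
row.names(dat)        # "1" "2" "3" "4": the default row names
```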

Examples of names

R objects can have names, which is useful for writing readable code and self-describing objects.

x <- 1:3
names(x)

## NULL

names(x) <- c("New York", "Seattle", "Los Angeles")     # set the names for vector x
x

##    New York     Seattle Los Angeles 
##           1           2           3

x <- list("Los Angeles" = 1, Boston = 2, London = 3)    # list can also have names 
x

## $`Los Angeles`
## [1] 1
## 
## $Boston
## [1] 2
## 
## $London
## [1] 3

m <- matrix(1:4, nrow = 2, ncol = 2)
dimnames(m) <- list(c("a", "b"), c("c", "d"))        # Matrices can have both column and row names.
m

##   c d
## a 1 3
## b 2 4

colnames(m) <- c("h", "f")                           # set column names 
rownames(m) <- c("x", "z")                           # set row names
m

##   h f
## x 1 3
## z 2 4
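
Once an object has names, its elements can be selected by name instead of by position. A minimal sketch, with the illustrative names cities and mat standing in for the objects above:

```r
cities <- c("New York" = 1, Seattle = 2, "Los Angeles" = 3)
unname(cities["Seattle"])    # 2: select a vector element by name
names(cities)[cities == 3]   # "Los Angeles": look up the name for a value

mat <- matrix(1:4, nrow = 2, dimnames = list(c("a", "b"), c("c", "d")))
mat["a", "d"]                # 3: row "a", column "d"
```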