Data are the most basic ingredient of "data analysis". R supports a wide variety of data types, including scalars, vectors, matrices, data frames, and lists. In this tutorial, we will go over some commonly used data types and briefly cover the idea of an "object" at the end.
Scalars
In computer programming, a scalar is an atomic quantity that holds only one value at a time. Scalars are the most basic data types, and they can be used to construct more complex ones. Let's take a look at some common types of scalars with simple R commands.
Number
> x <- 1
> y <- 2.5
> class(x)
[1] "numeric"
> class(y)
[1] "numeric"
> class(x+y)
[1] "numeric"
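As a small aside, `class()` reports "numeric" for both x and y because R stores plain numbers as double-precision values by default. The sketch below uses the base R function `typeof()` and the `L` suffix to show the difference between a double and an integer; the variable name `i` is only illustrative.

```r
x <- 1        # stored as a double, even though it looks like a whole number
i <- 1L       # the L suffix asks R for an integer instead

typeof(x)     # "double"
typeof(i)     # "integer"
class(i)      # "integer"
class(x + i)  # "numeric": mixing an integer with a double gives a double
length(x)     # 1: a scalar in R is simply a vector of length one
```

In everyday analysis the distinction rarely matters, since R converts between the two as needed.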
Logical value
> m <- FALSE; n <- TRUE # two logical values
> m & n # AND
[1] FALSE
> m | n # OR
[1] TRUE
> !m # Negation
[1] TRUE
Character (string)
> a <- "1"; b <- "2.5" # Are they different from x and y we used earlier?
> a;b
[1] "1"
[1] "2.5"
> a+b # a+b=3.5?
Error in a + b : non-numeric argument to binary operator
> class(a)
[1] "character"
> class(as.numeric(a)) # but you can coerce this character into a number
[1] "numeric"
> class(as.character(x)) # vice versa
[1] "character"
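Coercion only succeeds when the text actually looks like a number. The following sketch, using only base R functions, shows the successful case, the failing case (which yields NA along with a warning), and `paste()` as the usual way to join strings. The names `total`, `bad`, and `joined` are made up for this illustration.

```r
a <- "1"; b <- "2.5"

# Convert both strings first, then add them as numbers
total <- as.numeric(a) + as.numeric(b)
total                                       # 3.5

# Text that is not a number cannot be converted; R returns NA and warns
bad <- suppressWarnings(as.numeric("two"))
is.na(bad)                                  # TRUE

# To join strings, use paste() rather than +
joined <- paste(a, b)
joined                                      # "1 2.5"
```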
Vector
A vector is a sequence of data elements of the same basic type.
> o <- c(1,2,5.3,6,-2,4) # Numeric vector
> p <- c("one","two","three","four","five","six") # Character vector
> q <- c(TRUE,TRUE,FALSE,TRUE,FALSE,TRUE) # Logical vector
> o;p;q
[1]  1.0  2.0  5.3  6.0 -2.0  4.0
[1] "one"   "two"   "three" "four"  "five"  "six"
[1]  TRUE  TRUE FALSE  TRUE FALSE  TRUE
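To see what "the same basic type" means in practice, here is a minimal sketch of what happens when you mix types inside `c()`. R does not raise an error; it quietly converts every element to the most flexible type present. The names `mixed` and `num_lgl` are illustrative only.

```r
# Numbers and logicals mixed with a string all become strings
mixed <- c(1, "two", TRUE)
mixed            # "1" "two" "TRUE"
class(mixed)     # "character"

# Logicals mixed with numbers become numbers (TRUE is 1, FALSE is 0)
num_lgl <- c(1, TRUE, FALSE)
num_lgl          # 1 1 0
class(num_lgl)   # "numeric"
```

This silent conversion is worth keeping in mind when a column of numbers unexpectedly turns into text.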
We talked about component extraction briefly in our first tutorial. Here are some other fun ways of doing that.
> o[q] # Logical vector can be used to extract vector components
[1] 1 2 6 4
> names(o) <- p # Give each component a name
> o
  one   two three  four  five   six
  1.0   2.0   5.3   6.0  -2.0   4.0
> o["three"] # Extract your components by "calling" their names
three
  5.3
Matrix
A matrix is a collection of data elements arranged in a two-dimensional rectangular layout. As with a vector, the components of a matrix must all be of the same basic type. The following is an example of a matrix with 4 rows and 3 columns.
> t <- matrix(
+ 1:12, # the data components (Don't type "+"!)
+ nrow=4, # number of rows
+ ncol=3, # number of columns
+ byrow = FALSE) # fill matrix by columns
> t # print the matrix
     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12
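For comparison, the sketch below builds the same 4-by-3 matrix twice, once filled by columns and once by rows, so you can see what the `byrow` argument changes. The names `by_col` and `by_row` are chosen just for this illustration.

```r
# Same data and shape, different fill order
by_col <- matrix(1:12, nrow = 4, ncol = 3, byrow = FALSE)
by_row <- matrix(1:12, nrow = 4, ncol = 3, byrow = TRUE)

by_row
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6
# [3,]    7    8    9
# [4,]   10   11   12

dim(by_row)     # 4 3: rows first, then columns
by_col[2, 3]    # 10
by_row[2, 3]    # 6
```

The shape is identical in both cases; only the position of each value differs.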
Similar to vectors, matrices also use [] to reference elements.
> t[2,3] # component at 2nd row and 3rd column
[1] 10
> t[,3] # 3rd column of matrix
[1]  9 10 11 12
> t[4,] # 4th row of matrix
[1]  4  8 12
> t[2:4,1:3] # rows 2,3,4 of columns 1,2,3
     [,1] [,2] [,3]
[1,]    2    6   10
[2,]    3    7   11
[3,]    4    8   12
Data Frame
A data frame is more general than a matrix in that different columns can have different basic data types. The data frame is the data type we will use most often in this class.
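As a minimal sketch of that idea, the example below builds a small data frame with one character, one numeric, and one logical column using base R's `data.frame()`. The column names and values are invented for illustration, and `stringsAsFactors = FALSE` is set explicitly so the text column stays as character in older versions of R as well.

```r
df <- data.frame(
  name  = c("one", "two", "three"),   # character column
  value = c(1, 2, 5.3),               # numeric column
  keep  = c(TRUE, TRUE, FALSE),       # logical column
  stringsAsFactors = FALSE
)

sapply(df, class)   # name: "character", value: "numeric", keep: "logical"
df$value            # extract one column by name
df[df$keep, ]       # keep only the rows where keep is TRUE
nrow(df); ncol(df)  # 3 rows and 3 columns
```

Each column is a vector of a single type, which is why the columns can differ from one another while the values within a column cannot.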