Homework 5

Working with matrices, lists, and data frames

Assign to the variable n_dims a single random integer between 3 and 10. Create a vector of consecutive integers from 1 to n_dims^2.

n_dims <- runif(n=1, min=3, max=10)
my_vec <- seq(from=1, to=n_dims^2)
head(my_vec)

## [1] 1 2 3 4 5 6

Use the sample function to randomly reshuffle these values.

new_vec <- sample(x=my_vec)
print(new_vec)

##  [1] 35 71 65 49 20 63 47  7 18 88 83 73 22 80 55 34  8 85 37  6 11 82 59 70 75
## [26]  2 84 67 15  4 42 50 10 87 54 40 69  5 36 26 12 14 16 25 13 68 77 38 33 52
## [51] 62 46  1 61 79 74 51 76 41 48 43  3 64 23 31 44 58 53 89 30 81 72 60 56  9
## [76] 86 27 39 29 78 28 21 19 24 57 66 45 32 17

Create a square matrix with these elements.

m <- matrix(data=new_vec, nrow=n_dims, ncol=n_dims)

## Warning in matrix(data = new_vec, nrow = n_dims, ncol = n_dims): data length
## [89] is not a sub-multiple or multiple of the number of rows [9]
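
The warning above most likely arises because runif() returns a continuous double rather than an integer: n_dims^2 is then not a whole number, seq() stops at 89, and matrix() truncates n_dims to 9, so the 89 values do not fit a 9 x 9 matrix. A minimal sketch of one way to draw a true integer, assuming sample() over 3:10 is acceptable for the assignment (the seed value is arbitrary and only there for reproducibility):

```r
set.seed(5)                         # arbitrary seed, an assumption for reproducibility
n_dims <- sample(3:10, size = 1)    # one integer drawn from 3, 4, ..., 10
my_vec <- seq_len(n_dims^2)         # consecutive integers 1..n_dims^2
m <- matrix(data = sample(my_vec), nrow = n_dims, ncol = n_dims)
dim(m)                              # n_dims rows and n_dims columns, no warning
```

With an integer n_dims the vector length is exactly n_dims^2, so every shuffled value lands in the matrix once.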

Print out the matrix.

print(m)

##       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
##  [1,]   35   88   37   67   69   68   79   23   60
##  [2,]   71   83    6   15    5   77   74   31   56
##  [3,]   65   73   11    4   36   38   51   44    9
##  [4,]   49   22   82   42   26   33   76   58   86
##  [5,]   20   80   59   50   12   52   41   53   27
##  [6,]   63   55   70   10   14   62   48   89   39
##  [7,]   47   34   75   87   16   46   43   30   29
##  [8,]    7    8    2   54   25    1    3   81   78
##  [9,]   18   85   84   40   13   61   64   72   28

Find a function in R to transpose the matrix. Print it out again and note how it has changed.

tm <- t(m)
print(tm)

##       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
##  [1,]   35   71   65   49   20   63   47    7   18
##  [2,]   88   83   73   22   80   55   34    8   85
##  [3,]   37    6   11   82   59   70   75    2   84
##  [4,]   67   15    4   42   50   10   87   54   40
##  [5,]   69    5   36   26   12   14   16   25   13
##  [6,]   68   77   38   33   52   62   46    1   61
##  [7,]   79   74   51   76   41   48   43    3   64
##  [8,]   23   31   44   58   53   89   30   81   72
##  [9,]   60   56    9   86   27   39   29   78   28
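
The change is that rows and columns swap: the first column of m becomes the first row of tm. A small sketch on a hypothetical 2 x 3 matrix (not part of the assignment) makes the swap easier to see than on a 9 x 9 one:

```r
a <- matrix(1:6, nrow = 2, ncol = 3)   # small 2 x 3 example matrix
ta <- t(a)
dim(a)                 # 2 3
dim(ta)                # 3 2
a[2, 3] == ta[3, 2]    # TRUE: element (i, j) moves to (j, i)
identical(t(ta), a)    # TRUE: transposing twice restores the original
```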

Calculate the sum and the mean of the elements in the first row and then the last row.

sum(tm[1,])

## [1] 375

mean(tm[1,])

## [1] 41.66667

sum(tm[n_dims,])

## [1] 412

mean(tm[n_dims,])

## [1] 45.77778
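
The same four numbers can be obtained in two calls with rowSums() and rowMeans(). A sketch on a hypothetical 3 x 3 matrix, assuming only the first and last rows are of interest:

```r
a <- matrix(1:9, nrow = 3, ncol = 3)   # filled column by column
ends <- c(1, nrow(a))                  # indices of the first and last row
rowSums(a[ends, ])                     # 12 18
rowMeans(a[ends, ])                    # 4 6
```

Using nrow(a) for the last index keeps the code valid when the matrix size changes between runs.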

Read about the eigen() function and use it on your matrix.

eigen(tm)

## eigen() decomposition
## $values
## [1] 410.968762+ 0.00000i  78.638010+ 0.00000i -29.374061+56.10170i
## [4] -29.374061-56.10170i -42.236126+16.28036i -42.236126-16.28036i
## [7]  44.160076+ 0.00000i   3.226763+28.19350i   3.226763-28.19350i
## 
## $vectors
##                [,1]            [,2]                      [,3]
##  [1,] -0.3083029+0i -0.331518117+0i -0.095184759-0.394406740i
##  [2,] -0.4085802+0i -0.222988768+0i -0.099050261-0.006346444i
##  [3,] -0.3227800+0i  0.113563988+0i  0.612216669+0.000000000i
##  [4,] -0.2897126+0i  0.264464565+0i -0.173168599+0.304224321i
##  [5,] -0.1686343+0i  0.009174565+0i -0.388142243-0.009026475i
##  [6,] -0.3473150+0i -0.383465189+0i  0.057626640+0.066986131i
##  [7,] -0.3783007+0i -0.152615169+0i  0.152772449-0.036532821i
##  [8,] -0.3819648+0i  0.634544372+0i  0.108707705-0.239735173i
##  [9,] -0.3343911+0i  0.429494806+0i  0.009398433+0.274929692i
##                            [,4]                    [,5]                    [,6]
##  [1,] -0.095184759+0.394406740i  0.36096188-0.19511562i  0.36096188+0.19511562i
##  [2,] -0.099050261+0.006346444i -0.25253554+0.14211319i -0.25253554-0.14211319i
##  [3,]  0.612216669+0.000000000i  0.14072626+0.03056449i  0.14072626-0.03056449i
##  [4,] -0.173168599-0.304224321i -0.35677177+0.31040641i -0.35677177-0.31040641i
##  [5,] -0.388142243+0.009026475i -0.52838925+0.00000000i -0.52838925+0.00000000i
##  [6,]  0.057626640-0.066986131i -0.01126115+0.05340200i -0.01126115-0.05340200i
##  [7,]  0.152772449+0.036532821i  0.10678814-0.14560636i  0.10678814+0.14560636i
##  [8,]  0.108707705+0.239735173i  0.10154324-0.06881902i  0.10154324+0.06881902i
##  [9,]  0.009398433-0.274929692i  0.39495995-0.13217635i  0.39495995+0.13217635i
##                 [,7]                     [,8]                     [,9]
##  [1,] -0.25231262+0i -0.058467900+0.25089411i -0.058467900-0.25089411i
##  [2,] -0.31075002+0i -0.077136271-0.36692498i -0.077136271+0.36692498i
##  [3,]  0.39025147+0i  0.115718719+0.14092596i  0.115718719-0.14092596i
##  [4,]  0.47495975+0i  0.188473853-0.07468343i  0.188473853+0.07468343i
##  [5,]  0.15261269+0i  0.673079258+0.00000000i  0.673079258+0.00000000i
##  [6,] -0.62646779+0i -0.216456507+0.20283402i -0.216456507-0.20283402i
##  [7,]  0.08053782+0i -0.235355155+0.03556831i -0.235355155-0.03556831i
##  [8,]  0.11292842+0i  0.002161857-0.02502693i  0.002161857+0.02502693i
##  [9,]  0.16400642+0i -0.293021886-0.18579020i -0.293021886+0.18579020i

Look carefully at the elements of values and vectors in the output. What kind of numbers are these? Dig in with the typeof() function to figure out their type.

typeof(eigen(tm)$values)

## [1] "complex"

typeof(eigen(tm)$vectors)

## [1] "complex"

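Both components are of type complex. The matrix is real but not symmetric, so its eigenvalues and eigenvectors can include complex-conjugate pairs, as seen in the output above.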
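
For comparison, a symmetric real matrix always has real eigenvalues, and eigen() then returns doubles. A small sketch with two hand-picked 2 x 2 matrices (both are illustrative, not part of the assignment):

```r
sym <- matrix(c(2, 1, 1, 2), nrow = 2)    # symmetric matrix
rot <- matrix(c(0, 1, -1, 0), nrow = 2)   # 90-degree rotation, not symmetric
e_sym <- eigen(sym)                       # store the result once, then inspect it
e_rot <- eigen(rot)
typeof(e_sym$values)   # "double"
typeof(e_rot$values)   # "complex"
e_sym$values           # 3 1
```

Storing the decomposition in a variable also avoids calling eigen() twice when both $values and $vectors are inspected.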

If you have set your code up properly, you should be able to re-run it and create a matrix of a different size, because n_dims will change.
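
Because n_dims comes from a random draw, each run gives a different result unless the random number generator is seeded. A short sketch, assuming reproducible runs are wanted while debugging (the seed value 42 is arbitrary):

```r
set.seed(42)
first_draw <- sample(3:10, size = 1)
set.seed(42)
second_draw <- sample(3:10, size = 1)
identical(first_draw, second_draw)   # TRUE: same seed, same draw
```

Removing the set.seed() calls restores the run-to-run variation the exercise asks for.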

Create a list with the following named elements:

my_matrix, which is a 4 x 4 matrix filled with random uniform values.
my_logical, which is a 100-element vector of TRUE or FALSE values. Do this efficiently by setting up a vector of random values and then applying an inequality to it.
my_letters, which is a 26-element vector of all the lower-case letters in random order.

my_matrix <- matrix(data=runif(16), nrow=4, ncol=4)
my_logical <- runif(100)<0.5
my_letters <- sample(letters)
my_list <- list(my_matrix=my_matrix, my_logical=my_logical, my_letters=my_letters)
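
Giving the list elements names lets them be retrieved by name as well as by position. A minimal, self-contained sketch of a named list and the two common ways to reach into it:

```r
my_list <- list(
  my_matrix  = matrix(data = runif(16), nrow = 4, ncol = 4),
  my_logical = runif(100) < 0.5,     # the inequality turns uniforms into TRUE/FALSE
  my_letters = sample(letters)
)
names(my_list)               # "my_matrix" "my_logical" "my_letters"
my_list$my_matrix[2, 2]      # element [2,2] of the matrix, by name
my_list[["my_letters"]][2]   # second letter, by name
```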

Then, complete the following steps:

Create a new list, which has the element [2,2] from the matrix, the second element of the logical vector, and the second element of the letters vector.

new_list <- list(my_matrix[2,2], my_logical[2], my_letters[2])

Use the typeof() function to confirm the underlying data types of each component in this list.

typeof(new_list[[1]])

## [1] "double"

typeof(new_list[[2]])

## [1] "logical"

typeof(new_list[[3]])

## [1] "character"

Combine the underlying elements from the new list into a single atomic vector with the c() function. What is the data type of this vector?

atomic_vector <- c(new_list[[1]], new_list[[2]], new_list[[3]])
print(atomic_vector)

## [1] "0.994857588084415" "FALSE"             "a"

typeof(atomic_vector)

## [1] "character"
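
The vector is of type character because an atomic vector can hold only one type, so c() coerces every element to the most general type present (logical, then integer, then double, then character). A short sketch of that ordering, using made-up values:

```r
typeof(c(TRUE, 1L))      # "integer"
typeof(c(1L, 2.5))       # "double"
typeof(c(2.5, "a"))      # "character"
mixed <- c(0.5, FALSE, "a")
mixed                    # "0.5" "FALSE" "a"
```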

Create a data frame with the two variables (= columns) and 26 cases (= rows) below:

Call the first variable my_unis and fill it with 26 random uniform values from 0 to 10
Call the second variable my_letters and fill it with 26 capital letters in random order.

my_unis <- runif(n=26, min=0, max=10)
my_letters <- sample(LETTERS)
data_frame <- data.frame(my_unis, my_letters)
print(data_frame)

##      my_unis my_letters
## 1  7.8547594          K
## 2  0.9796932          W
## 3  5.3713267          U
## 4  5.6226966          E
## 5  8.8405213          L
## 6  6.8492564          P
## 7  3.2006808          H
## 8  8.6874164          X
## 9  3.5056613          V
## 10 4.9680235          M
## 11 8.9918071          D
## 12 5.2723138          C
## 13 1.2043848          N
## 14 6.3615537          Y
## 15 8.2470004          J
## 16 3.0854342          B
## 17 7.7935806          S
## 18 0.7072426          O
## 19 3.8622090          I
## 20 6.1442755          Q
## 21 7.0761349          F
## 22 2.6729233          R
## 23 6.5265862          G
## 24 2.1404349          Z
## 25 8.8508021          T
## 26 6.7541962          A

For the first variable, use a single line of code in R to select 4 random rows and replace the numerical values in those rows with NA.

data_frame[sample(x=nrow(data_frame), size=4), 1] <- NA
print(data_frame)

##      my_unis my_letters
## 1  7.8547594          K
## 2  0.9796932          W
## 3  5.3713267          U
## 4  5.6226966          E
## 5  8.8405213          L
## 6  6.8492564          P
## 7  3.2006808          H
## 8  8.6874164          X
## 9  3.5056613          V
## 10 4.9680235          M
## 11 8.9918071          D
## 12 5.2723138          C
## 13        NA          N
## 14 6.3615537          Y
## 15 8.2470004          J
## 16 3.0854342          B
## 17        NA          S
## 18 0.7072426          O
## 19 3.8622090          I
## 20        NA          Q
## 21 7.0761349          F
## 22 2.6729233          R
## 23 6.5265862          G
## 24 2.1404349          Z
## 25 8.8508021          T
## 26        NA          A
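
sample() draws without replacement by default, which guarantees four distinct rows; with replace=TRUE the same row can be drawn twice, leaving fewer than four NA values. A sketch on a freshly built data frame (the names df and rows and the seed are illustrative assumptions):

```r
set.seed(1)   # arbitrary seed
df <- data.frame(my_unis = runif(n = 26, min = 0, max = 10),
                 my_letters = sample(LETTERS))
rows <- sample(x = nrow(df), size = 4)   # replace = FALSE by default
df[rows, "my_unis"] <- NA
anyDuplicated(rows)        # 0: no row drawn twice
sum(is.na(df$my_unis))     # 4
```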

For the first variable, write a single line of R code to identify which rows have the missing values.

which(is.na(data_frame$my_unis))

## [1] 13 17 20 26
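
Checking a single column with is.na() and checking whole rows with complete.cases() give the same answer here only because the second column has no missing values. A sketch with a hypothetical data frame where they differ:

```r
df <- data.frame(x = c(1, NA, 3, NA), y = c("a", "b", NA, "d"))
which(is.na(df$x))           # 2 4: rows missing in column x only
which(!complete.cases(df))   # 2 3 4: rows missing in any column
```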

Re-order the entire data frame to arrange the second variable in alphabetical order.

data_frame <- data_frame[order(data_frame$my_letters),]
print(data_frame)

##      my_unis my_letters
## 26        NA          A
## 16 3.0854342          B
## 12 5.2723138          C
## 11 8.9918071          D
## 4  5.6226966          E
## 21 7.0761349          F
## 23 6.5265862          G
## 7  3.2006808          H
## 19 3.8622090          I
## 15 8.2470004          J
## 1  7.8547594          K
## 5  8.8405213          L
## 10 4.9680235          M
## 13        NA          N
## 18 0.7072426          O
## 6  6.8492564          P
## 20        NA          Q
## 22 2.6729233          R
## 17        NA          S
## 25 8.8508021          T
## 3  5.3713267          U
## 9  3.5056613          V
## 2  0.9796932          W
## 8  8.6874164          X
## 14 6.3615537          Y
## 24 2.1404349          Z
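
order() returns the row positions that would sort the column, and indexing with them moves whole rows, which is why the original row labels (26, 16, 12, ...) appear shuffled in the output above. A small sketch with made-up data:

```r
df <- data.frame(val = c(10, 20, 30), key = c("C", "A", "B"))
idx <- order(df$key)       # 2 3 1: positions of "A", "B", "C"
sorted <- df[idx, ]
sorted$val                 # 20 30 10: values travel with their keys
rownames(sorted)           # "2" "3" "1": original row labels are kept
rownames(sorted) <- NULL   # optional: renumber the rows 1..n
```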

Calculate the column mean for the first variable.

mean(data_frame$my_unis, na.rm=TRUE)

## [1] 5.439749
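
The na.rm=TRUE argument matters because mean() returns NA as soon as any element is missing. A three-value sketch:

```r
x <- c(1, NA, 3)
mean(x)                 # NA
mean(x, na.rm = TRUE)   # 2
```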

Audrey Commerford

2025-02-12