Basic R

12. Now it’s your turn

Task A

  1. Create a numeric vector called participants, containing the integers from 1 to 20. Do this twice: once using c() and once using seq().
  2. Create a character vector called conditions, of length 20, containing alternating values of “a” and “b” (“a”, “b”, “a”, “b”, “a”, …), using rep().
  3. Create a vector called first_half, containing only the first half of the participants vector’s values.
  4. Check that both participants and conditions have length 20, and that first_half has length 10, using length().
  5. Replace the fifth element of conditions with a missing value (NA). Print conditions after the change.
  6. Create a vector called participants_cond, by pasting together (using paste()) participants and conditions, separated by _. So, the first element should be “1_a”.
  7. Check that the 5th element of participants_cond is still a missing value.
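# task A1
# option 1: list every value explicitly with c()
participants = c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20)
# option 2: generate the same sequence with seq() (this overwrites option 1)
participants = seq(from=1, to=20)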
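
The two assignments above hold the same values, but R does not necessarily store them as the same type. The sketch below checks this and also shows the shorthands 1:20 and seq_len(20), which are not part of the original solution and are included only for comparison.

```r
# c() with plain number literals stores doubles
participants_c = c(1, 2, 3, 4, 5)
typeof(participants_c)        # "double"

# the colon operator and seq_len() store integers
participants_colon = 1:20
participants_len = seq_len(20)
typeof(participants_colon)    # "integer"

# the values agree with the seq() call used in the solution
all(participants_colon == seq(from=1, to=20))
```

For this exercise the difference does not matter, since both types count as numeric.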

# task A2
x = c("a", "b")
conditions = rep(x, times=10, each=1)
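
As a side note on rep(): the each=1 argument above is the default, so it can be left out. A minimal sketch of two other ways to get the same alternating vector follows; the names conditions_times and conditions_length are made up for this illustration.

```r
x = c("a", "b")

# repeat the whole pattern 10 times
conditions_times = rep(x, times=10)

# or ask for a total length and let rep() recycle the pattern
conditions_length = rep(x, length.out=20)

identical(conditions_times, conditions_length)

# for contrast, each= repeats every element before moving to the next one
rep(x, each=2)    # "a" "a" "b" "b"
```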

# task A3
first_half = participants[1:(length(participants)/2)]
print(first_half)
##  [1]  1  2  3  4  5  6  7  8  9 10
# task A4
length(participants) == 20 && length(conditions) == 20
## [1] TRUE
length(first_half) == 10
## [1] TRUE
# task A5
conditions[5] = NA
print(conditions)
##  [1] "a" "b" "a" "b" NA  "b" "a" "b" "a" "b" "a" "b" "a" "b" "a" "b" "a" "b" "a" "b"
# task A6
participants_cond = paste(participants, conditions, sep="_")
print(participants_cond)
##  [1] "1_a"  "2_b"  "3_a"  "4_b"  "5_NA" "6_b"  "7_a"  "8_b"  "9_a"  "10_b" "11_a" "12_b" "13_a" "14_b" "15_a" "16_b"
## [17] "17_a" "18_b" "19_a" "20_b"
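# task A7
# paste() turned the NA into text, so element 5 is now the string "5_NA"
is.na(participants_cond[5])
## [1] FALSE
# restore the missing value by hand
participants_cond[5] = NA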
print(participants_cond)
##  [1] "1_a"  "2_b"  "3_a"  "4_b"  NA     "6_b"  "7_a"  "8_b"  "9_a"  "10_b" "11_a" "12_b" "13_a" "14_b" "15_a" "16_b"
## [17] "17_a" "18_b" "19_a" "20_b"
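
The result of A7 follows from how paste() works: it converts every input to character first, so NA becomes the text "NA". One possible way to keep the missing value without patching it afterwards is to wrap the call in ifelse(). This is a sketch of an alternative, not the solution the exercise asks for.

```r
participants = 1:20
conditions = rep(c("a", "b"), times=10)
conditions[5] = NA

# paste only where the condition is known, otherwise keep NA
participants_cond = ifelse(is.na(conditions),
                           NA,
                           paste(participants, conditions, sep="_"))

is.na(participants_cond[5])    # TRUE
participants_cond[1]           # "1_a"
```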

Task B

  1. Create a data frame called my_data whose columns are the vectors participants, conditions and participants_cond that you created before. Print the data frame to check that it worked.
  2. Add a column called response_times made of 20 samples from the normal distribution, with mean .8 and standard deviation 1. Print the data frame to check that it worked.
  3. Select the values of the response_times column that are negative and set them to 0. Print the data frame to check that it worked.
  4. Create a new column, called log_response_times, made of the logarithm of response_times.
  5. Add a column called correct_response made of 20 samples from the binomial distribution, with size 1 and probability of success .65. Print the data frame to check that it worked.
  6. Calculate the mean response time and the proportion of correct responses.
  7. Create two data frames, data_correct and data_incorrect, containing respectively the subset of my_data where correct_response is 1 and the subset where correct_response is 0. Print both data frames to check that it worked.
# task B1
my_data = data.frame(participants, conditions, participants_cond)
print(my_data)
##    participants conditions participants_cond
## 1             1          a               1_a
## 2             2          b               2_b
## 3             3          a               3_a
## 4             4          b               4_b
## 5             5       <NA>              <NA>
## 6             6          b               6_b
## 7             7          a               7_a
## 8             8          b               8_b
## 9             9          a               9_a
## 10           10          b              10_b
## 11           11          a              11_a
## 12           12          b              12_b
## 13           13          a              13_a
## 14           14          b              14_b
## 15           15          a              15_a
## 16           16          b              16_b
## 17           17          a              17_a
## 18           18          b              18_b
## 19           19          a              19_a
## 20           20          b              20_b
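
Whether the text columns stay character vectors depends on the R version: before R 4.0.0, data.frame() converted strings to factors by default. A minimal sketch of how to make the behaviour explicit and inspect the column types, using a shortened, made-up data frame called my_data_small:

```r
participants = 1:4
conditions = c("a", "b", "a", "b")

# stringsAsFactors=FALSE keeps text columns as character in every R version
my_data_small = data.frame(participants, conditions, stringsAsFactors=FALSE)

# str() lists each column with its type
str(my_data_small)
is.character(my_data_small$conditions)
```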
# task B2
my_data$response_times = rnorm(n=20, mean=.8, sd=1)
print(my_data)
##    participants conditions participants_cond response_times
## 1             1          a               1_a     0.67497731
## 2             2          b               2_b     1.11035090
## 3             3          a               3_a     1.01776449
## 4             4          b               4_b    -1.13302187
## 5             5       <NA>              <NA>    -0.49996216
## 6             6          b               6_b     0.70169411
## 7             7          a               7_a     1.05788834
## 8             8          b               8_b    -0.02429025
## 9             9          a               9_a     2.53628578
## 10           10          b              10_b     0.16849823
## 11           11          a              11_a     1.19261085
## 12           12          b              12_b    -0.84618119
## 13           13          a              13_a     1.37031098
## 14           14          b              14_b    -0.68834881
## 15           15          a              15_a     1.71053945
## 16           16          b              16_b     3.06906284
## 17           17          a              17_a     0.32712329
## 18           18          b              18_b    -0.24271330
## 19           19          a              19_a     1.47521986
## 20           20          b              20_b     0.53879215
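
Because rnorm() draws random numbers, running the line above again gives different values from the ones printed here. If you want the same draws every time, one common approach is to fix the seed first. The seed value 123 below is an arbitrary choice for illustration.

```r
# same seed, same call, same numbers
set.seed(123)
first_draw = rnorm(n=20, mean=.8, sd=1)

set.seed(123)
second_draw = rnorm(n=20, mean=.8, sd=1)

identical(first_draw, second_draw)
```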
# task B3
my_data[my_data$response_times < 0, "response_times"] = 0
print(my_data)
##    participants conditions participants_cond response_times
## 1             1          a               1_a      0.6749773
## 2             2          b               2_b      1.1103509
## 3             3          a               3_a      1.0177645
## 4             4          b               4_b      0.0000000
## 5             5       <NA>              <NA>      0.0000000
## 6             6          b               6_b      0.7016941
## 7             7          a               7_a      1.0578883
## 8             8          b               8_b      0.0000000
## 9             9          a               9_a      2.5362858
## 10           10          b              10_b      0.1684982
## 11           11          a              11_a      1.1926109
## 12           12          b              12_b      0.0000000
## 13           13          a              13_a      1.3703110
## 14           14          b              14_b      0.0000000
## 15           15          a              15_a      1.7105394
## 16           16          b              16_b      3.0690628
## 17           17          a              17_a      0.3271233
## 18           18          b              18_b      0.0000000
## 19           19          a              19_a      1.4752199
## 20           20          b              20_b      0.5387921
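
The logical indexing used in B3 can also be written with pmax(), which takes the element-wise maximum of its arguments. A small sketch on a made-up vector, offered as an alternative rather than the expected answer:

```r
response_times = c(0.5, -1.2, 2.0, -0.1)

# every value below 0 is replaced by 0, the rest are left alone
clipped = pmax(response_times, 0)
print(clipped)

# the same result with logical indexing, as in the solution
indexed = response_times
indexed[indexed < 0] = 0
all(clipped == indexed)
```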
# task B4
# log(0) is -Inf, so the response times that were set to 0 in B3 become -Inf here
my_data$log_response_times = log(my_data$response_times)
print(my_data)
##    participants conditions participants_cond response_times log_response_times
## 1             1          a               1_a      0.6749773        -0.39307620
## 2             2          b               2_b      1.1103509         0.10467609
## 3             3          a               3_a      1.0177645         0.01760855
## 4             4          b               4_b      0.0000000               -Inf
## 5             5       <NA>              <NA>      0.0000000               -Inf
## 6             6          b               6_b      0.7016941        -0.35425771
## 7             7          a               7_a      1.0578883         0.05627479
## 8             8          b               8_b      0.0000000               -Inf
## 9             9          a               9_a      2.5362858         0.93070072
## 10           10          b              10_b      0.1684982        -1.78083004
## 11           11          a              11_a      1.1926109         0.17614490
## 12           12          b              12_b      0.0000000               -Inf
## 13           13          a              13_a      1.3703110         0.31503771
## 14           14          b              14_b      0.0000000               -Inf
## 15           15          a              15_a      1.7105394         0.53680879
## 16           16          b              16_b      3.0690628         1.12137225
## 17           17          a              17_a      0.3271233        -1.11741814
## 18           18          b              18_b      0.0000000               -Inf
## 19           19          a              19_a      1.4752199         0.38880704
## 20           20          b              20_b      0.5387921        -0.61842541
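
The -Inf entries appear because the logarithm of 0 is negative infinity, and they carry through to any summary of the column. The sketch below, on a made-up vector, shows the effect and one possible way to summarise only the finite values with is.finite().

```r
response_times = c(0.67, 0, 1.11, 0)
log_rt = log(response_times)

# a single -Inf drags the mean to -Inf
mean(log_rt)

# keep only the finite values before summarising
finite_log_rt = log_rt[is.finite(log_rt)]
mean(finite_log_rt)
```

Whether dropping these values is appropriate depends on the analysis; here it only illustrates the mechanics.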
# task B5
my_data$correct_response = rbinom(n=20, size=1, prob=.65)
print(my_data)
##    participants conditions participants_cond response_times log_response_times correct_response
## 1             1          a               1_a      0.6749773        -0.39307620                1
## 2             2          b               2_b      1.1103509         0.10467609                1
## 3             3          a               3_a      1.0177645         0.01760855                1
## 4             4          b               4_b      0.0000000               -Inf                0
## 5             5       <NA>              <NA>      0.0000000               -Inf                1
## 6             6          b               6_b      0.7016941        -0.35425771                1
## 7             7          a               7_a      1.0578883         0.05627479                1
## 8             8          b               8_b      0.0000000               -Inf                0
## 9             9          a               9_a      2.5362858         0.93070072                0
## 10           10          b              10_b      0.1684982        -1.78083004                0
## 11           11          a              11_a      1.1926109         0.17614490                1
## 12           12          b              12_b      0.0000000               -Inf                1
## 13           13          a              13_a      1.3703110         0.31503771                1
## 14           14          b              14_b      0.0000000               -Inf                1
## 15           15          a              15_a      1.7105394         0.53680879                1
## 16           16          b              16_b      3.0690628         1.12137225                1
## 17           17          a              17_a      0.3271233        -1.11741814                1
## 18           18          b              18_b      0.0000000               -Inf                0
## 19           19          a              19_a      1.4752199         0.38880704                1
## 20           20          b              20_b      0.5387921        -0.61842541                1
# task B6
mean(my_data$response_times)
## [1] 0.8475559
mean(my_data$correct_response)
## [1] 0.75
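
The mean of a column of 0s and 1s is the proportion of 1s, which is why mean() answers the question about correct responses. A short check, using a made-up vector with 15 correct and 5 incorrect responses to match the counts printed above:

```r
correct_response = c(rep(1, 15), rep(0, 5))

# mean of a 0/1 vector
mean(correct_response)

# the same number computed as count of 1s divided by number of trials
sum(correct_response) / length(correct_response)
```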
# task B7
data_correct = my_data[my_data$correct_response == 1,]
data_incorrect = my_data[my_data$correct_response == 0,]
print(data_correct)
##    participants conditions participants_cond response_times log_response_times correct_response
## 1             1          a               1_a      0.6749773        -0.39307620                1
## 2             2          b               2_b      1.1103509         0.10467609                1
## 3             3          a               3_a      1.0177645         0.01760855                1
## 5             5       <NA>              <NA>      0.0000000               -Inf                1
## 6             6          b               6_b      0.7016941        -0.35425771                1
## 7             7          a               7_a      1.0578883         0.05627479                1
## 11           11          a              11_a      1.1926109         0.17614490                1
## 12           12          b              12_b      0.0000000               -Inf                1
## 13           13          a              13_a      1.3703110         0.31503771                1
## 14           14          b              14_b      0.0000000               -Inf                1
## 15           15          a              15_a      1.7105394         0.53680879                1
## 16           16          b              16_b      3.0690628         1.12137225                1
## 17           17          a              17_a      0.3271233        -1.11741814                1
## 19           19          a              19_a      1.4752199         0.38880704                1
## 20           20          b              20_b      0.5387921        -0.61842541                1
print(data_incorrect)
##    participants conditions participants_cond response_times log_response_times correct_response
## 4             4          b               4_b      0.0000000               -Inf                0
## 8             8          b               8_b      0.0000000               -Inf                0
## 9             9          a               9_a      2.5362858          0.9307007                0
## 10           10          b              10_b      0.1684982         -1.7808300                0
## 18           18          b              18_b      0.0000000               -Inf                0
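
The same split can be written with subset(), and a quick row count confirms that no row was lost or duplicated. This is a sketch on a shortened, made-up data frame called my_data_small, not a replacement for the solution above.

```r
my_data_small = data.frame(
  participants = 1:6,
  correct_response = c(1, 0, 1, 1, 0, 1)
)

# subset() evaluates the condition inside the data frame
data_correct = subset(my_data_small, correct_response == 1)
data_incorrect = subset(my_data_small, correct_response == 0)

# the two parts together should account for every row
nrow(data_correct) + nrow(data_incorrect) == nrow(my_data_small)
```

One difference to keep in mind: subset() drops rows where the condition is NA, whereas indexing with [ returns a row of NAs for them. With the 0/1 column used here there are no missing values, so both approaches give the same result.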