I want to analyze students' responses in RStudio. I have a data frame of answers for 100 students and 50 questions, and a data frame of correct answers. I used score.multiple.choice from the psych package, but with my data I get the error "Error in max.item - min.item : non-numeric argument to binary operator":
Student.tf <- score.multiple.choice(key_A, sudent_all,score=FALSE)
Error in max.item - min.item : non-numeric argument to binary operator
Is it possible to handle this problem?
**STUDENT ANSWER**
|id  |q1 |q2 |q3 |q4 |...|q50|
|1   |A  |B  |C  |D  |...|E  |
|2   |B  |B  |A  |E  |...|E  |
|... |...|...|...|...|...|...|
|99  |A  |B  |C  |D  |...|E  |
|100 |A  |B  |C  |D  |...|E  |
**KEY**
|id  |q1 |q2 |q3 |q4 |...|q50|
|1   |A  |B  |C  |D  |...|E  |
CodePudding user response:
There are a few problems with your approach. First, your data are of type character, while score.multiple.choice requires numeric values. Second, as its name suggests, score.multiple.choice scores multiple-choice items against a key vector of the correct choices. The whole thing can be solved without resorting to the psych package, as shown below.
For starters, I simulated the relevant data.
library(tidyverse)
nid = 100
nq = 50
sudent_all = tibble(
id = rep(1:nid, nq),
nqest = rep(paste0("q", 1:nq), each = nid),
answ = sample(c("A", "B", "C", "D", "E"), nid*nq, replace = TRUE)
) %>% mutate(answ = answ %>% fct_inorder()) %>%
pivot_wider(id_cols = id, names_from = nqest, values_from = answ)
key_A = sample(c("A", "B", "C", "D", "E"), nq, replace = TRUE) %>% fct_inorder()
output sudent_all
# A tibble: 100 x 51
id q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 q11 q12 q13 q14 q15 q16 q17 q18 q19 q20
<int> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct> <fct>
1 1 E A B A C E C E C E B D C E A D C C E E
2 2 B A E E A D A E B D C E D B A E E D B B
3 3 E D A B A A E A D C B D E E B D D E A A
4 4 B B B D B E B B C D D A E D E B E D E C
5 5 E C E E D A C E B B B A C A E A D A C E
6 6 B D A E E A D D C C C C A D E E E B A B
7 7 E A D E E D A C E D C C D E C C A C D E
8 8 B A E D D D D A A A E D D A A C E C C B
9 9 C B B E B B E A E E D A B D D E B C E D
10 10 C C B D E A A D A E E C B D A C B E C E
# ... with 90 more rows, and 30 more variables: q21 <fct>, q22 <fct>, q23 <fct>, q24 <fct>, q25 <fct>, q26 <fct>, q27 <fct>,
# q28 <fct>, q29 <fct>, q30 <fct>, q31 <fct>, q32 <fct>, q33 <fct>, q34 <fct>, q35 <fct>, q36 <fct>, q37 <fct>, q38 <fct>,
# q39 <fct>, q40 <fct>, q41 <fct>, q42 <fct>, q43 <fct>, q44 <fct>, q45 <fct>, q46 <fct>, q47 <fct>, q48 <fct>, q49 <fct>,
# q50 <fct>
output key_A
key_A
[1] A E E B E B B E A B C B B C C E D D D C B C A E B E B C E B D B A D C E A A E B E D A A B D C C A A
Levels: A E B C D
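Note that the simulated key_A above is a plain vector, whereas the key in the question is a one-row data frame with an id column. A minimal sketch of flattening such a data frame into a vector could look like this; key_df and key_vec are hypothetical names, and I assume the key is stored as character columns (only four questions are used to keep it short).

```r
# Hypothetical one-row key data frame shaped like the KEY table in the question
# (assumption: character columns, four questions only for brevity).
key_df <- data.frame(id = 1, q1 = "A", q2 = "B", q3 = "C", q4 = "D",
                     stringsAsFactors = FALSE)

# Drop the id column and flatten the remaining row into a named character vector.
key_vec <- unlist(key_df[1, setdiff(names(key_df), "id")])

print(key_vec)
```

The resulting vector keeps the question names, which makes it easy to check that the key and the answer columns are in the same order.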
Now all you need is two simple functions and one mutation.
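# Compare one student's answers with the key; returns a one-row tibble of TRUE/FALSE
fCheck = function(data, key) data %>%
pivot_longer(everything()) %>%
mutate(value = value==key) %>%
pivot_wider()
# Count the TRUE values, i.e. the number of correct answers
fSumCorr = function(data) data %>%
pivot_longer(everything()) %>% pull(value) %>% sum()
# Nest each student's answers, check them against the key, then total the score
sudent_all %>% group_by(id) %>%
nest() %>%
mutate(data = map(data, ~fCheck(.x, key_A))) %>%
mutate(sumCorr = map(data, ~fSumCorr(.x))) %>%
unnest(sumCorr)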
output
# A tibble: 100 x 3
# Groups: id [100]
id data sumCorr
<int> <list> <int>
1 1 <tibble [1 x 50]> 6
2 2 <tibble [1 x 50]> 11
3 3 <tibble [1 x 50]> 9
4 4 <tibble [1 x 50]> 10
5 5 <tibble [1 x 50]> 12
6 6 <tibble [1 x 50]> 8
7 7 <tibble [1 x 50]> 11
8 8 <tibble [1 x 50]> 10
9 9 <tibble [1 x 50]> 12
10 10 <tibble [1 x 50]> 8
# ... with 90 more rows
Note that with sudent_all %>% group_by(id) %>% nest() I collapse each student's responses into a single tibble:
# A tibble: 100 x 2
# Groups: id [100]
id data
<int> <list>
1 1 <tibble [1 x 50]>
2 2 <tibble [1 x 50]>
3 3 <tibble [1 x 50]>
4 4 <tibble [1 x 50]>
5 5 <tibble [1 x 50]>
6 6 <tibble [1 x 50]>
7 7 <tibble [1 x 50]>
8 8 <tibble [1 x 50]>
9 9 <tibble [1 x 50]>
10 10 <tibble [1 x 50]>
# ... with 90 more rows
Such a tibble is easy to modify further using the mutate and map functions.
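For comparison, the same per-student totals can also be sketched in base R without nesting. This is only a minimal sketch on a made-up three-student data set; answers, key, ans_mat, correct and scores are my own illustrative names, and I assume the answers are character columns named like the key.

```r
# Small made-up example: 3 students, 4 questions (not the simulated data above).
answers <- data.frame(
  id = 1:3,
  q1 = c("A", "B", "A"),
  q2 = c("B", "B", "C"),
  q3 = c("C", "A", "C"),
  q4 = c("D", "E", "D"),
  stringsAsFactors = FALSE
)
key <- c(q1 = "A", q2 = "B", q3 = "C", q4 = "D")

# Character matrix of the answers only, with columns in the same order as the key.
ans_mat <- as.matrix(answers[, names(key)])

# sweep() compares every column with the matching key entry;
# the result is a logical matrix (TRUE = correct answer).
correct <- sweep(ans_mat, 2, key, FUN = "==")

# Number of correct answers per student.
scores <- data.frame(id = answers$id, sumCorr = rowSums(correct))
print(scores)
```

Selecting the columns by names(key) guards against the answer columns being in a different order than the key.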
Update 1
I don't know anything about the psych package, but if converting the answers to numeric values is the problem, you can do it this way.
fConv = function(data, key) data %>%
pivot_longer(everything()) %>%
mutate(value = ifelse(value==key, 1, 0)) %>%
pivot_wider()
df = sudent_all %>% group_by(id) %>%
nest() %>%
mutate(data = map(data, ~fConv(.x, key_A)))
df$data[[1]]
output
# A tibble: 1 x 50
q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 q11 q12 q13 q14 q15 q16 q17 q18 q19 q20 q21
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 0 0 0 0 0 0 1 0 0 1 1 1 0 0 0 0 0 0 1 0 0
# ... with 29 more variables: q22 <dbl>, q23 <dbl>, q24 <dbl>, q25 <dbl>, q26 <dbl>, q27 <dbl>, q28 <dbl>, q29 <dbl>, q30 <dbl>,
# q31 <dbl>, q32 <dbl>, q33 <dbl>, q34 <dbl>, q35 <dbl>, q36 <dbl>, q37 <dbl>, q38 <dbl>, q39 <dbl>, q40 <dbl>, q41 <dbl>,
# q42 <dbl>, q43 <dbl>, q44 <dbl>, q45 <dbl>, q46 <dbl>, q47 <dbl>, q48 <dbl>, q49 <dbl>, q50 <dbl>
Update 2
To get one flat table with a 0/1 column per question, unnest the data column:
df = sudent_all %>% group_by(id) %>%
nest() %>%
mutate(data = map(data, ~fConv(.x, key_A))) %>%
unnest(data)
output
# A tibble: 100 x 51
# Groups: id [100]
id q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 q11 q12 q13
<int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 0 0 0 0 0 0 1 0 0 1 1 1 0
2 2 0 0 0 0 0 0 0 0 1 0 1 0 0
3 3 0 0 1 0 1 0 0 0 1 0 0 0 0
4 4 0 0 0 0 0 1 0 0 1 0 0 1 0
5 5 1 0 0 0 0 0 0 0 0 0 0 1 0
6 6 0 0 0 1 0 0 0 1 0 0 1 0 0
7 7 0 0 0 0 0 1 0 0 0 0 1 0 0
8 8 1 0 0 1 1 0 0 0 0 1 0 1 0
9 9 0 1 0 0 0 0 0 0 0 0 1 0 1
10 10 0 0 0 0 1 1 1 0 1 0 0 0 0
# ... with 90 more rows, and 37 more variables: q14 <dbl>, q15 <dbl>, q16 <dbl>,
# q17 <dbl>, q18 <dbl>, q19 <dbl>, q20 <dbl>, q21 <dbl>, q22 <dbl>, q23 <dbl>,
# q24 <dbl>, q25 <dbl>, q26 <dbl>, q27 <dbl>, q28 <dbl>, q29 <dbl>, q30 <dbl>,
# q31 <dbl>, q32 <dbl>, q33 <dbl>, q34 <dbl>, q35 <dbl>, q36 <dbl>, q37 <dbl>,
# q38 <dbl>, q39 <dbl>, q40 <dbl>, q41 <dbl>, q42 <dbl>, q43 <dbl>, q44 <dbl>,
# q45 <dbl>, q46 <dbl>, q47 <dbl>, q48 <dbl>, q49 <dbl>, q50 <dbl>
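If you would rather keep the letters as choices and only make them numeric, one possible approach is to recode A to E as 1 to 5 with match(). This is a sketch under my own assumptions: answers, key, choices, answers_num and key_num are illustrative names, the data set is made up, and I have not run the psych call mentioned in the final comment.

```r
# Made-up answers for 3 students and 4 questions (character columns).
answers <- data.frame(
  q1 = c("A", "B", "A"),
  q2 = c("B", "B", "C"),
  q3 = c("C", "A", "C"),
  q4 = c("D", "E", "D"),
  stringsAsFactors = FALSE
)
key <- c("A", "B", "C", "D")
choices <- c("A", "B", "C", "D", "E")

# match() returns the position of each letter in `choices`, so A -> 1, ..., E -> 5.
answers_num <- as.data.frame(lapply(answers, match, table = choices))
key_num <- match(key, choices)

str(answers_num)
print(key_num)

# With numeric inputs, a call such as
#   psych::score.multiple.choice(key_num, answers_num, score = FALSE)
# should no longer fail on non-numeric values (not run here; requires psych).
```

Any answer that is not one of the listed choices becomes NA, which makes stray values easy to spot before scoring.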