I am cleaning data from a numeracy test.
Some test items are multiple-choice items, where students choose one of the options (e.g. a, b, or c).
In the dataset, I created new variables by converting the items into binary variables.
For example, if the correct answer for Item1 is a, I created new_Item1 by recoding a = 1 and everything else = 0 (NA is left as it is).
I would like to double-check that the recoding was done correctly by cross-tabulating the original and new variables. Doing this for one pair only (in this case Item1 and new_Item1) is easy, but since I have a lot of these multiple-choice items, it's not efficient to write code to tabulate each pair one by one.
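For reference, the recoding described above could be sketched roughly as follows. The answer key and the toy data frame dat are assumptions made up for illustration, not the real test key or data.

```r
# Hypothetical answer key: the correct choice for each item (illustration only).
key <- c(Item1 = "c", Item2 = "d")

# Toy responses, including missing values.
dat <- data.frame(
  Item1 = c("c", "a", NA, "b"),
  Item2 = c("d", "d", "a", NA),
  stringsAsFactors = FALSE
)

# 1 if the response matches the key, 0 otherwise. NA stays NA because
# comparing NA with == returns NA.
for (item in names(key)) {
  dat[[paste0("new_", item)]] <- as.integer(dat[[item]] == key[[item]])
}
dat
```

Comparing with == rather than using %in% is what keeps the missing responses as NA instead of turning them into 0.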
Here's my question: is there a way to make two-way tables for each pair of original and new variables? I tried to do this with a for loop and looked for tips online, but couldn't find a solution so far.
I extracted part of the dataframe below.
g3 <- structure(list(ID = 1:20, gender = c("Male", "Male", "Male",
"Male", "Male", "Male", "Male", "Male", "Male", "Male", "Male",
"Male", "Male", "Male", "Male", "Female", "Female", "Female",
"Female", "Female"), Item1 = c("c", "c", "a", "a", NA, "c", "c",
"b", "b", "b", "c", "c", NA, "c", "a", "d", "c", "c", "c", "c"
), Item2 = c("d", "d", "d", "d", "d", "a", "a", "a", "a", "b",
"b", "c", "c", "c", "c", "d", NA, NA, "d", "d"), Item3 = c("b",
"d", NA, "a", NA, "d", "c", "c", NA, "d", "c", NA, NA, "c", "d",
"c", "d", "d", "d", "d"), new_Item1 = c(1L, 1L, 0L, 0L, NA, 1L,
1L, 0L, 0L, 0L, 1L, 1L, NA, 1L, 0L, 0L, 1L, 1L, 1L, 1L), new_Item2 = c(1L,
1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, NA,
NA, 1L, 1L), new_Item3 = c(0L, 0L, NA, 0L, NA, 0L, 1L, 1L, NA,
0L, 1L, NA, NA, 1L, 0L, 1L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA,
-20L))
Many thanks in advance.
Shun
For a pair, I just type:
library(janitor)
tabyl(g3, Item1, new_Item1)
and I can see that my recoding is correct. But I want to loop the same tabulation over Item1, Item2 and Item3 (and more). So my expected output would be something like this (if I use tabyl):
-------------------
Item1 1 0 NA
a # # #
b # # #
c # # #
d # # #
NA # # #
Item2 1 0 NA
a # # #
b # # #
c # # #
d # # #
.....
----------------------
I hope my explanation is clear.
CodePudding user response:
Is there a reason you don't want to use the base R table function? It looks like you would get what you want from:
table(g3$Item1, g3$new_Item1, useNA="always")
Where g3 is the dataframe you defined above.
If you want to define the pairs a different way for a loop, I suggest something like:
x <- "Item1"
table(g3[[x]], g3[[paste0("new_", x)]], useNA = "always")
Where x is your loop variable. This compares the column named x with the column named "new_" followed by x, so you don't have to pair each column manually in the table call. You just need to loop over a vector of item names for x.
The output is:
         0  1 <NA>
  a      3  0    0
  b      3  0    0
  c      0 11    0
  d      1  0    0
  <NA>   0  0    2
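As a rough sketch of the loop described above, the same call can be applied to every item name with lapply. The small g3 below is a made-up stand-in for the real data, and the pattern assumes the original columns are named Item followed by digits and the recoded ones carry the new_ prefix.

```r
# Made-up stand-in for the data frame in the question.
g3 <- data.frame(
  Item1 = c("c", "a", NA, "b", "c"),
  new_Item1 = c(1L, 0L, NA, 0L, 1L),
  Item2 = c("d", "a", "d", NA, "b"),
  new_Item2 = c(1L, 0L, 1L, NA, 0L)
)

# Names of the original items only (new_Item1 etc. do not match the pattern).
items <- grep("^Item\\d+$", names(g3), value = TRUE)

# One two-way table per item, original responses in rows, recoded values in columns.
tables <- lapply(items, function(x) {
  new_x <- paste0("new_", x)
  table(g3[[x]], g3[[new_x]], useNA = "always", dnn = c(x, new_x))
})
names(tables) <- items
tables
```

The result is a named list, so a single table can be pulled out with, for example, tables$Item2.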
CodePudding user response:
You can store the column names in variables and use Map to loop over each pair and return the comparison table (here df is the data frame, called g3 in the question).
library(janitor)
x <- grep('^Item\\d+$', names(df), value = TRUE)
y <- grep('^new_Item\\d+$', names(df), value = TRUE)
Map(function(p, q) tabyl(df, .data[[p]], .data[[q]]), x, y)
#$Item1
# Item1 0 1 NA_
# a 3 0 0
# b 3 0 0
# c 0 11 0
# d 1 0 0
# <NA> 0 0 2
#$Item2
# Item2 0 1 NA_
# a 4 0 0
# b 2 0 0
# c 4 0 0
# d 0 8 0
# <NA> 0 0 2
#$Item3
# Item3 0 1 NA_
# a 1 0 0
# b 1 0 0
# c 0 5 0
# d 8 0 0
# <NA> 0 0 5
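With many items, reading every table by eye gets tedious, so a rough automated check may help. The sketch below uses base R table and a made-up data frame in which new_Item2 contains a deliberate recoding error. It only tests that each original response maps to a single recoded value, not that the correct choice is the one coded 1.

```r
# Made-up data: the third value of new_Item2 is deliberately wrong.
g3 <- data.frame(
  Item1 = c("c", "a", NA, "b", "c"),
  new_Item1 = c(1L, 0L, NA, 0L, 1L),
  Item2 = c("d", "a", "d", NA, "b"),
  new_Item2 = c(1L, 0L, 0L, NA, 0L)
)

items <- grep("^Item\\d+$", names(g3), value = TRUE)

consistent <- vapply(items, function(x) {
  tab <- table(g3[[x]], g3[[paste0("new_", x)]], useNA = "always")
  # A clean recode puts all counts for a given response in a single column.
  all(rowSums(tab > 0) <= 1)
}, logical(1))
consistent
```

Here Item1 comes back TRUE and Item2 comes back FALSE, because the response d appears with both 0 and 1. Any item flagged FALSE is worth inspecting with its full table.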