Is there a way to make 2-way tables with different pairs of variable in R?

Time:09-16

I am cleaning data from a numeracy test.

Some test items are multiple-choice items, where students choose one of the choices (e.g. a), b), or c)).

In the dataset, I created new variables by recoding the items as binary variables. For example, if the correct answer to Item1 is a), I created new_Item1 by recoding a) as 1 and everything else as 0 (NA is left as it is).

I would like to double-check that the recoding was done correctly by cross-tabulating the original and new variables. Doing this for one pair only (in this case Item1 and new_Item1) is easy, but since I have a lot of these multiple-choice items, it's not efficient to write a script that tabulates each pair one by one.

Here's my question: is there any way to make 2-way tables for each pair of these original and new variables? I tried to do this with a for loop and looked for tips online, but couldn't find a solution so far.

Part of the data frame (g3 in my code) is shown below.

structure(list(ID = 1:20, gender = c("Male", "Male", "Male", 
"Male", "Male", "Male", "Male", "Male", "Male", "Male", "Male", 
"Male", "Male", "Male", "Male", "Female", "Female", "Female", 
"Female", "Female"), Item1 = c("c", "c", "a", "a", NA, "c", "c", 
"b", "b", "b", "c", "c", NA, "c", "a", "d", "c", "c", "c", "c"
), Item2 = c("d", "d", "d", "d", "d", "a", "a", "a", "a", "b", 
"b", "c", "c", "c", "c", "d", NA, NA, "d", "d"), Item3 = c("b", 
"d", NA, "a", NA, "d", "c", "c", NA, "d", "c", NA, NA, "c", "d", 
"c", "d", "d", "d", "d"), new_Item1 = c(1L, 1L, 0L, 0L, NA, 1L, 
1L, 0L, 0L, 0L, 1L, 1L, NA, 1L, 0L, 0L, 1L, 1L, 1L, 1L), new_Item2 = c(1L, 
1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, NA, 
NA, 1L, 1L), new_Item3 = c(0L, 0L, NA, 0L, NA, 0L, 1L, 1L, NA, 
0L, 1L, NA, NA, 1L, 0L, 1L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
-20L))

Many thanks in advance.

Shun

For a pair, I just type:

library(janitor)
tabyl(g3, Item1, new_Item1)

and I can see my recoding is correct. But I want to loop the same tabulation through Item1, 2 and 3 (and more) in this case. So my expected output would be something like this (if I use tabyl):
-------------------
Item1 1 0 NA
a # # #
b # # #
c # # #
d # # #
NA # # #

Item2 1 0 NA
a # # #
b # # #
c # # #
d # # #
.....
----------------------
I hope my explanation is clear.

CodePudding user response:

Is there a reason you don't want to use the base R table function? It looks like you would get what you want from:

table(g3$Item1, g3$new_Item1, useNA="always")

Here g3 is the data frame you defined above.

If you want to define the pairs a different way for a loop, I suggest something like:

x = "Item1"
table(g3[, colnames(g3)==x], g3[, colnames(g3)==paste0("new_",x)], useNA="always")

Here x is your loop variable. This way you can compare "x" against "new_x" without manually pairing each column in the table function. You just need to supply a vector of item names for x to loop over.

The output for Item1 is:

        0  1 <NA>
  a     3  0    0
  b     3  0    0
  c     0 11    0
  d     1  0    0
  <NA>  0  0    2
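
The loop described above can be sketched roughly as follows. This is only an illustration: the small g3 below is a stand-in built from the first few rows of the question's data, the items vector is a hypothetical list of column names, and the sketch assumes every recoded column is named "new_" plus the original name.

```r
# Stand-in for the g3 data frame from the question (first rows only).
g3 <- data.frame(
  Item1 = c("c", "c", "a", "a", NA, "c"),
  Item2 = c("d", "d", "d", "d", "d", "a"),
  new_Item1 = c(1L, 1L, 0L, 0L, NA, 1L),
  new_Item2 = c(1L, 1L, 1L, 1L, 1L, 0L)
)

# Original item columns; each recoded partner is assumed to be "new_" + name.
items <- c("Item1", "Item2")

# Build one two-way table per pair and keep them in a list.
tables <- lapply(items, function(x) {
  new_x <- paste0("new_", x)
  table(g3[[x]], g3[[new_x]], useNA = "always", dnn = c(x, new_x))
})
names(tables) <- items

# Printing the list shows every table, labelled by the original item name.
tables
```

Using g3[[x]] instead of g3[, colnames(g3) == x] selects the same column by name; the dnn argument only adds the variable names to the printed table.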

CodePudding user response:

You can store the column names in variables and use Map to loop over each pair, returning one comparison table per pair (df here is the data frame from the question).

library(janitor)
# Match original and recoded item columns; \\d+ allows multi-digit item numbers
x <- grep('^Item\\d+$', names(df), value = TRUE)
y <- grep('^new_Item\\d+$', names(df), value = TRUE)

Map(function(p, q) tabyl(df, .data[[p]], .data[[q]]), x, y)

#$Item1
# Item1 0  1 NA_
#     a 3  0   0
#     b 3  0   0
#     c 0 11   0
#     d 1  0   0
#  <NA> 0  0   2

#$Item2
# Item2 0 1 NA_
#     a 4 0   0
#     b 2 0   0
#     c 4 0   0
#     d 0 8   0
#  <NA> 0 0   2

#$Item3
# Item3 0 1 NA_
#     a 1 0   0
#     b 1 0   0
#     c 0 5   0
#     d 8 0   0
#  <NA> 0 0   5
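
With many items, reading every table by eye gets tedious, so one possible complement is a programmatic check that each original answer maps to a single recoded value. The sketch below is an assumption-laden illustration, not part of the answer above: the df is stand-in data with a deliberate miscoding in new_Item2, is_consistent is a hypothetical helper, and the recoded columns are assumed to be named "new_" plus the original name. It uses base R only.

```r
# Stand-in data; the last row of new_Item2 is deliberately miscoded.
df <- data.frame(
  Item1 = c("c", "c", "a", "b", NA, "c"),
  Item2 = c("d", "d", "a", "b", NA, "d"),
  new_Item1 = c(1L, 1L, 0L, 0L, NA, 1L),
  new_Item2 = c(1L, 1L, 0L, 0L, NA, 0L)
)

x <- grep("^Item\\d+$", names(df), value = TRUE)
y <- paste0("new_", x)  # assumed naming convention for the recoded columns

# Hypothetical helper: TRUE when every row of the cross-table has at most
# one non-zero cell, i.e. each original answer (including NA) is recoded
# to a single value.
is_consistent <- function(p, q) {
  tab <- table(df[[p]], df[[q]], useNA = "ifany")
  all(rowSums(tab > 0) <= 1)
}

# Named logical vector, one entry per original item.
consistent <- mapply(is_consistent, x, y)
consistent
```

In this stand-in data Item2 is flagged because d is recoded to both 1 and 0. The check only detects inconsistent mappings; it does not confirm that the right answer was the one coded as 1, so the tables are still worth a look for any flagged or doubtful items.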