R chisq.test - automate through a df, each variable with each, df has data, not counts, wide form

Time:10-28

I have a data-frame that looks like this:

    gend     domh   nat le re lf rf ad ab imp
1    f         R    fr  y  y  n  y  y  y   Y
2    f         R    fr  n  n  y  n  n  n   N
3    f         R    fr              y  y   Y
4    f         R    fr  n  n  n  n  n  n   N
5    m         L    fr  y  n  n  y  y  y   Y
6    m         R    fr  y  y  y  y  y  y   Y
7    m         R    fr  y  y  y  y  y  y   Y
8    f         L    fr  y  y  n  n  n  y   N
9    f         R    fr  n  n  n  n  n  y   N
10   m         R    fr  y  y  y  y  y  y   Y
11   f         R    fr  y  y  n  n  y  y   Y
12   m         R   pfr                 n   N
13   f         R   pfr  y  y  n  n  n  y   N
14   m         R   pfr  y  n  n  n  y  y   N
15   f         R   pfr  y  n  y  n  y  y   Y
16   f         R   pfr  y  n  y  y  y  y   Y
17   m         L   pfr  n  n  y  y  y  y   Y
18   m         R   pfr  y  y  y  y  y  y   Y
19   m         R   pfr  y  n  y  y  y  y   Y
20   f         R   pfr  y  y  y  y  y  y   Y
21   f         R   pfr                 n   N
22   f         R   pfr  y  y  y  y  y  y   Y

I would like to run a chi-squared test of each variable against every other one, i.e. gend with domh, nat, le, etc., then domh with each of the others, and so on. I know how to do this by hand:

chisq.test(df$gend, df$domh, simulate.p.value = TRUE, B = 1000000)

but there just has to be a way to automate this. I only need the p-values (and the pair-name) out of the tests. Could someone help, please?

CodePudding user response:

The function colpair_map() from the corrr package seems to provide one way to do this, since you only need the p-value. (Looping over the columns is the obvious alternative, of course.)

# you first need a function that just returns p-value
# to be used in colpair_map. The <htest> object returned
# by chisq.test doesn't work with colpair_map
chisq_pval <- function(...) {
    chisq.test(...)$p.value
}

# the chisq_pval function defined above can now be used directly
# in colpair_map, along with the additional arguments for chisq.test
corrr::colpair_map(df, chisq_pval, simulate.p.value = TRUE, B = 1000000)
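
For the looping alternative mentioned above, here is a minimal base-R sketch that does not need corrr. The helper name pairwise_chisq and the toy data frame are made up for illustration and are not from the original post; the sketch assumes every column is categorical. It returns one row per pair, holding the two column names and the p-value.

```r
# Hypothetical helper (not from the original post): run chisq.test on every
# unordered pair of columns and collect the p-values in a long data frame.
pairwise_chisq <- function(data, ...) {
  # all unordered pairs of column names, as a list of length-2 vectors
  pairs <- combn(names(data), 2, simplify = FALSE)
  pvals <- vapply(pairs, function(p) {
    # return NA instead of stopping if a test cannot be computed for a pair
    # (e.g. a column with a single level); warnings are silenced here
    tryCatch(
      suppressWarnings(chisq.test(data[[p[1]]], data[[p[2]]], ...)$p.value),
      error = function(e) NA_real_
    )
  }, numeric(1))
  data.frame(
    var1 = vapply(pairs, `[`, character(1), 1),
    var2 = vapply(pairs, `[`, character(1), 2),
    p.value = pvals,
    stringsAsFactors = FALSE
  )
}

# Small made-up data frame, only to show the call
toy <- data.frame(
  a = c("x", "y", "x", "y", "x", "y", "x", "y"),
  b = c("u", "u", "v", "v", "u", "v", "u", "v"),
  c = c("p", "q", "q", "p", "p", "q", "p", "q")
)
set.seed(1)  # simulated p-values depend on the random seed
res <- pairwise_chisq(toy, simulate.p.value = TRUE, B = 2000)
print(res)
```

On the real data the call would be pairwise_chisq(df, simulate.p.value = TRUE, B = 1000000); with ten columns that is 45 tests, so a large B may take a while. Note that chisq.test drops rows where either variable is NA, but if the blank cells were read in as empty strings they will be treated as a category of their own.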