Multiple wilcox.tests across columns using variables in first column (R)

Time:09-29

I have this data.frame

 df <- data.frame(
      variable=c(2.4860651, -0.68863024, 2.63530974, -2.95754943, 1.67945091, 2.63530974,
           4.79002539, 2.32575938, 3.57236441, -0.364825998, -2.00646016, -3.12380516, 
           0.69307013, -5.65846824, 0.45632519, 2.08978142),
      A=c(0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0),
      B=c(1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0),
      C=c(0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1),
      D=c(1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0),
      E=c(0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0),
      F=c(0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1))

I would like to run wilcox.test once for each of the columns A to F, with the two groups defined by the 0 and 1 values in that column and the observations taken from df$variable. I would then like to append the p-values as a new row and the adjusted p-values as another row.

I have tried this:

 library(dplyr)
 result <- df %>% summarise(across(!variable, ~wilcox.test(.x ~ variable)$p.value), exact=NULL) %>%
         bind_rows(., p.adjust(., method = 'BH')) %>%
         bind_rows(df, .) %>%
         mutate(variable=replace(variable, is.na(variable), c('p.values', 'p.adjust')))

But this causes errors.

This is the result I would like to get:

 result <- data.frame(
      variable=c(2.4860651, -0.68863024, 2.63530974, -2.95754943, 1.67945091, 2.63530974,
           4.79002539, 2.32575938, 3.57236441, -0.364825998, -2.00646016, -3.12380516, 
           0.69307013, -5.65846824, 0.45632519, 2.08978142, 'p.value', 'p.adjust'),
      A=c(0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1),
      B=c(1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0.560444274, 1),
      C=c(0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0.143117298, 0.764253489),
      D=c(1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0.820753088, 1),
      E=c(0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0.95482869, 1),
      F=c(0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0.254751163, 0.764253489))

Can anyone help?

CodePudding user response:

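In your attempt, the formula `.x ~ variable` treats `variable` as the grouping factor, and it has far more than two levels, so wilcox.test fails. Instead, split `variable` by the 0/1 values of each column. You may try something along the lines of the following: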

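library(dplyr)

# One p-value per indicator column: compare `variable` between the 0 and 1 groups
p_vals <- df %>%
  summarise(across(!variable,
      ~wilcox.test(variable[.x == 0], variable[.x == 1])$p.value))

# Adjust across all columns at once; calling p.adjust() on each column
# separately would see a single p-value and return it unchanged
p_adj <- as.list(p.adjust(unlist(p_vals), method = "BH"))

df %>% mutate(variable = as.character(variable)) %>%
  bind_rows(
    bind_rows(p_vals, p_adj) %>%
      mutate(variable = c('p.values', 'p.adjust'))
  )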
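
If the p-values do not have to live in the same data frame as the raw data, a base R version may be easier to read. The following is only a sketch under a few assumptions: the names `group_cols` and `summary_tbl` are made up for illustration, and `exact = FALSE` is passed explicitly because `variable` contains a tied value, in which case wilcox.test falls back to the normal approximation anyway (with a warning).

```r
# Data copied from the question
df <- data.frame(
  variable = c(2.4860651, -0.68863024, 2.63530974, -2.95754943, 1.67945091, 2.63530974,
               4.79002539, 2.32575938, 3.57236441, -0.364825998, -2.00646016, -3.12380516,
               0.69307013, -5.65846824, 0.45632519, 2.08978142),
  A = c(0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0),
  B = c(1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0),
  C = c(0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1),
  D = c(1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0),
  E = c(0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0),
  F = c(0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1))

# Every column except `variable` is a 0/1 group indicator (hypothetical helper name)
group_cols <- setdiff(names(df), "variable")

# One Wilcoxon rank-sum test per indicator column
p_vals <- sapply(df[group_cols], function(g) {
  wilcox.test(df$variable[g == 0], df$variable[g == 1], exact = FALSE)$p.value
})

# Benjamini-Hochberg adjustment across all six p-values together
p_adj <- p.adjust(p_vals, method = "BH")

# One row per tested column instead of extra rows appended to df
summary_tbl <- data.frame(column = group_cols,
                          p.value = p_vals,
                          p.adjust = p_adj,
                          row.names = NULL)
print(summary_tbl)
```

Keeping the results in a separate table means `variable` stays numeric, whereas appending 'p.values' and 'p.adjust' rows forces it to character.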