Mapping a Wilcoxon rank-sum test over groups

I am trying to perform a Wilcoxon rank-sum test on a grouped data frame. The variables "Feuchte" (numeric) and "Transtyp" (factor) should be tested within each group (Soll), and I would like the output to be a data frame containing the p-value for each group. My df looks like this:

BF_all_soll <- structure(list(Datum = structure(c(18758, 18758, 18758, 18758, 
18758, 18758, 18758, 18758, 18758, 18758, 18758, 18758, 18758, 
18758, 18758, 18758), class = "Date"), Soll = c("1189", "1189", 
"119", "119", "1192", "1192", "1202", "1202", "149", "149", "172", 
"172", "2484", "2484", "552", "552"), Transtyp = structure(c(1L, 
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("2", 
"5"), class = "factor"), Feuchte = c(11.9171875, 14.078125, 10.7153846153846, 
10.6387096774194, 13.675, 13.7896551724138, 18.5, 17.071875, 
12.390625, 9.690625, 12.3935483870968, 11.6, 10.578125, 10.21875, 
13.021875, 13.225), kumsum = c(25.04, 25.04, 20.77, 20.77, 25.04, 
25.04, 25.04, 25.04, 20.77, 20.77, 20.77, 20.77, 25.04, 25.04, 
25.04, 25.04)), row.names = c(NA, -16L), groups = structure(list(
    Soll = c("1189", "1189", "119", "119", "1192", "1192", "1202", 
    "1202", "149", "149", "172", "172", "2484", "2484", "552", 
    "552"), Transtyp = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 
    2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("2", "5"), class = "factor"), 
    .rows = structure(list(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 
        10L, 11L, 12L, 13L, 14L, 15L, 16L), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), row.names = c(NA, -16L), class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"))

The code I have written so far is this:

BF_all_soll %>% split( BF_all_soll$Soll) %>%
                map( ~wilcox.test(Feuchte ~ Transtyp, data = BF_all_soll))%>%
                map_dfr(~ broom::tidy(.)) ->bla

However, the output can't be right: the p-values are all the same. What am I missing? Any help is much appreciated!

Cheers

CodePudding user response:

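That is because you are passing the full data frame (BF_all_soll) to wilcox.test in every iteration, so each call tests exactly the same data. To use the data specific to each group, pass .x (the current element of the split list) as the data argument inside map.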

library(dplyr)
library(purrr)

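BF_all_soll %>% 
  ungroup() %>%                  # drop the existing grouping before splitting
  split(.$Soll) %>%              # one data frame per Soll
  map_df(~ broom::tidy(wilcox.test(Feuchte ~ Transtyp, data = .x)),
         .id = "Soll") -> bla    # .id keeps the group name as a column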

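On the sample data you shared this still returns the same p-value for every group, because each Soll has only one observation per Transtyp level, and the exact test on one value versus one value always gives p = 1. With more observations per group, as in your full data, the p-values are computed from each group's own values and will generally differ.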
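
To see the per-group behaviour with more than one observation per level, here is a minimal sketch on simulated data. The data frame `sim`, its group labels "A", "B", "C", and the shift added to one group are all made up for illustration; only the column names are taken from the question. It uses dplyr's group_by/summarise as an alternative to split/map and passes the two samples to wilcox.test as vectors.

```r
library(dplyr)

# Simulated data (hypothetical, not the data from the question):
# 3 groups x 2 transect types x 6 observations each.
set.seed(1)
sim <- expand.grid(rep = 1:6, Transtyp = c("2", "5"), Soll = c("A", "B", "C"),
                   stringsAsFactors = FALSE) %>%
  mutate(Transtyp = factor(Transtyp),
         # shift Transtyp "5" upwards in group "A" only
         Feuchte = rnorm(n(), mean = 12, sd = 2) +
           ifelse(Soll == "A" & Transtyp == "5", 4, 0))

# One Wilcoxon rank-sum test per Soll, each using only that group's rows.
p_by_group <- sim %>%
  group_by(Soll) %>%
  summarise(
    n = n(),
    p_value = wilcox.test(Feuchte[Transtyp == "2"],
                          Feuchte[Transtyp == "5"])$p.value,
    .groups = "drop"
  )

p_by_group
```

The result has one row per Soll with its own p-value. The same pattern should carry over to the real data once every group has several observations per Transtyp level.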