I am trying to run a Wilcoxon rank-sum test on a grouped data frame. The variables "Feuchte" (numeric) and "Transtyp" (factor) should be tested within each group ("Soll"). As output, I would like a data frame that includes the p-value for each group. My df looks like this:
BF_all_soll <- structure(list(Datum = structure(c(18758, 18758, 18758, 18758,
18758, 18758, 18758, 18758, 18758, 18758, 18758, 18758, 18758,
18758, 18758, 18758), class = "Date"), Soll = c("1189", "1189",
"119", "119", "1192", "1192", "1202", "1202", "149", "149", "172",
"172", "2484", "2484", "552", "552"), Transtyp = structure(c(1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("2",
"5"), class = "factor"), Feuchte = c(11.9171875, 14.078125, 10.7153846153846,
10.6387096774194, 13.675, 13.7896551724138, 18.5, 17.071875,
12.390625, 9.690625, 12.3935483870968, 11.6, 10.578125, 10.21875,
13.021875, 13.225), kumsum = c(25.04, 25.04, 20.77, 20.77, 25.04,
25.04, 25.04, 25.04, 20.77, 20.77, 20.77, 20.77, 25.04, 25.04,
25.04, 25.04)), row.names = c(NA, -16L), groups = structure(list(
Soll = c("1189", "1189", "119", "119", "1192", "1192", "1202",
"1202", "149", "149", "172", "172", "2484", "2484", "552",
"552"), Transtyp = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("2", "5"), class = "factor"),
.rows = structure(list(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L,
10L, 11L, 12L, 13L, 14L, 15L, 16L), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, -16L), class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
The code I have written so far is this:
BF_all_soll %>% split( BF_all_soll$Soll) %>%
map( ~wilcox.test(Feuchte ~ Transtyp, data = BF_all_soll))%>%
map_dfr(~ broom::tidy(.)) ->bla
However, the output can't be right: the p-values are all the same. What am I missing? Any help is really appreciated!
Cheers
CodePudding user response:
That is because you are passing the same data (BF_all_soll) to wilcox.test in every iteration. To use the data specific to each group, use .x in map.
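library(dplyr)
library(purrr)
BF_all_soll %>%
  # drop the existing grouping so split() works on a plain tibble
  ungroup() %>%
  # one data frame per Soll
  split(.$Soll) %>%
  # .x is the data frame of the current group
  map_df( ~broom::tidy(wilcox.test(Feuchte ~ Transtyp, data = .x))) -> bla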
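On the data you shared, this still returns the same p-value for every group, because each Soll has only one observation per Transtyp level; an exact two-sided test of one value against one value always gives p = 1. With more observations per group, the p-values will reflect each group's own data.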
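To see the per-group behaviour with more than one observation per level, here is a minimal sketch on made-up data. The data frame demo, its group labels "A", "B", "C", and the simulated shifts are hypothetical and only for illustration; the .id argument of map_dfr is used so that each row of the result keeps its Soll label.

```r
library(dplyr)
library(purrr)

# Hypothetical data: 3 groups, 5 observations per Transtyp level in each group
set.seed(1)
demo <- tibble(
  Soll     = rep(c("A", "B", "C"), each = 10),
  Transtyp = factor(rep(rep(c("2", "5"), each = 5), times = 3)),
  # simulated values; the added shifts differ by group and Transtyp level
  Feuchte  = rnorm(30, mean = 12, sd = 2) +
    rep(c(0, 0, 0, 3, 0, -1.5), each = 5)
)

result <- demo %>%
  split(.$Soll) %>%
  # .x is the subset for one Soll; .id stores the list name in a "Soll" column
  map_dfr(~ broom::tidy(wilcox.test(Feuchte ~ Transtyp, data = .x)),
          .id = "Soll")

result
```

The result has one row per Soll with the columns returned by broom::tidy (statistic, p.value, method, alternative), so the p-values can be matched back to their groups.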