How to unnest a dataframe resulting from a test in R

Time:01-04

I want to unnest the res2 data frame, but I get the following error: `x[[1]]` must be a vector, not a <htest> object. What should I do?

library(tidyverse)

df <- tibble(id = c(1,2,3,4,5,6,7,8),
             test = c(2,1,2,1,2,1,2,1),
             difference = c(1,3,4,7,3,9,3,7))

shap <- function(t) {
  df %>% filter(test == t)
  return(shapiro.test(df$difference))
}

test <- 1:2

res <- map(test, function(x) shap(x))
res2 <- enframe(res)
res3 <- unnest(res2, cols = c(value))
#> Error in `list_sizes()`:
#> ! `x[[1]]` must be a vector, not a <htest> object.

#> Backtrace:
#>     ▆
#>  1. ├─tidyr::unnest(res2, cols = c(value))
#>  2. ├─tidyr:::unnest.data.frame(res2, cols = c(value))
#>  3. │ └─tidyr::unchop(data, any_of(cols), keep_empty = keep_empty, ptype = ptype)
#>  4. │   └─tidyr:::df_unchop(cols, ptype = ptype, keep_empty = keep_empty)
#>  5. │     └─tidyr:::list_init_empty(x = col, null = TRUE, typed = keep_empty)
#>  6. │       └─vctrs::list_sizes(x)
#>  7. └─vctrs:::stop_scalar_type(`<fn>`(`<htest>`), "x[[1]]", `<env>`)
#>  8.   └─vctrs:::stop_vctrs(...)
#>  9.     └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = vctrs_error_call(call))

CodePudding user response:

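shapiro.test() returns an htest object, which is a list rather than a vector, so unnest() cannot expand it. broom::tidy() converts each result into a one-row tibble, which can be unnested: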

library(broom)
library(dplyr)
library(tidyr)
df %>% 
  group_by(test) %>%
  summarise(out = tidy(shapiro.test(difference))) %>% 
  unnest(out)

Output:

# A tibble: 2 × 4
   test statistic p.value method                     
  <dbl>     <dbl>   <dbl> <chr>                      
1     1     0.895   0.406 Shapiro-Wilk normality test
2     2     0.895   0.406 Shapiro-Wilk normality test
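
If you would rather keep the map()-style workflow from the question, the same idea can be sketched as follows. This is only a sketch under the assumption that one test per level of test is wanted; it splits the difference column by group, tidies each htest result, and stacks the rows, with the group label taken from the list names.

```r
library(tidyverse)
library(broom)

df <- tibble(id = c(1, 2, 3, 4, 5, 6, 7, 8),
             test = c(2, 1, 2, 1, 2, 1, 2, 1),
             difference = c(1, 3, 4, 7, 3, 9, 3, 7))

res <- split(df$difference, df$test) %>%   # named list with elements "1" and "2"
  map(~ tidy(shapiro.test(.x))) %>%        # each htest becomes a one-row tibble
  bind_rows(.id = "test")                  # list names become the test column

res
```

Because every list element is already a tibble, no unnest() call is needed, and the htest error from the question does not arise. Note that the test column is character here, since it comes from the list names.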

CodePudding user response:

What do you need to extract from each test result? If you look at res2$value[[1]], you'll see that there are a few statistics you can use:

dput(res2$value[[1]])
# structure(list(statistic = c(W = 0.907866418594146), p.value = 0.339283842099497, 
#     method = "Shapiro-Wilk normality test", data.name = "df$difference"), class = "htest")

Knowing this, you can add the $statistic and $p.value elements as columns:

res2 %>%
  bind_cols(map_dfr(res2$value, ~ .[c("statistic", "p.value")]))
# # A tibble: 2 × 4
#    name value   statistic p.value
#   <int> <list>      <dbl>   <dbl>
# 1     1 <htest>     0.908   0.339
# 2     2 <htest>     0.908   0.339
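
Note that both rows above come from the whole difference column (data.name is "df$difference"): shap() in the question filters df but discards the result, so every call tests the same data. Below is a minimal sketch of a corrected helper, assuming a per-group test was intended. The names shap_by_group and res4 are hypothetical and not part of the original code.

```r
library(tidyverse)

df <- tibble(id = c(1, 2, 3, 4, 5, 6, 7, 8),
             test = c(2, 1, 2, 1, 2, 1, 2, 1),
             difference = c(1, 3, 4, 7, 3, 9, 3, 7))

# Hypothetical helper: keep the filtered rows and test only those
shap_by_group <- function(t) {
  sub <- df %>% filter(test == t)
  shapiro.test(sub$difference)
}

res2 <- enframe(map(1:2, shap_by_group))   # name = group index, value = htest

# Pull the scalar pieces out of each htest into ordinary columns
res4 <- res2 %>%
  mutate(statistic = map_dbl(value, ~ .x$statistic),
         p.value   = map_dbl(value, ~ .x$p.value))

res4
```

With this example data the two rows should still look alike, matching the per-group values in the first response. That appears to be a coincidence of the data: the values in one group are a linear rescaling of the values in the other, and the Shapiro-Wilk statistic is unaffected by shifting or scaling.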