How to edit data name in output when using lapply for Fisher's exact test in R?

I'm trying to run a series of Fisher's Exact Tests on my data using lapply, and want to know if there's a way to specify the data name in the output.

This is my current code:

vars <- c('xvar1', 'xvar2', 'xvar3')
lapply(vars, function(i) {fisher.test(dataframe[[i]], dataframe$yvar)})

This produces the output I'm looking for, but the data name in the output is "dataframe[[i]] and dataframe$yvar". Is there a way to make that say "dataframe$xvar1 and dataframe$yvar" instead?

I know that when running chi-square tests with ctable you can do it using the dnn argument (sample below), but I haven't found an equivalent argument for Fisher's test.

ctables <- lapply(vars, function(i) {ctable(dataframe[[i]], dataframe$yvar, chisq=T, useNA='no', dnn=c(i,'yvar'))})

Thanks!

User response:

I'm not sure exactly what your goal is here. Most likely, you want to extract the results (i.e., the p-values) from the Fisher test result object and work with them as a list or data frame.

To do this, take a look at the help page: ?fisher.test. If you scroll down to the Value section, you'll see that it returns a list. It has a special class ("htest") that makes it print in a special way, but it's still a list, so we can access its elements by name like in any other list.

x <- fisher.test(iris$Petal.Length, iris$Sepal.Width, simulate.p.value = TRUE)
x$p.value
[1] 0.04797601
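
If you want to see what else is stored in the result, you can inspect it like any other list. This is just a rough sketch; it assumes the built-in mtcars data as a stand-in (the columns am and vs are not from your data) so that it runs as-is:

```r
# Assumption: mtcars stands in for the asker's data; am and vs are both binary.
res <- fisher.test(mtcars$am, mtcars$vs)

class(res)      # "htest", the class responsible for the special printing
is.list(res)    # TRUE: underneath, it is an ordinary list
names(res)      # element names, including p.value and data.name
res$data.name   # the label printed after "data:" in the default output
```

The data.name element is just a character string built from the expressions passed to the function, which is why it can be overwritten later.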

In your case, if you just want a list of the p-values, extract that element from the result inside the function you pass to lapply:

# To get a list of p.values
lapply(vars, function(i) {
    res <- fisher.test(dataframe[[i]], dataframe$yvar)
    return(res$p.value)
})

# To get a full data frame
tbl <- lapply(vars, function(i) {
    res <- fisher.test(dataframe[[i]], dataframe$yvar)
    df <- data.frame(var = i,
                     p = res$p.value)
    return(df)
})

# Convert to data.frame in base R
do.call(rbind, tbl)

# Convert to data.frame in tidyverse
dplyr::bind_rows(tbl)
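
Since I don't have your data, here is a self-contained version of the data frame approach that should run as written. It is only a sketch: mtcars, the am column as yvar, and the three x variables are all stand-ins I picked, not anything from your question.

```r
# Assumption: mtcars stands in for `dataframe`, with am playing the role of yvar.
dataframe <- mtcars
dataframe$yvar <- dataframe$am
vars <- c("vs", "gear", "cyl")

# One single-row data frame per variable: its name and the test's p-value
tbl <- lapply(vars, function(i) {
    res <- fisher.test(dataframe[[i]], dataframe$yvar)
    data.frame(var = i, p = res$p.value)
})

# Stack the single-row data frames into one table
pvals <- do.call(rbind, tbl)
pvals
```

Keeping the variable name in its own column means each p-value stays labelled even after the rows are sorted or filtered.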

If you really want to keep the default output, just with a corrected data name shown, you can manually edit the result object:

x <- fisher.test(iris$Petal.Length, iris$Sepal.Width, simulate.p.value = TRUE)
x

    Fisher's Exact Test for Count Data with simulated p-value
    (based on 2000 replicates)

data:  iris$Petal.Length and iris$Sepal.Width
p-value = 0.04798
alternative hypothesis: two.sided

As you can see, the data name is based on the expressions we passed to the test. We can change it to whatever we want, though, and the default printing will show the new value:

x$data.name <- 'cat'
x

    Fisher's Exact Test for Count Data with simulated p-value
    (based on 2000 replicates)

data:  cat
p-value = 0.04798
alternative hypothesis: two.sided

In your specific example, your code would look something like this:

lapply(vars, function(i) {
    res <- fisher.test(dataframe[[i]], dataframe$yvar)
    res$data.name <- paste0("dataframe$", i, " and dataframe$yvar")
    return(res)
})
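
As a runnable sketch of the same idea, the version below again uses mtcars as a stand-in (my assumption, not your data) and also names the list elements, which is an extra step I'm suggesting so that each result can be pulled out by variable name:

```r
# Assumption: mtcars stands in for `dataframe`, with am playing the role of yvar.
dataframe <- mtcars
dataframe$yvar <- dataframe$am
vars <- c("vs", "gear", "cyl")

results <- lapply(vars, function(i) {
    res <- fisher.test(dataframe[[i]], dataframe$yvar)
    # Overwrite the label that the default print method shows after "data:"
    res$data.name <- paste0("dataframe$", i, " and dataframe$yvar")
    res
})
names(results) <- vars   # label each list element by its variable

results$gear             # the "data:" line now names dataframe$gear
```

Only the printed label changes here; the p-value and the other elements of each result are untouched.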