I'm trying to run a 12 linear regression and want to correct for multiple testing problems.
The significance level in my field is usually p = 0.05 With the Bonferroni correction it would be p = 0.05 / 12 = 0.0041
If I run the regression as
fit <- lm(y ~ x1 x2, data = mydata)
I get the significance levels 0.0001
, 0.001
, 0.05
, and 0.1
and the related signifcance codes ***
, **
, *
, and .
.
Is there a way to define a new significance level (p_sig = 0.05/12
) and then assign a new code (e.g. *.*
) to that? So that when I re-run the regression this will be used as a new significance level?
Thanks in advance for any help on this!
I know how to do this after running the regression.
fit <- lm(y ~ x1 x2, data = mydata)
results <- summary(fit)
significant <- results$p.value < 0.004
But I'd want to have the "nice" table output.
CodePudding user response:
I'd suggest to run individual models, then apply Bonferroni correction and create the table:
set.seed(123)
pvalues <- vector()
for(i in 1:12){
x <- 1:20
y <- c(1:20) rnorm(20,sd = 10)
fit <- lm(y ~x)
pvalues[i] <- anova(fit)$'Pr(>F)'[1]
# If you wanna check regressions:
# plot(x,y)
# abline(fit)
}
# without Bonferroni correction
pvalues
# [1] 4.304367e-02 2.657028e-05 5.380535e-03 2.170672e-02 3.724384e-03 2.027253e-03
# [7] 3.031334e-02 1.524716e-03 7.847783e-02 4.582751e-02 5.615904e-01 4.654844e-02
pvalues < 0.05
# [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE FALSE TRUE FALSE TRUE
# with Bonferroni correction
p.adjust(pvalues, method = 'bonferroni', n = 12)
# [1] 0.5165240276 0.0003188434 0.0645664243 0.2604806671 0.0446926024 0.0243270335
# [7] 0.3637601361 0.0182965929 0.9417339387 0.5499300671 1.0000000000 0.5585812930
p.adjust(pvalues, method = 'bonferroni', n = 12) < 0.05
# [1] FALSE TRUE FALSE FALSE TRUE TRUE FALSE TRUE FALSE FALSE FALSE FALSE
bon = p.adjust(pvalues, method = 'bonferroni', n = 12)
# Add significance levels
levelvalues <- rep('', length = 12)
levelvalues[bon < 0.1] <- "."
levelvalues[bon < 0.05] <- "*"
levelvalues[bon < 0.01] <- "**"
levelvalues[bon < 0.001] <- "***"
levelvalues[is.na(bon)] <- "n.s."
# create table
table <- data.frame(Model = 1:12, `p value` = paste0(scales::scientific(bon, digits = 3),levelvalues))
print(table, row.names = F)
# Model p.value
# 1 5.17e-01
# 2 3.19e-04***
# 3 6.46e-02.
# 4 2.60e-01
# 5 4.47e-02*
# 6 2.43e-02*
# 7 3.64e-01
# 8 1.83e-02*
# 9 9.42e-01
# 10 5.50e-01
# 11 1.00e 00
# 12 5.59e-01
You can also apply other kinds of corrections such as Holm and Benjamini-Hochberg
CodePudding user response:
This is produced by symnum
so modify that using trace
. Change the trace
line as needed.
set.seed(123)
x <- 1:10
y <- 1 2*x rnorm(10)
trace(symnum, quote({ cutpoints=c(0, 0.05/12, 0.1, 1); symbols=c("*", ".", "") }),
print = FALSE)
summary(lm(y ~ x))
## ...
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.5255 0.6673 2.286 0.0516 .
## x 1.9180 0.1075 17.835 1e-07 *
## ---
## Signif. codes: 0 ‘*’ 0.004166667 ‘.’ 0.1 ‘’ 1
## ...
untrace(symnum)