I have calculated the ANOVA F-test p-value for differences in means for several variables. Now I would like to add "stars" that indicate the significance level of each p-value: * for significance at the 10% level, ** at the 5% level, and *** at the 1% level.
My data looks like this:
structure(list(Variables = c("A", "B", "C", "D", "E"),
`Anova F-Test p-Value` = c(0.05, 5e-04, 0.5, 0.05, 0.01)),
class = "data.frame", row.names = c(NA, -5L))
Could someone help me with the code here?
CodePudding user response:
You can build your own function. Note, however, that this is not the conventional star scale (which is perfectly fine as long as you state the scale somewhere).
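# Map p-values to stars; intervals are [0, 0.01], (0.01, 0.05], (0.05, 0.10], (0.10, 1]
stars.pval <- function(x) {
  stars <- c("***", "**", "*", "n.s.")
  breaks <- c(0, 0.01, 0.05, 0.10, 1)
  # left.open = TRUE closes each interval on the right, so p = 0.05 gets "**"
  i <- findInterval(x, breaks, left.open = TRUE, rightmost.closed = TRUE)
  stars[i]
}
# dat is the data frame from the question
transform(dat, stars = stars.pval(dat$`Anova F-Test p-Value`))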
  Variables Anova.F.Test.p.Value stars
1         A                5e-02    **
2         B                5e-04   ***
3         C                5e-01  n.s.
4         D                5e-02    **
5         E                1e-02   ***
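For comparison, here is a minimal sketch of the more conventional scale, using the cut-offs R prints in its own "Signif. codes" legend (0.001, 0.01, 0.05, 0.1). The function name stars_conventional is hypothetical, and treating each interval as closed on the right is an assumption carried over from the function above.

```r
# Conventional cut-offs, as in R's legend:
# 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# stars_conventional is a hypothetical name used only for this sketch.
stars_conventional <- function(p) {
  symbols <- c("***", "**", "*", ".", "")
  cutoffs <- c(0, 0.001, 0.01, 0.05, 0.10, 1)
  # Intervals are closed on the right, so p = 0.05 still earns "*"
  symbols[findInterval(p, cutoffs, left.open = TRUE, rightmost.closed = TRUE)]
}

# The p-values from the question
p <- c(0.05, 5e-04, 0.5, 0.05, 0.01)
stars_conventional(p)
# "*"   "***" ""    "*"   "**"
```

On this scale a p-value of 0.05 gets one star rather than two, which is why the scale in use should always be stated next to the table.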
CodePudding user response:
I would suggest using cut for this.
Edit: notes. Use right = TRUE to treat p <= alpha as significant (intervals closed on the right), and right = FALSE to require p < alpha (intervals closed on the left), as in the code below. I also replaced 0 and 1 with -Inf and Inf, which often handles the boundaries better in cut.
dt$stars <- cut(dt[[2]], breaks = c(-Inf, 0.01, 0.05, 0.10, Inf),
labels = c("***", "**", "*", "n.s."), right = FALSE)
dt
#   Variables Anova F-Test p-Value stars
# 1         A               0.0500     *
# 2         B               0.0005   ***
# 3         C               0.5000  n.s.
# 4         D               0.0500     *
# 5         E               0.0100    **
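The effect of the right argument only shows up for p-values that sit exactly on a break. A small sketch comparing both settings on boundary values (the vector p and the names brks, labs, closed_right, and closed_left are made up for illustration):

```r
# Illustrative p-values, including ones exactly on the breaks
p    <- c(0.0005, 0.01, 0.05, 0.10, 0.5)
brks <- c(-Inf, 0.01, 0.05, 0.10, Inf)
labs <- c("***", "**", "*", "n.s.")

# right = TRUE: intervals (a, b], so p <= alpha counts as significant
closed_right <- as.character(cut(p, breaks = brks, labels = labs, right = TRUE))
# right = FALSE: intervals [a, b), so only p < alpha counts as significant
closed_left  <- as.character(cut(p, breaks = brks, labels = labs, right = FALSE))

data.frame(p, closed_right, closed_left)
# closed_right: "***" "***" "**" "*"    "n.s."
# closed_left:  "***" "**"  "*"  "n.s." "n.s."
```

Values strictly inside an interval (0.0005 and 0.5 here) get the same label either way, so the choice only matters if the results contain p-values equal to a cut-off.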