This is probably quite a basic question, but I can't figure out how to do it in a few elegant steps. I have this dataset:
df <- data.frame(A=c(1,2,2,3,4,5,1,1,2,3),
B=c(4,4,2,3,4,2,1,5,2,2),
C=c(3,3,3,3,4,2,5,1,2,3),
D=c(1,2,5,5,5,4,5,5,2,3),
E=c(1,4,2,3,4,2,5,1,2,3),
dummy1=c("yes","yes","no","no","no","no","yes","no","yes","yes"),
dummy2=c("high","low","low","low","high","high","high","low","low","high"))
df1 <- data.frame(lapply(df, factor))
I would like to make a grouped bar plot in ggplot that only considers the value "1" (basically plotting the frequency of the 1s) by dummy. The x-axis (the groups) should show the columns (A, B, C, D, E), the y-axis the percentage of "1" in the respective column, and the bar colors should reflect the dummy considered.
This is basically what I want:
I know how to do this by creating a separate data frame for each dummy level and then plotting them, but I'm sure there's a better and more efficient way to do this.
Thanks in advance for any suggestion!
Answer:
Something like the following could work:
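library(data.table)
setDT(df1)
# Share of all rows in which each column equals 1, per dummy1 level.
# Use mean(x == 1) instead to get the share within each dummy1 group.
graph_data <- df1[, lapply(.SD, function(x) sum(x == 1) / nrow(df1)),
  by = "dummy1",
  .SDcols = c("A","B","C","D","E")]
# Reshape to long format: one row per (dummy1, column) pair
graph_data <- melt(graph_data, id.vars = "dummy1")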
library(ggplot2)
ggplot() +
geom_col(data = graph_data,
mapping = aes(x = variable, y = value, fill = dummy1),
position = "dodge")
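If the bars should cover the levels of both dummies at once (yes/no and high/low), one possible variation is to melt twice: first the value columns, then the dummy columns. This is only a rough sketch under a few assumptions: it works on the original numeric df rather than the factor version, it computes the share of 1s within each dummy level (not out of all rows), and the names column, score, dummy, level, share and shares are made up for illustration.

```r
library(data.table)
library(ggplot2)

# Data from the question (numeric columns kept as numbers)
df <- data.frame(A = c(1,2,2,3,4,5,1,1,2,3),
                 B = c(4,4,2,3,4,2,1,5,2,2),
                 C = c(3,3,3,3,4,2,5,1,2,3),
                 D = c(1,2,5,5,5,4,5,5,2,3),
                 E = c(1,4,2,3,4,2,5,1,2,3),
                 dummy1 = c("yes","yes","no","no","no","no","yes","no","yes","yes"),
                 dummy2 = c("high","low","low","low","high","high","high","low","low","high"),
                 stringsAsFactors = FALSE)
dt <- as.data.table(df)

# First melt: one row per (original row, value column)
long <- melt(dt, id.vars = c("dummy1", "dummy2"),
             measure.vars = c("A", "B", "C", "D", "E"),
             variable.name = "column", value.name = "score")

# Second melt: one row per (original row, value column, dummy)
long <- melt(long, id.vars = c("column", "score"),
             measure.vars = c("dummy1", "dummy2"),
             variable.name = "dummy", value.name = "level")

# Share of 1s within each dummy level, for each value column
shares <- long[, .(share = mean(score == 1)), by = .(column, level)]

# Grouped bars: columns on the x-axis, one bar per dummy level
p <- ggplot(shares, aes(x = column, y = share, fill = level)) +
  geom_col(position = "dodge") +
  scale_y_continuous(labels = scales::percent)
print(p)
```

Because the denominator here is the number of rows in each dummy level, the heights differ from the sum(x == 1) / nrow(df1) version above; which of the two is wanted depends on how "% of 1" is meant to be read.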