Plot only certain values in entire dataframe

Time:08-18

It's quite a basic question, I think, but I can't figure out how to do it in a few elegant steps. I have this dataset:

df <- data.frame(A=c(1,2,2,3,4,5,1,1,2,3),
                B=c(4,4,2,3,4,2,1,5,2,2),
                C=c(3,3,3,3,4,2,5,1,2,3),
                D=c(1,2,5,5,5,4,5,5,2,3),
                E=c(1,4,2,3,4,2,5,1,2,3),
                dummy1=c("yes","yes","no","no","no","no","yes","no","yes","yes"),
                dummy2=c("high","low","low","low","high","high","high","low","low","high"))

df1 <- data.frame(lapply(df, factor))

I would like to make a grouped bar plot in ggplot that only considers the value "1" (basically plotting the frequency of the 1s) by dummy. The X-axis (the groups) should show the columns (A, B, C, D, E), the Y-axis the % of "1" in the respective column, and the bar colors should reflect the dummy considered.

This is basically what I want:

[Image: example of the desired grouped bar plot]

I know how to do this by creating a separate data frame for each dummy level and then plotting them, but I'm sure there's a better and more efficient way to do this.

Thanks in advance for any suggestion!

CodePudding user response:

Something like the following could work:

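library(data.table)
library(magrittr)  # provides %>%, which data.table does not attach

setDT(df1)

# For each dummy1 level, count the 1s in every column and divide by the total
# number of rows; use mean(x == 1) instead for the share within each level
graph_data <- df1[, lapply(.SD, function(x) sum(x == 1) / nrow(df1)),
    by = "dummy1",
    .SDcols = c("A","B","C","D","E")] %>%
 melt(id.vars = "dummy1")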

library(ggplot2)

ggplot() +
 geom_col(data = graph_data,
          mapping = aes(x = variable, y = value, fill = dummy1),
          position = "dodge")
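
For comparison, here is a minimal sketch of the same idea without data.table, assuming the Y-axis should show the percentage of 1s within each dummy1 level (so the denominator is the group size rather than the total number of rows). That reading of the question is an assumption, and the names value_cols, share, long and pct are only illustrative.

```r
library(ggplot2)

# Data from the question, kept numeric here so no factor conversion is needed
df <- data.frame(A = c(1,2,2,3,4,5,1,1,2,3),
                 B = c(4,4,2,3,4,2,1,5,2,2),
                 C = c(3,3,3,3,4,2,5,1,2,3),
                 D = c(1,2,5,5,5,4,5,5,2,3),
                 E = c(1,4,2,3,4,2,5,1,2,3),
                 dummy1 = c("yes","yes","no","no","no","no","yes","no","yes","yes"),
                 dummy2 = c("high","low","low","low","high","high","high","low","low","high"))

value_cols <- c("A", "B", "C", "D", "E")

# Share of 1s per column within each dummy1 level (assumed denominator: group size)
share <- aggregate(df[value_cols],
                   by = list(dummy1 = df$dummy1),
                   FUN = function(x) mean(x == 1))

# Reshape to long format: one row per (dummy1 level, column) pair
long <- data.frame(
  dummy1   = rep(share$dummy1, times = length(value_cols)),
  variable = rep(value_cols, each = nrow(share)),
  pct      = 100 * unlist(share[value_cols], use.names = FALSE)
)

# Grouped bar plot: columns on the X-axis, % of 1s on the Y-axis, fill by dummy1
p <- ggplot(long, aes(x = variable, y = pct, fill = dummy1)) +
  geom_col(position = "dodge") +
  labs(x = NULL, y = "% of 1s")
print(p)
```

Swapping df$dummy1 for df$dummy2 in the aggregate() call (and renaming the grouping column accordingly) should give the equivalent plot for the second dummy.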
