My data looks like this:
library(tidyverse)
set.seed(123)
dat <- tibble(
mcq_option_1 = sample(c(0,1), replace = TRUE, size = 100),
mcq_option_2 = sample(c(0,1), replace = TRUE, size = 100),
mcq_option_3 = sample(c(0,1), replace = TRUE, size = 100),
mcq_option_4 = sample(c(0,1), replace = TRUE, size = 100),
mcq_option_5 = sample(c(0,1), replace = TRUE, size = 100)
)
Created on 2022-02-21 by the reprex package (v2.0.1)
My goal is to create a bar plot of how many times each MCQ option was selected (the count of 1s in each column). What is the quickest data manipulation technique to achieve this? Should I pivot longer first and then create a new count variable using mutate? Or can I do this directly inside the ggplot call?
Is there a shorter/better solution than this?
new <- dat %>%
  pivot_longer(
    cols = everything(),
    names_to = 'options',
    values_to = 'active'
  )
new %>%
  filter(active == 1) %>%
  group_by(options) %>%
  summarise(n = n())
CodePudding user response:
Not sure if you're looking for better execution speed or simply shorter code. You can at least get the code a bit shorter by first summing across columns, then pivoting.
dat %>%
  summarise(across(everything(), sum)) %>%
  pivot_longer(everything()) %>%
  ggplot(aes(name, value)) +
  geom_col()
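As a quick sanity check, here is a minimal sketch comparing the sum-then-pivot result with the filter-and-count approach from the question. It assumes every column contains only 0s and 1s (so the column sum equals the number of 1s); the names counts_long and counts_sum are made up for illustration.

```r
library(tidyverse)

set.seed(123)
dat <- tibble(
  mcq_option_1 = sample(c(0, 1), replace = TRUE, size = 100),
  mcq_option_2 = sample(c(0, 1), replace = TRUE, size = 100),
  mcq_option_3 = sample(c(0, 1), replace = TRUE, size = 100)
)

# Approach from the question: pivot longer, keep the 1s, count per option
counts_long <- dat %>%
  pivot_longer(everything(), names_to = "name", values_to = "active") %>%
  filter(active == 1) %>%
  count(name, name = "value")

# Approach from this answer: sum each 0/1 column, then pivot longer
counts_sum <- dat %>%
  summarise(across(everything(), sum)) %>%
  pivot_longer(everything())

# Compare the per-option counts from the two approaches
all.equal(counts_long$value, counts_sum$value)
```

One difference to keep in mind: if an option is never selected, the filter-and-count version drops it entirely, while the summed version keeps it with a value of 0.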
CodePudding user response:
You don't need to perform the summarize() after pivoting, because by default geom_col() uses stat_identity() and stacks all the values that share an x position. You don't need to filter either: the 0 values add nothing to the height of the stacked bar, so each bar's height is the number of 1s.
library(tidyverse)
set.seed(123)
dat <- tibble(
mcq_option_1 = sample(c(0,1), replace = TRUE, size = 100),
mcq_option_2 = sample(c(0,1), replace = TRUE, size = 100),
mcq_option_3 = sample(c(0,1), replace = TRUE, size = 100),
mcq_option_4 = sample(c(0,1), replace = TRUE, size = 100),
mcq_option_5 = sample(c(0,1), replace = TRUE, size = 100)
)
dat %>%
  pivot_longer(cols = everything(),
               names_to = 'options',
               values_to = 'active') %>%
  ggplot(aes(options, active)) +
  geom_col()
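To see that the stacked bars really do end up at the per-option counts, here is a small sketch that builds the plot and inspects the computed layer data with ggplot2's layer_data(). It assumes the columns hold only 0s and 1s; the names p, bars and bar_heights are made up for illustration.

```r
library(tidyverse)

set.seed(123)
dat <- tibble(
  mcq_option_1 = sample(c(0, 1), replace = TRUE, size = 100),
  mcq_option_2 = sample(c(0, 1), replace = TRUE, size = 100),
  mcq_option_3 = sample(c(0, 1), replace = TRUE, size = 100)
)

p <- dat %>%
  pivot_longer(cols = everything(),
               names_to = "options",
               values_to = "active") %>%
  ggplot(aes(options, active)) +
  geom_col()

# One row per stacked segment; ymax is the top of each segment
bars <- layer_data(p)

# The top of the highest segment at each x position is the bar height
bar_heights <- tapply(bars$ymax, as.integer(bars$x), max)

# Compare bar heights with the number of 1s in each column
all.equal(unname(as.vector(bar_heights)), unname(colSums(dat)))
```

The trade-off is that the plot draws one segment per row rather than one bar per option, so for large data it may be lighter to summarise first.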