I have a dataset structured as follows, in which I track collective action mentions by subreddit and month, relative to a policy treatment introduced on Feb 17th, 2012. As a result, the period "Feb 2012" appears twice in my dataset: the "pre" row covers the Feb 2012 days before treatment, and the "post" row covers the days after.
treatment_status month_year collective_action_percentage
pre Dec 2011 5%
pre Jan 2012 8%
pre Feb 2012 10%
post Feb 2012 3%
post March 2012 10%
I am not sure how best to visualize this indicator by month. I made the following graph, but I was wondering whether presenting this variable on a week-and-year basis, rather than a month-and-year basis, would be clearer, given that I am interested in showing how collective action mentions decline after treatment.
ggplot(data = df1, aes(x = as.Date(month_year), fill = collective_action_percentage, y = collective_action_percentage)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  scale_x_date(date_breaks = "1 month", date_labels = "%b %Y") +
  scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
  xlab("Criticism by individuals active before and after treatment") +
  theme_classic() +
  theme(plot.title = element_text(size = 10, face = "bold"),
        axis.text.x = element_text(angle = 90, vjust = 0.5))
I created the month_year variable as follows, using the zoo package:
df <- df %>%
  mutate(month_year = zoo::as.yearmon(date))
Finally, I tried aggregating the data on a weekly basis as follows. However, since my dataset spans multiple years, I would ideally like to aggregate by week and year, not simply by week:
df2 %>% group_by(week = isoweek(time)) %>% summarise(value = mean(values))
CodePudding user response:
Plot a point for each row and connect them with a line so that it is clear what the order is. We also color the pre and post points differently and make treatment status a factor so that we can order the pre level before the post level.
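library(ggplot2)
library(zoo)
# Parse month_year as yearmon and order the factor levels so that pre comes before post
df2 <- transform(df1, month_year = as.yearmon(month_year, "%b %Y"),
  treatment_status = factor(treatment_status, c("pre", "post")))
# One point per row, colored by treatment status, joined by a line to show the order
ggplot(df2, aes(month_year, collective_action_percentage)) +
  geom_point(aes(col = treatment_status), cex = 4) +
  geom_line()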
Note
We assume df1 is as follows. The % signs have already been removed.
df1 <-
structure(list(treatment_status = c("pre", "pre", "pre", "post",
"post"), month_year = c("Dec 2011", "Jan 2012", "Feb 2012", "Feb 2012",
"March 2012"), collective_action_percentage = c(5L, 8L, 10L,
3L, 10L)), class = "data.frame", row.names = c(NA, -5L))
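On the week-and-year aggregation asked about in the question, one possible approach is to group by the ISO year together with the ISO week, so that the same week number in different years is kept apart. The following is only a sketch under assumptions: the data frame `daily` and its contents are made up for illustration, and the column names `time` and `values` are taken from the question's snippet. It uses `isoyear()` and `isoweek()` from lubridate with dplyr.

```r
library(dplyr)
library(lubridate)

# Hypothetical daily data spanning a year boundary (illustration only).
# 2011-12-26 is a Monday, so the 14 days cover exactly two ISO weeks.
daily <- data.frame(
  time   = as.Date("2011-12-26") + 0:13,
  values = c(rep(2, 7), rep(4, 7))
)

# Group by ISO year and ISO week, then average within each group.
weekly <- daily %>%
  group_by(year = isoyear(time), week = isoweek(time)) %>%
  summarise(value = mean(values), .groups = "drop")

print(weekly)
```

Using `isoyear()` rather than the calendar year matters near the year boundary: 2012-01-01 falls in ISO week 52 of ISO year 2011, so pairing `isoweek()` with the calendar year would place that day in a separate "2012, week 52" group.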