I'm attempting to create a composite plot in R. The code is below:
#Adding initial data
ggp <- ggplot(NULL, aes(x = date, y = covid)) +
  geom_spline(data = onsdf,
              aes(x = date, y = covid, colour = "ONS Modelled Estimates"), nknots = 90, size = 1.3) +
  geom_spline(data = gvtdf,
              aes(x = date, y = covid, colour = "Gvt Reported Positive Tests"), nknots = 90, size = 1.3)
#Creating function to add stringency bars
barfunction <- function(date1, date2, alpha){
  a <- annotate(geom = "rect",
                xmin = as.Date(date1), xmax = as.Date(date2), ymin = 0, ymax = Inf, alpha = alpha, fill = "red")
  return(a)
}
#Adding lockdown stringency bars
ggp <- ggp +
  barfunction("2020-05-03", "2020-06-01", 0.5) +
  barfunction("2020-06-01", "2020-06-15", 0.4) +
  barfunction("2020-06-15", "2020-09-14", 0.3) +
  barfunction("2020-09-14", "2020-11-05", 0.3) +
  barfunction("2020-11-05", "2020-12-02", 0.5) +
  barfunction("2020-12-02", "2021-01-06", 0.4) +
  barfunction("2021-01-06", "2021-03-29", 0.5) +
  barfunction("2021-03-29", "2021-04-12", 0.4) +
  barfunction("2021-04-12", "2021-05-17", 0.3) +
  barfunction("2021-05-17", "2021-07-19", 0.2) +
  barfunction("2021-07-19", "2021-12-08", 0.1) +
  barfunction("2021-12-08", "2022-02-24", 0.2)
#Adding plot labels
ggp <- ggp +
  labs(title = "Estimated Total Covid-19 Cases vs Reported Positive Cases",
       subtitle = "From ONS and HMGvt datasets",
       x = "Date (year - month)", y = "Covid Levels") +
  scale_y_continuous(labels = scales::comma) +
  scale_x_date(limits = as.Date(c("2020-05-03", NA))) +
  scale_colour_manual(name = "Measurement Method",
                      values = c("ONS Modelled Estimates" = "purple",
                                 "Gvt Reported Positive Tests" = "blue"))
The output of this code looks like this:
As you can see, this code calls barfunction() very repetitively, which I would like to change. I thought the best way to do this was to put the values passed to barfunction() into a data frame and then apply a function over that data frame. Here is the head of the data frame (called strindf):
date1 date2 alpha
2020-05-03 2020-06-01 0.5
2020-06-01 2020-06-15 0.4
2020-06-15 2020-09-14 0.3
2020-09-14 2020-11-05 0.3
I initially tried to use apply() to add the strindf data to my plot, but I got an error message: Error in as.Date(date2) : argument "date2" is missing, with no default. Here is how I implemented it in the original code:
ggptest <- ggplot(NULL, aes(x = date, y = covid)) +
  geom_spline(data = onsdf,
              aes(x = date, y = covid, colour = "ONS Modelled Estimates"), nknots = 90, size = 1.3) +
  geom_spline(data = gvtdf,
              aes(x = date, y = covid, colour = "Gvt Reported Positive Tests"), nknots = 90, size = 1.3) +
  apply(strindf, MARGIN = 1, barfunction) +
  theme_minimal() +
  scale_y_continuous(labels = scales::comma) +
  scale_x_date(limits = as.Date(c("2020-05-03", NA))) +
  scale_colour_manual(name = "Legend",
I'm quite new to R, so I'm a bit stumped. Does anyone have any suggestions?
Thanks in advance!
CodePudding user response:
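Your idea was right, but you chose the wrong function from the apply
family. apply() with MARGIN = 1 passes each row to barfunction as a single
vector, so the whole row ends up in date1 and date2 is missing, hence the
error. As your function takes multiple arguments, use mapply or, as I do
below, purrr::pmap, which passes each column of strindf as a separate argument.
Using some fake random example data: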
library(ggplot2)
library(ggformula)

# Returns one shaded rectangle layer per lockdown period
barfunction <- function(date1, date2, alpha) {
  annotate(geom = "rect", xmin = as.Date(date1), xmax = as.Date(date2), ymin = 0, ymax = Inf, alpha = alpha, fill = "red")
}

ggplot(NULL, aes(x = date, y = covid)) +
  geom_spline(data = df, aes(colour = "ONS Modelled Estimates"), nknots = 90, size = 1.3) +
  # pmap() calls barfunction once per row of strindf and returns a list of layers
  purrr::pmap(strindf, barfunction) +
  theme_minimal() +
  scale_y_continuous(labels = scales::comma) +
  scale_x_date(limits = as.Date(c("2020-05-03", NA))) +
  scale_colour_manual(
    name = "Measurement Method",
    values = c(
      "ONS Modelled Estimates" = "purple",
      "Gvt Reported Positive Tests" = "blue"
    )
  )
#> Warning: Removed 123 rows containing non-finite values (stat_spline).
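To see why apply() raised the error while a multi-argument mapper works, here is a minimal base R sketch. It swaps barfunction for a hypothetical stand-in, describe_bar(), which takes the same arguments but returns a plain list instead of a ggplot2 layer, so it runs without any packages. Map() (a wrapper around mapply with SIMPLIFY = FALSE) is used here as a base R counterpart to purrr::pmap, and the two-row strindf is made up for illustration.

```r
# Hypothetical stand-in for barfunction(): same arguments, but it returns a
# plain list so this example needs no plotting packages.
describe_bar <- function(date1, date2, alpha) {
  list(xmin = as.Date(date1), xmax = as.Date(date2), alpha = alpha)
}

# Made-up two-row version of strindf
strindf <- data.frame(
  date1 = c("2020-05-03", "2020-06-01"),
  date2 = c("2020-06-01", "2020-06-15"),
  alpha = c(0.5, 0.4)
)

# apply() hands each row over as ONE vector, so everything lands in date1
# and date2 is never supplied. The error is caught and kept as text.
apply_result <- tryCatch(
  apply(strindf, MARGIN = 1, describe_bar),
  error = function(e) conditionMessage(e)
)
print(apply_result)

# Map() walks the three columns in parallel, one argument per column
bars <- Map(describe_bar, strindf$date1, strindf$date2, strindf$alpha)
str(bars[[1]])
```

With the real barfunction, the list returned by Map() could be added to the plot with + in the same way as the pmap() result, since ggplot2 accepts a list of layers.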
DATA
set.seed(123)
df <- data.frame(
date = seq.Date(as.Date("2020-01-01"), as.Date("2020-12-31"), by = "day"),
covid = runif(366)
)
strindf <- structure(list(date1 = c(
"2020-05-03", "2020-06-01", "2020-06-15",
"2020-09-14"
), date2 = c(
"2020-06-01", "2020-06-15", "2020-09-14",
"2020-11-05"
), alpha = c(0.5, 0.4, 0.3, 0.3)), class = "data.frame", row.names = c(
NA,
-4L
))
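As a possible alternative to looping at all, the sketch below swaps annotate() for a single geom_rect() layer that reads strindf directly. This is not what the code above does, only another way to get the same bars: it assumes the date columns are converted to Date up front, and it uses scale_alpha_identity() so the alpha column is taken as the literal transparency. The plot object name p is just for illustration.

```r
library(ggplot2)

# Same stringency periods as in DATA, with the dates already converted
strindf <- data.frame(
  date1 = as.Date(c("2020-05-03", "2020-06-01", "2020-06-15", "2020-09-14")),
  date2 = as.Date(c("2020-06-01", "2020-06-15", "2020-09-14", "2020-11-05")),
  alpha = c(0.5, 0.4, 0.3, 0.3)
)

p <- ggplot() +
  geom_rect(
    data = strindf,
    # one rectangle per row; ymax = Inf stretches each bar to the top of the panel
    aes(xmin = date1, xmax = date2, ymin = 0, ymax = Inf, alpha = alpha),
    fill = "red",
    inherit.aes = FALSE # ignore any plot-level x/y mapping such as date/covid
  ) +
  scale_alpha_identity() # use the alpha column values as-is
```

In the full plot, this layer would replace the purrr::pmap(strindf, barfunction) line; inherit.aes = FALSE matters there because strindf has no date or covid columns.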