I have a list of dfs and a list of annual budgets. Each df represents one business year, and each budget represents a total spend for that year.
# the business year starts from Feb and ends in Jan.
# the budget column is first populated with the % of annual budget allocation
df <- data.frame(monthly_budget=c(0.06, 0.13, 0.07, 0.06, 0.1, 0.06, 0.06, 0.09, 0.06, 0.06, 0.1, 0.15),
month=month.abb[c(2:12, 1)])
# dfs for 3 years
df2019_20 <- df
df2020_21 <- df
df2021_22 <- df
# budgets for 3 years
budget2019_20 <- 6000000
budget2020_21 <- 7000000
budget2021_22 <- 8000000
# into lists
df_list <- list(df2019_20, df2020_21, df2021_22)
budget_list <- list(budget2019_20, budget2020_21, budget2021_22)
I've written the following function to parse the year from each df's name (Jan gets the second year, every other month the first) and to turn the percentages into amounts. It works perfectly if I supply a single df and a single budget.
library(dplyr)
library(stringr)
budget_func <- function(df, budget){
  # name of the object passed in, e.g. "df2019_20"
  df_name <- deparse(substitute(df))
  df <- df %>%
    mutate(year=ifelse(month=="Jan",
                       as.numeric(str_sub(df_name, -2)) + 2000,
                       as.numeric(str_extract(df_name, "\\d{4}(?=_)")))
    )
  # turn each monthly share into an amount
  for (i in 1:12){
    df[i,1] <- df[i,1] * budget
  }
  return(df)
}
To speed things up I want to pass both lists as arguments to mapply. However, I don't get the results I want. What am I doing wrong?
final_budgets <- mapply(budget_func, df_list, budget_list)
Answer:
deparse/substitute works when a single dataset is passed directly, but inside mapply/Map the function receives an element of the list rather than a named object, so there is no object name to recover. Instead, we may add a new argument to pass the names. The list should then carry the names as well: either use list(df2019_20 = df2019_20, ...), or setNames, or, more easily, dplyr::lst, which names each element after the object passed.
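A minimal sketch of both points, using a hypothetical helper `show_arg` and toy values `a` and `b` that are not part of the original code. It shows what deparse/substitute sees when called directly versus through Map, and three ways to build the same named list.

```r
library(dplyr)

# Hypothetical helper: returns the expression the argument was supplied as
show_arg <- function(x) deparse(substitute(x))

a <- 1
b <- 2

# Called directly, the expression is simply the variable name
direct <- show_arg(a)

# Called through Map(), the argument is an internal indexing expression,
# so the original variable names are not available
seen <- unlist(Map(show_arg, list(a, b)))

# Three ways to build a list that carries the object names
l1 <- list(a = a, b = b)
l2 <- setNames(list(a, b), c("a", "b"))
l3 <- dplyr::lst(a, b)

names(l3)
```

With the list named, the function only needs one extra argument to receive the name: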
library(dplyr)
library(stringr)
budget_func <- function(df, budget, nm1){
  # the name is now passed explicitly, e.g. "df2019_20"
  df_name <- nm1
  df <- df %>%
    mutate(year=ifelse(month=="Jan",
                       as.numeric(str_sub(df_name, -2)) + 2000,
                       as.numeric(str_extract(df_name, "\\d{4}(?=_)")))
    )
  # turn each monthly share into an amount
  for (i in 1:12){
    df[i,1] <- df[i,1] * budget
  }
  return(df)
}
-testing
df_list <- dplyr::lst(df2019_20, df2020_21, df2021_22)
budget_list <- list(budget2019_20, budget2020_21, budget2021_22)
Map(budget_func, df_list, budget_list, names(df_list))
-output
$df2019_20
monthly_budget month year
1 360000 Feb 2019
2 780000 Mar 2019
3 420000 Apr 2019
4 360000 May 2019
5 600000 Jun 2019
6 360000 Jul 2019
7 360000 Aug 2019
8 540000 Sep 2019
9 360000 Oct 2019
10 360000 Nov 2019
11 600000 Dec 2019
12 900000 Jan 2020
$df2020_21
monthly_budget month year
1 420000 Feb 2020
2 910000 Mar 2020
3 490000 Apr 2020
4 420000 May 2020
5 700000 Jun 2020
6 420000 Jul 2020
7 420000 Aug 2020
8 630000 Sep 2020
9 420000 Oct 2020
10 420000 Nov 2020
11 700000 Dec 2020
12 1050000 Jan 2021
$df2021_22
monthly_budget month year
1 480000 Feb 2021
2 1040000 Mar 2021
3 560000 Apr 2021
4 480000 May 2021
5 800000 Jun 2021
6 480000 Jul 2021
7 480000 Aug 2021
8 720000 Sep 2021
9 480000 Oct 2021
10 480000 Nov 2021
11 800000 Dec 2021
12 1200000 Jan 2022
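The result is a named list of data frames because Map does not simplify its output. mapply simplifies by default, which is likely part of why the original call looked wrong; passing SIMPLIFY = FALSE keeps the list. The sketch below illustrates this with a hypothetical simplified function, `scale_budget`, which takes the start year as a number instead of parsing it from a name (an assumption made only to keep the example short), and then stacks the results with dplyr::bind_rows.

```r
library(dplyr)

# Same allocation shares as in the question
shares <- data.frame(
  monthly_budget = c(0.06, 0.13, 0.07, 0.06, 0.1, 0.06,
                     0.06, 0.09, 0.06, 0.06, 0.1, 0.15),
  month = month.abb[c(2:12, 1)]
)

# Hypothetical variant: the start year is passed in directly
scale_budget <- function(df, budget, start_year) {
  df$monthly_budget <- df$monthly_budget * budget   # vectorised, no loop
  df$year <- ifelse(df$month == "Jan", start_year + 1, start_year)
  df
}

df_list <- list(df2019_20 = shares, df2020_21 = shares, df2021_22 = shares)
budgets <- c(6000000, 7000000, 8000000)
start_years <- c(2019, 2020, 2021)

# SIMPLIFY = FALSE keeps one data frame per list element, as Map() does
res <- mapply(scale_budget, df_list, budgets, start_years, SIMPLIFY = FALSE)

# Stack everything into one table; .id stores the list names in a column
all_years <- bind_rows(res, .id = "business_year")
```

Whether one long table or a list of per-year data frames is more convenient depends on what happens to the budgets next.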