I am finding it hard to reuse code in functions. I tried subsetting with .data[[ ]] but received a splicing error. Below I provide an example, along with a workaround that uses an "if" statement inside a tidy function. Is it possible to use variable masking in tidyverse programming?
Data frame:
set.seed(123)
(df=data.frame(
Yrs_Before=sample(1:8, 3),
Yrs_After=sample(1:8, 3),
Before.Yr_1=sample(1:8, 3),
Before.Yr_2=sample(1:8, 3),
Before.Yr_3=sample(1:8, 3),
Before.Yr_4=sample(1:8, 3),
Before.Yr_5=sample(1:8, 3),
Before.Yr_6=sample(1:8, 3),
Before.Yr_7=sample(1:8, 3),
Before.Yr_8=sample(1:8, 3),
After.Yr_1=sample(1:8, 3),
After.Yr_2=sample(1:8, 3),
After.Yr_3=sample(1:8, 3),
After.Yr_4=sample(1:8, 3),
After.Yr_5=sample(1:8, 3),
After.Yr_6=sample(1:8, 3),
After.Yr_7=sample(1:8, 3),
After.Yr_8=sample(1:8, 3)
))
Is it possible to use variable masking in the following function?
sums=function(data,crashes,yrs){
data %>%
dplyr::rowwise() %>%
dplyr::transmute(sum = cumsum(c_across(matches(.data[[crashes]])))[.data[[yrs]]])
}
However, an error was received:
sums(df,"After.Yr")
Error in splice(dot_call(capture_dots, frame_env = frame_env, named = named, :
argument "yrs" is missing, with no default
Called from: splice(dot_call(capture_dots, frame_env = frame_env, named = named,
ignore_empty = ignore_empty, unquote_names = unquote_names,
homonyms = homonyms, check_assign = check_assign))
The same happens for counts occurring during the respective "before year" periods (e.g. "Before.Yr"):
sums(df,"Before.Yr")
Error in splice(dot_call(capture_dots, frame_env = frame_env, named = named, :
argument "yrs" is missing, with no default
Called from: splice(dot_call(capture_dots, frame_env = frame_env, named = named,
ignore_empty = ignore_empty, unquote_names = unquote_names,
homonyms = homonyms, check_assign = check_assign))
The following version uses an "if" statement and provides the desired results, shown below for the "before" (Before.Yr) and "after" (After.Yr) periods:
sums = function(data, counts) {
  data %>% dplyr::rowwise() %>%
    dplyr::transmute(sums = if (counts == "Before.Yr") {
      cumsum(c_across(matches('Before.Yr')))[Yrs_Before]
    } else {
      cumsum(c_across(matches('After.Yr')))[Yrs_After]
    })
}
Using crashes in the after period:
sums(df,"After.Yr")
# A tibble: 3 × 1
# Rowwise:
sums
<int>
1 21
2 20
3 6
Using crashes in the before period:
sums(df,"Before.Yr")
# A tibble: 3 × 1
# Rowwise:
sums
<int>
1 23
2 33
3 11
Answer:
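Instead of matches(.data[[crashes]]), simply use matches(crashes): matches() already takes a regular expression as a string, so the .data pronoun is not needed there. The error itself arises because sums() was called without its third argument, so you also have to pass a column name for yrs: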
library(dplyr)
sums <- function(data, crashes, yrs) {
  data %>%
    dplyr::rowwise() %>%
    dplyr::transmute(sum = cumsum(c_across(matches(crashes)))[.data[[yrs]]])
}
sums(df, "After.Yr", "Yrs_After")
#> # A tibble: 3 × 1
#> # Rowwise:
#> sum
#> <int>
#> 1 21
#> 2 20
#> 3 6
sums(df, "Before.Yr", "Yrs_Before")
#> # A tibble: 3 × 1
#> # Rowwise:
#> sum
#> <int>
#> 1 23
#> 2 33
#> 3 11
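If you would rather pass the years column unquoted, embracing with {{ }} is another option. The following is only a sketch: sums_embraced and the toy data frame are made-up names and values for illustration, chosen small enough that the result can be checked by hand.

```r
library(dplyr)

# Hypothetical variant of sums(): `crashes` is still a regex string for
# matches(), but `yrs` is passed as a bare column name and embraced.
sums_embraced <- function(data, crashes, yrs) {
  data %>%
    rowwise() %>%
    transmute(sum = cumsum(c_across(matches(crashes)))[{{ yrs }}]) %>%
    ungroup()  # drop the rowwise grouping from the result
}

# Made-up data, not the df from the question.
toy <- data.frame(
  Yrs_After  = c(1, 3),
  After.Yr_1 = c(2, 1),
  After.Yr_2 = c(5, 1),
  After.Yr_3 = c(4, 1)
)

res <- sums_embraced(toy, "After.Yr", Yrs_After)
# Row 1: cumsum(2, 5, 4)[1] = 2; row 2: cumsum(1, 1, 1)[3] = 3
print(res$sum)
```

The string version with .data[[yrs]] is more convenient when the column name arrives as a character value (for example from a loop), whereas the embraced version reads more like ordinary dplyr code at the call site.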