I currently have data which looks like the following:
ID | Var_1 | Var_2 | Var_3 | Var_4 | Var_5 | RemovePrior |
---|---|---|---|---|---|---|
1 | 20 | 30 | 25 | 35 | 40 | 3 |
2 | 40 | 50 | 45 | 55 | 60 | 2 |
3 | 60 | 70 | 65 | 75 | 80 | 4 |
4 | 80 | 90 | 85 | 95 | 85 | 5 |
df <- data.frame(ID = c(1, 2, 3, 4),
Var_1 = c(20, 40, 60, 80),
Var_2 = c(30, 50, 70, 90),
Var_3 = c(25, 45, 65, 85),
Var_4 = c(35, 55, 75, 95),
Var_5 = c(40, 60, 80, 85),
RemovePrior = c(3, 2, 4, 5))
My goal is to—for each row—fill in the variables to the left of the variable with the suffix indicated by "RemovePrior" with NA. For example, for ID = 1, RemovePrior takes on the value 3, hence I wish to have Var_1 and Var_2 set to NA. The following is the intended result:
ID | Var_1 | Var_2 | Var_3 | Var_4 | Var_5 | RemovePrior |
---|---|---|---|---|---|---|
1 | NA | NA | 25 | 35 | 40 | 3 |
2 | NA | 50 | 45 | 55 | 60 | 2 |
3 | NA | NA | NA | 75 | 80 | 4 |
4 | NA | NA | NA | NA | 85 | 5 |
CodePudding user response:
We can use base R
nm1 <- startsWith(names(df), "Var")
df[nm1] <- NA^(col(df[nm1]) < df$RemovePrior) * df[nm1]
-output
> df
ID Var_1 Var_2 Var_3 Var_4 Var_5 RemovePrior
1 1 NA NA 25 35 40 3
2 2 NA 50 45 55 60 2
3 3 NA NA NA 75 80 4
4 4 NA NA NA NA 85 5
Or with dplyr
library(dplyr)
df %>%
mutate(across(starts_with("Var_"),
~ case_when(readr::parse_number(cur_column()) >= RemovePrior ~ .x )))
ID Var_1 Var_2 Var_3 Var_4 Var_5 RemovePrior
1 1 NA NA 25 35 40 3
2 2 NA 50 45 55 60 2
3 3 NA NA NA 75 80 4
4 4 NA NA NA NA 85 5
CodePudding user response:
A possible solution:
library(tidyverse)
df %>%
mutate(across(Var_1:Var_5,
~ if_else(which(cur_column() == names(df[-1])) < RemovePrior, NA_real_, .x)))
#> ID Var_1 Var_2 Var_3 Var_4 Var_5 RemovePrior
#> 1 1 NA NA 25 35 40 3
#> 2 2 NA 50 45 55 60 2
#> 3 3 NA NA NA 75 80 4
#> 4 4 NA NA NA NA 85 5
CodePudding user response:
Here is another way how we could do it:
Using pivot_longer
and pivot_wider
library(dplyr)
library(tidyr)
df %>%
pivot_longer(
-c(ID, RemovePrior)
) %>%
group_by(ID) %>%
mutate(value = ifelse(value == lag(value, unique(RemovePrior)-1), NA, value)) %>%
pivot_wider(
names_from = name,
values_from = value
)
output:
ID RemovePrior Var_1 Var_2 Var_3 Var_4 Var_5
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 3 NA NA 25 35 40
2 2 2 NA 50 45 55 60
3 3 4 NA NA NA 75 80
4 4 5 NA NA NA NA 85