I'd like to replace with NA certain columns depending on the start_value and the end_value. For example:
- the first row will stay like this because start_value and end_value are missing;
- the second row will be replaced with missing, starting from the first column (value.1) to the third (value.3);
- the third row will be replaced with missing, starting from the second column (value.2) to the third (value.3), etc.
Example dataframe:
df <- data.frame(id=c(1:4),value.1=c(0,1,1,0), value.2=c(rep(0,3),1), value.3=c(1,1,1,0), start_value=c(NA,1,2,NA),end_value=c(NA,3,3,NA))
id value.1 value.2 value.3 start_value end_value
1 0 0 1 NA NA
2 1 0 1 1 3
3 1 0 1 2 3
4 0 1 0 NA NA
I'd like to obtain a final df like this:
id value.1 value.2 value.3
1 0 0 1
2 NA NA NA
3 1 NA NA
4 0 1 0
CodePudding user response:
Here's a way using apply
:
as.data.frame(
t(apply(df, 1, function(x) {
start <- x[5]
end <- x[6]
if (anyNA(c(start, end))){
x[1:4]
} else {
x[2:4][start:end] <- NA
x[1:4]
}
})))
id value.1 value.2 value.3
1 1 0 0 1
2 2 NA NA NA
3 3 1 NA NA
4 4 0 1 0
CodePudding user response:
A base R
approach with row/column
indexing
i1 <- complete.cases(df[5:6])
lst1 <- do.call(Map, c(f = `:`, unname(df[i1,5:6])))
df[i1, 2:4][cbind(rep(seq_along(lst1), lengths(lst1)), unlist(lst1))] <- NA
-output
> df[1:4]
id value.1 value.2 value.3
1 1 0 0 1
2 2 NA NA NA
3 3 1 NA NA
4 4 0 1 0
Or using tidyverse
library(dplyr)
library(stringr)
nm1 <- str_subset(names(df), "value\\.")
df %>%
transmute(id, across(starts_with('value'),
~ {
ind <- match(cur_column(), nm1)
replace(.x, ind >= start_value & ind <=end_value, NA)
}))
-output
id value.1 value.2 value.3
1 1 0 0 1
2 2 NA NA NA
3 3 1 NA NA
4 4 0 1 0