I have a data frame that looks like this
sl_no A_1 A_2 A_3 A_4 A_5 A_6
1 0 0 1 0 1 1
2 1 0 0 1 0 1
3 1 1 0 0 0 0
and so on for about 300 rows. In each row, I want to keep only the first 1 among the A_ columns and set every later 1 to 0. So the final dataset should look like this
sl_no A_1 A_2 A_3 A_4 A_5 A_6
1 0 0 1 0 0 0
2 1 0 0 0 0 0
3 1 0 0 0 0 0
How would I go about this? If else statement in a for loop?
CodePudding user response:
Here's a base R option with a custom function -
keep_only_first_one <- function(x) {
#Get the position of the first 1 (NA if the row has no 1).
inds <- match(1, x)
#If there is a 1 and it is not in the last position,
#change all the values after it to 0.
if(!is.na(inds) && inds < length(x)) x[(inds + 1):length(x)] <- 0
x
}
df[-1] <- t(apply(df[-1], 1, keep_only_first_one))
df
# sl_no A_1 A_2 A_3 A_4 A_5 A_6
#1 1 0 0 1 0 0 0
#2 2 1 0 0 0 0 0
#3 3 1 0 0 0 0 0
This assumes that you want to apply the function to every column except the first one (hence the -1). If you want to select the columns by name instead, you can use -
cols <- grep('^A_', names(df))
df[cols] <- t(apply(df[cols], 1, keep_only_first_one))
df
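If rows without any 1 are possible, or you would rather avoid a custom function, a running count of 1s per row is another way to express the same rule: a cell survives only if it is a 1 and the count of 1s up to and including it is exactly 1. This is only a sketch and not part of the answer above; it rebuilds the sample data from the question, and the names m and running are made up for illustration.

```r
# Sample data copied from the question.
df <- data.frame(
  sl_no = 1:3,
  A_1 = c(0L, 1L, 1L), A_2 = c(0L, 0L, 1L), A_3 = c(1L, 0L, 0L),
  A_4 = c(0L, 1L, 0L), A_5 = c(1L, 0L, 0L), A_6 = c(1L, 1L, 0L)
)

cols <- grep("^A_", names(df))
m <- as.matrix(df[cols])

# Running count of 1s along each row. apply() returns the result
# transposed, so t() puts rows back in place.
running <- t(apply(m, 1, cumsum))

# Keep a cell only if it is a 1 and it is the first 1 in its row.
# Multiplying by 1L turns the logical matrix back into 0/1 integers.
df[cols] <- (m == 1 & running == 1) * 1L
df
```

On the sample data this should give the same three rows as the output shown above. A row made up entirely of 0s stays all 0s, because the running count never reaches 1.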
CodePudding user response:
Another possible solution:
df <- data.frame(
sl_no = c(1L, 2L, 3L),
A_1 = c(0L, 1L, 1L),
A_2 = c(0L, 0L, 1L),
A_3 = c(1L, 0L, 0L),
A_4 = c(0L, 1L, 0L),
A_5 = c(1L, 0L, 0L),
A_6 = c(1L, 1L, 0L)
)
cbind(df[1], t(apply(df[-1], 1,
\(x) {y = which(x == 1); x[1:length(x) != min(y)] <- 0; x})))
#> sl_no A_1 A_2 A_3 A_4 A_5 A_6
#> 1 1 0 0 1 0 0 0
#> 2 2 1 0 0 0 0 0
#> 3 3 1 0 0 0 0 0
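For comparison, the if/else-in-a-for-loop idea from the question can be written out directly. This is a rough sketch, not taken from either answer: df_loop, a_cols and seen are illustrative names, and for about 300 rows the loop should be fast enough, although the apply() versions are more compact.

```r
# Sample data copied from the question, under a separate name.
df_loop <- data.frame(
  sl_no = 1:3,
  A_1 = c(0L, 1L, 1L), A_2 = c(0L, 0L, 1L), A_3 = c(1L, 0L, 0L),
  A_4 = c(0L, 1L, 0L), A_5 = c(1L, 0L, 0L), A_6 = c(1L, 1L, 0L)
)

a_cols <- grep("^A_", names(df_loop))

for (i in seq_len(nrow(df_loop))) {
  seen <- FALSE                  # has a 1 already been found in this row?
  for (j in a_cols) {
    if (df_loop[i, j] == 1) {
      if (seen) {
        df_loop[i, j] <- 0L      # a later 1: reset it to 0
      } else {
        seen <- TRUE             # the first 1: keep it
      }
    }
  }
}
df_loop
```

Rows that contain no 1 are left untouched, since seen never becomes TRUE for them.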