Assign multiple row values based on condition

I have a data frame that looks like this:

   sl_no      A_1     A_2    A_3     A_4     A_5      A_6
    1          0       0      1       0       1        1
    2          1       0      0       1       0        1
    3          1       1      0       0       0        0

and so on for about 300 rows. I want to keep only the first 1 among the A_ variables in each row and set every later 1 in that row to 0, so the final dataset should look like this:

  sl_no      A_1     A_2    A_3     A_4     A_5      A_6
    1          0       0      1       0       0        0
    2          1       0      0       0       0        0
    3          1       0      0       0       0        0

How would I go about this? Would an if/else statement inside a for loop be the way to do it?

CodePudding user response:

Here's a base R option with a custom function -

keep_only_first_one <- function(x) {
  # Get the position of the first 1 (NA if the row contains no 1)
  inds <- match(1, x)
  # If a 1 was found and it is not in the last position,
  # change all the values after the first 1 to 0.
  if (!is.na(inds) && inds < length(x)) x[(inds + 1):length(x)] <- 0
  x
}
df[-1] <- t(apply(df[-1], 1, keep_only_first_one))
df

#  sl_no A_1 A_2 A_3 A_4 A_5 A_6
#1     1   0   0   1   0   0   0
#2     2   1   0   0   0   0   0
#3     3   1   0   0   0   0   0

This assumes that you want to apply this function to all columns except the first one (hence the -1). If you want to select the columns based on their names instead, you can use -

cols <- grep('^A_', names(df))
df[cols] <- t(apply(df[cols], 1, keep_only_first_one))
df
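
The position lookup can also be skipped entirely. The following is a minimal sketch, not part of the original answer, that masks each row with a cumulative sum. It assumes the A_ columns contain only 0 and 1 with no NA values; the function name keep_first_one_cumsum and the extra all-zero row (sl_no 4) are made up for illustration.

```r
# Data from the question, plus a hypothetical all-zero row (sl_no 4)
df <- data.frame(
  sl_no = 1:4,
  A_1 = c(0, 1, 1, 0),
  A_2 = c(0, 0, 1, 0),
  A_3 = c(1, 0, 0, 0),
  A_4 = c(0, 1, 0, 0),
  A_5 = c(1, 0, 0, 0),
  A_6 = c(1, 1, 0, 0)
)

# Illustrative helper name, not from the answer above
keep_first_one_cumsum <- function(x) {
  # cumsum(x) == 1 is TRUE from the first 1 up to just before the second 1;
  # multiplying by x zeroes the 0 entries inside that stretch, leaving only
  # the first 1. A row with no 1 never reaches a cumulative sum of 1.
  x * (cumsum(x) == 1)
}

cols <- grep("^A_", names(df))
df[cols] <- t(apply(df[cols], 1, keep_first_one_cumsum))
df
```

With this data, row 1 should keep its 1 in A_3, rows 2 and 3 keep A_1, and the all-zero row stays unchanged.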

CodePudding user response:

Another possible solution:

df <- data.frame(
  sl_no = c(1L, 2L, 3L),
  A_1 = c(0L, 1L, 1L),
  A_2 = c(0L, 0L, 1L),
  A_3 = c(1L, 0L, 0L),
  A_4 = c(0L, 1L, 0L),
  A_5 = c(1L, 0L, 0L),
  A_6 = c(1L, 1L, 0L)
)

# \(x) is the anonymous-function shorthand available from R 4.1.0
cbind(df[1], t(apply(df[-1], 1, \(x) {
  y <- which(x == 1); x[seq_along(x) != min(y)] <- 0; x
})))

#>   sl_no A_1 A_2 A_3 A_4 A_5 A_6
#> 1     1   0   0   1   0   0   0
#> 2     2   1   0   0   0   0   0
#> 3     3   1   0   0   0   0   0