Add a new column based on change in values in other columns

I have the following dataframe:

DF <- data.frame(Col1=c(0,0,1),Col2=c(0,1,1),Col3=c(1,0,1))

	Col1	Col2	Col3
1	0	0	1
2	0	1	0
3	1	1	1

I need to add a new column "Switch" that contains the name of the first column at which the value in a row differs from the value in the previous column. If the value never changes, the result should be NA, so the output looks like this:

	Col1	Col2	Col3	Switch
1	0	0	1	Col3
2	0	1	0	Col2
3	1	1	1	NA

Any guidance or help will be appreciated. Thank you.

CodePudding user response:

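We may use max.col, which returns the position of the largest value in each row (the first one on ties when 'first' is given). This matches the first change only when every row starts with 0, as in the example; a row such as 1, 0, 0 would be labelled Col1 rather than Col2.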

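# name of the column holding the first maximum in each row
tmp <- names(DF)[max.col(DF, 'first')]
# rows that are all 1 or all 0 never change, so set them to NA
tmp[rowSums(DF == 1) == ncol(DF)|rowSums(DF == 0) == ncol(DF)] <- NA
DF$Switch <- tmp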

-output

> DF
  Col1 Col2 Col3 Switch
1    0    0    1   Col3
2    0    1    0   Col2
3    1    1    1   <NA>
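If rows can also start with 1, the same max.col idea can be applied to a matrix of neighbour comparisons instead of the raw values. This is only a minimal sketch: the fourth row (1, 0, 0) is added here for illustration and is not part of the original data, and the names m, changed and first_change are hypothetical.

```r
# Example data; the 4th row (1, 0, 0) is an illustrative addition
DF <- data.frame(Col1 = c(0, 0, 1, 1),
                 Col2 = c(0, 1, 1, 0),
                 Col3 = c(1, 0, 1, 0))

m <- as.matrix(DF)

# TRUE where a column differs from the column immediately to its left
changed <- m[, -1, drop = FALSE] != m[, -ncol(m), drop = FALSE]

# position of the first TRUE per row; + 1 maps back to the original columns
first_change <- max.col(changed, ties.method = "first") + 1

# rows without any change get NA
first_change[rowSums(changed) == 0] <- NA

DF$Switch <- names(DF)[first_change]
DF
```

Because the comparison is done on the whole matrix at once, no row-wise loop is needed.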

CodePudding user response:

You may write a function with diff and apply it rowwise.

switch_col <- function(x) {
  cols[which(diff(x) != 0)[1] + 1]
}
cols <- names(DF)
DF$switch_col <- apply(DF, 1, switch_col)
DF

#  Col1 Col2 Col3 switch_col
#1    0    0    1       Col3
#2    0    1    0       Col2
#3    1    1    1       <NA>
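To see why the index is shifted by one, it may help to look at a single row. The sketch below is a variant of the function above, under the assumption that each row arrives as a named vector (which is what apply passes for a data frame with column names), so no global cols vector is needed. The name first_switch is hypothetical.

```r
first_switch <- function(x) {
  # diff() returns x[i + 1] - x[i]; a non-zero entry marks a change
  idx <- which(diff(x) != 0)[1]
  # entry idx of the diff sits between columns idx and idx + 1,
  # so the changed value lives in column idx + 1;
  # with no change, idx is NA and the lookup returns NA
  names(x)[idx + 1]
}

# a single row as a named vector
row2 <- c(Col1 = 0, Col2 = 1, Col3 = 0)
first_switch(row2)

# applied to every row of the example data
DF <- data.frame(Col1 = c(0, 0, 1), Col2 = c(0, 1, 1), Col3 = c(1, 0, 1))
res <- apply(DF, 1, first_switch)
res
```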

You may also use dplyr -

library(dplyr)

DF %>%
  rowwise() %>%
  # select the value columns explicitly so a character column is never mixed in
  mutate(switch_col = switch_col(c_across(Col1:Col3))) %>%
  ungroup()