Dplyr mutate minimum column name-CodePudding

I have a df with many columns but for this issue am focusing on four date columns A, B, C, D. I am using the following to create a column for the minimum date for that row.

mutate(MIN_DATE = pmin(A,B,C,D))

To create a column for the minimum date for that row.

  A         B       C       D       MIN_DATE
1-1-20   1-2-20  1-10-20  2-10-20    1-1-20
1-2-20   1-4-20  1-11-20  2-1-20     1-2-20
1-3-20   1-5-20  1-20-20  2-3-20     1-3-20

What I would like to do is add another column which is the column name of the minimum date which would look like :

  A         B       C       D       MIN_DATE   MIN_COL
1-1-20   1-2-20  1-10-20  2-10-20    1-1-20       A
1-2-20   1-4-20  1-11-20  2-1-20     1-2-20       A
1-3-20   1-5-20  1-2-20   2-3-20     1-3-20       C

What is the easiest way to accomplish this, some combination of names and which.min()?

CodePudding user response：

We can use which.min to get the index of minimum date and use that to subset the names after doing rowwise

library(dplyr)
library(lubridate)
df %>%
   # in case the dates are not Date class
   mutate(across(c(A, B, C, D), mdy)) %>%
   rowwise %>%
   mutate(MIN_COL = names(.)[1:4][which.min(c_across(c(A, B, C, D)))]) %>%
   ungroup

-output

# A tibble: 3 × 5
  A          B          C          D          MIN_COL
  <date>     <date>     <date>     <date>     <chr>  
1 2020-01-01 2020-01-02 2020-01-10 2020-02-10 A      
2 2020-01-02 2020-01-04 2020-01-11 2020-02-01 A      
3 2020-01-03 2020-01-05 2020-01-20 2020-02-03 A

Or probably a faster option with max.col from base R

df$MIN_COL <- c("A", "B", "C", "D")[max.col(-sapply(
    df[c("A", "B", "C", "D")], \(x) as.Date(x, "%m-%d-%y")), "first")]