Mutate multiple dataframe columns where cell content depends on column name


I'm trying to replace binary information in dataframe columns with strings that refer to the columns' names.

My data looks like this (just with more natXY columns and some additional variables):

    df <- data.frame(id = c(1:5), natAB = c(1,0,0,0,1), natCD = c(0,1,0,0,0), natother = c(0,0,1,1,0), var1 = runif(5, 1, 10))

All column names in question start with "nat", mostly followed by two letters although some contain a different number of characters.

For a single column, the following code achieves the desired outcome:

    df %>% mutate(natAB = ifelse(natAB == 1, "AB", NA)) -> df

Now I need to generalise this line in order to apply it to the other columns using the mutate() and across() functions.

I imagine something like this

    df %>% mutate(across(natAB:natother, ~ ifelse(
                  . == 1, paste(substr(colnames(.), start = 4, stop = nchar(colnames(.)))), NA))) -> df

... but end up with all my "nat" columns filled with NA. How do I reference the column name correctly in this code structure?

Any help is much appreciated.

CodePudding user response:

You can use cur_column() from dplyr to get the name of the current column inside an across() call, and then strip the "nat" prefix with str_remove() from stringr:

    df %>% 
      mutate(across(natAB:natother,
                    ~ ifelse(.x == 1, str_remove(cur_column(), "nat"), NA)))

#   id natAB natCD natother     var1
# 1  1    AB  <NA>     <NA> 7.646891
# 2  2  <NA>    CD     <NA> 4.704543
# 3  3  <NA>  <NA>    other 7.717925
# 4  4  <NA>  <NA>    other 3.367320
# 5  5    AB  <NA>     <NA> 8.455011
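
As for why the attempt in the question returns only NA: inside across(), the lambda receives each column as a bare vector, so colnames(.) has nothing to return. The sketch below illustrates this and re-runs the approach on reproducible data. It makes a few assumptions that are not in the original post: var1 is hard-coded instead of drawn with runif(), the columns are selected with starts_with("nat") rather than the natAB:natother range, and the pattern is anchored as "^nat" so that only the prefix is removed.

```r
library(dplyr)
library(stringr)

# Example data from the question; var1 values are made up here so the
# result does not depend on random numbers
df <- data.frame(id = 1:5,
                 natAB = c(1, 0, 0, 0, 1),
                 natCD = c(0, 1, 0, 0, 0),
                 natother = c(0, 0, 1, 1, 0),
                 var1 = c(2.5, 4, 6.1, 3.3, 9))

# A single column pulled out of a data frame is a plain vector,
# so it carries no column names
colnames(df$natAB)   # NULL

# cur_column() supplies the name of the column currently being processed
res <- df %>%
  mutate(across(starts_with("nat"),
                ~ ifelse(.x == 1, str_remove(cur_column(), "^nat"), NA)))

res$natAB      # "AB" NA NA NA "AB"
res$natother   # NA NA "other" "other" NA
```

Selecting with starts_with("nat") should pick up every "nat" column regardless of how many characters follow the prefix, which matches the description in the question; columns such as id and var1 are left untouched.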