Adding a new column next to each existing column that matches a certain column name pattern in R / t-CodePudding

In a dataframe I want to add a new column next each column whose name matches a certain pattern, for example whose name starts with "ip_" and is followed by a number. The name of the new columns should follow the pattern "newCol_" suffixed by that number again. The values of the new columns should be NA's.

So this dataframe:

should be transformed to that dataframe:

A tidiverse solution with use of regex is much appreciated!

Sample data:

df <- data.frame(
  ID = c("1", "2"),
  ip_1 = c(2,3),
  ip_9 = c(5,7),
  ip_39 = c(11,13),
  in_1 = c("B", "D"),
  in_2 = c("A", "H"),
  in_3 = c("D", "A")
)

CodePudding user response：

To get the columns is easy with across -

library(dplyr)

df %>%
  mutate(across(starts_with('ip'), ~NA, .names = '{sub("ip", "newCol", .col)}'))

#  ID ip_1 ip_9 ip_39 in_1 in_2 in_3 newCol_1 newCol_9 newCol_39
#1  1    2    5    11    B    A    D       NA       NA        NA
#2  2    3    7    13    D    H    A       NA       NA        NA

To get the columns in required order -

library(dplyr)

df %>%
  mutate(across(starts_with('ip'), ~NA, .names = '{sub("ip", "newCol", .col)}')) %>%
  select(ID, starts_with('in'), 
        order(suppressWarnings(readr::parse_number(names(.))))) %>%
  select(ID, ip_1:newCol_39, everything())

#  ID ip_1 newCol_1 ip_9 newCol_9 ip_39 newCol_39 in_1 in_2 in_3
#1  1    2       NA    5       NA    11        NA    B    A    D
#2  2    3       NA    7       NA    13        NA    D    H    A

CodePudding user response：

To add the new NA columns :

df[, sub("^ip", "newCol", grep("^ip", names(df), value = TRUE))] <- NA

To reorder them :

df <- df[, order(c(grep("newCol", names(df), invert = TRUE), grep("^ip", names(df))))]