Replace multiple strings in column names in R

Time:10-12

I am trying to replace strings in column names based on several different patterns. One pattern should be replaced with one new string, another pattern with a different string, and so on for a number of patterns.

As an example:

I have a data frame that includes columns that start with similar strings (e.g. "Q1", "Q2", "Q3", "Q4", etc.) and end with another set of similar strings (e.g. "multiple_choice", "true_false", "fill_in", etc.).

So, I have columns "Q1_multiple_choice", "Q1_true_false", "Q1_fill_in", "Q2_multiple_choice", "Q2_true_false", "Q2_fill_in", etc. The columns are not in a specific order, so I can't just list out the new column names.

I would like to replace "Q1" with "Math", "Q2" with "Science", "Q3" with "History", etc. So it would look like - "Math_multiple_choice", "Math_true_false", "Math_fill_in", "Science_multiple_choice", "Science_true_false", "Science_fill_in", etc.

Some code to create those random column names -


test_col_names <-
  paste0("Q",
         sample.int(5, 20, replace = TRUE),
         "_",
         sample(c(
           "multiple_choice", "true_false", "fill_in"
         ), 20, replace = TRUE))
df = data.frame(matrix(nrow = 0, ncol = length(test_col_names)))
colnames(df) = test_col_names
df

I tried -

df  %>% set_names( ~ (.) %>%
                      str_replace_all(c("Q1",
                                        "Q2" ,
                                        "Q3")
                                      , c("Math",
                                          "Science",
                                          "History")))

But it did not replace all of them, just a few.

I was also wondering if there was a way to do this with a list. For example, I would create a list -

new_names <- list(Q1 = "Math",
                  Q2 = "Science",
                  Q3 = "History")

Or even with a data frame because I have Q1-Q50!

Answer:

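Use a named vector rather than a list. str_replace_all() treats the names as patterns and the values as their replacements, and applies every pair to every column name:

library(tidyverse)  # provides %>%, set_names() and str_replace_all()

new_names <- c(Q1 = "Math",
               Q2 = "Science",
               Q3 = "History")
df %>%
  set_names(~ str_replace_all(.x, new_names))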
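
For the Q1-Q50 case, the named vector can be built from a data frame instead of typed out by hand. One caveat: the bare pattern "Q1" also matches the start of "Q10", so with that many questions the patterns probably need anchoring. The following is a minimal sketch, not a definitive implementation; the lookup table, its column names (old, new), and the "Art" subject for Q10 are made up for illustration.

```r
library(stringr)

# Hypothetical lookup table; in practice this could hold all 50 questions.
lookup <- data.frame(
  old = c("Q1", "Q2", "Q3", "Q10"),
  new = c("Math", "Science", "History", "Art"),
  stringsAsFactors = FALSE
)

# Names are the regex patterns, values are the replacements.
# "^" pins the match to the start of the name, and the lookahead "(?=_)"
# requires an underscore right after it, so "Q1" cannot match inside "Q10".
patterns <- setNames(lookup$new, paste0("^", lookup$old, "(?=_)"))

cols <- c("Q1_fill_in", "Q10_true_false", "Q3_multiple_choice", "Q4_fill_in")
renamed <- str_replace_all(cols, patterns)
renamed
# "Math_fill_in" "Art_true_false" "History_multiple_choice" "Q4_fill_in"
```

Column names with no entry in the lookup table (Q4 here) are left unchanged. To apply this to the data frame itself, the same vector can be passed in place of new_names in the set_names() call.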