User-defined function that will rename a variable in a named column in R

Time:01-26

I'm trying to write a function that will rename variables across multiple columns inside of a data table in R.

My data table is structured similar to this:

feature1 feature2 feature3 feature4
var_a var_c var_b var_a
var_b var_a var_a var_c
var_c var_b var_c var_b

I'm trying to rename all the variables to new names. A given value may be in feature1 for one item but in feature4 for another, but the renaming would be the same across the whole data frame.

feature1 feature2 feature3 feature4
new_a new_c new_b new_a
new_b new_a new_a new_c
new_c new_b new_c new_b

I'm just having trouble writing my own user-defined function to do this in fewer lines of code than a standard dat$feature1[dat$feature1 == 'var_a'] <- 'new_a' for every value and column.

Preferably I'd like to call something along the lines of function(dat, var_a, new_a), or something where I can just pass in a list of my old and new variables.

Any help would be appreciated. Thank you!

CodePudding user response:

In base R:

df[] <- lapply(df, function(x) gsub("var","new", x))

Output:

#   feature1 feature2 feature3 feature4
# 1    new_a    new_c    new_b    new_a
# 2    new_b    new_a    new_a    new_c
# 3    new_c    new_b    new_c    new_b 

Data

df <- read.table(text = "feature1   feature2    feature3    feature4
var_a   var_c   var_b   var_a
var_b   var_a   var_a   var_c
var_c   var_b   var_c   var_b", header = TRUE)
df <- data.table::data.table(df)
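
The question also asks for a way to pass in a list of old and new values. A minimal sketch of that idea in base R might look like the following. Note that rename_values() and lookup are hypothetical names, not part of any package, and the sketch assumes a plain data frame whose columns are character vectors (factor columns are left untouched). Unlike gsub(), it only replaces cells that match an old value exactly.

```r
# Hypothetical helper (not from the original answers).
# `lookup` is a named character vector: names are old values, elements are new values.
rename_values <- function(dat, lookup) {
  dat[] <- lapply(dat, function(col) {
    # Assumption: only character columns are renamed; other types pass through.
    if (!is.character(col)) return(col)
    hit <- col %in% names(lookup)            # cells that exactly match an old value
    col[hit] <- unname(lookup[col[hit]])     # look up the replacement by name
    col
  })
  dat
}

# Small example data frame, built by hand so the block is self-contained
dat <- data.frame(
  feature1 = c("var_a", "var_b", "var_c"),
  feature2 = c("var_c", "var_a", "var_b"),
  stringsAsFactors = FALSE
)

lookup <- c(var_a = "new_a", var_b = "new_b", var_c = "new_c")
result <- rename_values(dat, lookup)
```

Values that do not appear in names(lookup) are kept as they are, so the mapping does not need to cover every value in the table.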

CodePudding user response:

This is a function that takes in a data frame, the old string and its replacement.

library(tidyverse)

replace_func <- function(df, var, new_var) {
  df %>%
    mutate(across(everything(), ~ .x %>%
                    str_replace_all(var, new_var)))
}

replace_func(df, "var", "new")

# A tibble: 3 × 4
#   feature1 feature2 feature3 feature4
#   <chr>    <chr>    <chr>    <chr>   
# 1 new_a    new_c    new_b    new_a   
# 2 new_b    new_a    new_a    new_c   
# 3 new_c    new_b    new_c    new_b

CodePudding user response:

df[, lapply(.SD, sub, pattern = "var", replacement = "new", fixed = TRUE)]
#    feature1 feature2 feature3 feature4
# 1:    new_a    new_c    new_b    new_a
# 2:    new_b    new_a    new_a    new_c
# 3:    new_c    new_b    new_c    new_b

Using this sample data:

library(data.table)
df = fread(text = 'feature1     feature2    feature3    feature4
var_a   var_c   var_b   var_a
var_b   var_a   var_a   var_c
var_c   var_b   var_c   var_b')

CodePudding user response:

Using dplyr and across

library(dplyr)

df %>% 
  mutate(across(1:4, ~ sub(".*(_)", "new\\1", .x)))
#   feature1 feature2 feature3 feature4
# 1    new_a    new_c    new_b    new_a
# 2    new_b    new_a    new_a    new_c
# 3    new_c    new_b    new_c    new_b
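
One thing to watch with the ".*(_)" pattern: ".*" is greedy, so on a value with more than one underscore it matches up to the last one. The sample data only has one underscore per value, so this does not show up there. A small sketch of the difference, using a made-up value "var_a_b" and an alternative pattern that stops at the first underscore (both are illustrations, not part of the answer above):

```r
# "var_a_b" is a made-up value to show the greedy match; it is not in the sample data
x <- c("var_a", "var_a_b")

# Greedy: ".*" runs up to the LAST underscore before the capture group
greedy <- sub(".*(_)", "new\\1", x)

# Alternative: "[^_]*" cannot cross an underscore, so it stops at the FIRST one
anchored <- sub("^[^_]*(_)", "new\\1", x)

print(greedy)
print(anchored)
```

For values with a single underscore the two patterns give the same result.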