How do I rename columns in tidyverse with vectors of names

Time:08-17

I wrote the following (working) function to rename columns using two vectors, one holding the current column names and one holding the desired names:

change.name <- function (dt, from, to) 
{
    loc <- match(from, names(dt))
    chg.loc <- loc[!is.na(loc)]
    if (length(chg.loc) == 0) 
        return(dt)
    names(dt)[chg.loc] = to[!is.na(loc)]
    return(dt)
}

Is it possible to replace this function with rename or some other part of dplyr? I would rather not need my own function.

Here is an example of the desired functionality:

cnames = tibble(from = c("hair_color", "banana", "height"),
                to = c("HeadCap", "Orange", "VertMetric"))
starwars %>% select(name, height, mass, hair_color, skin_color) %>% 
  top_n(5) %>% change.name(cnames$from, cnames$to)
  name            VertMetric  mass HeadCap skin_color 
  <chr>                <int> <dbl> <chr>   <chr>      
1 R2-D2                   96  32   NA      white, blue
2 R5-D4                   97  32   NA      white, red 
3 Gasgano                122  NA   none    white, blue
4 Luminara Unduli        170  56.2 black   yellow     
5 Barriss Offee          166  50   black   yellow

Note that "banana" in cnames$from is missing from starwars and doesn't trip up the function.

CodePudding user response:

You can use rename_with, but in the implementation below I still need to subset both to and from to the names that actually exist in the data:

idx = cnames$from %in% names(starwars)

starwars %>% select(name, height, mass, hair_color, skin_color) %>% 
  top_n(5) %>% 
  rename_with(~cnames$to[idx], cnames$from[idx])
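
If the separate idx step is unwanted, one possible variation (a sketch, not part of the answer above) is to do the lookup inside the function passed to rename_with and let any_of() skip the names that are absent. It assumes cnames$from contains no duplicates; the sw and renamed names are only for illustration.

```r
library(dplyr)

cnames <- tibble(from = c("hair_color", "banana", "height"),
                 to   = c("HeadCap", "Orange", "VertMetric"))

# Illustrative subset of starwars, as in the question
sw <- starwars %>% select(name, height, mass, hair_color, skin_color)

renamed <- sw %>%
  rename_with(
    # .x holds the selected column names; look each one up in cnames$from
    ~ cnames$to[match(.x, cnames$from)],
    # any_of() ignores entries of cnames$from that are not columns ("banana")
    any_of(cnames$from)
  )

names(renamed)
# "name" "VertMetric" "mass" "HeadCap" "skin_color"
```

Because the selection is made against the data being piped in, this also avoids checking names against the full starwars table when columns have already been dropped by select.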

CodePudding user response:

As per this answer, you can use:

rename(any_of(setNames(cnames$from, cnames$to)))

(Note that I am not closing this question as a duplicate, as I think the table-structure of the names lookup is a common case and an important distinction.)
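
For reference, here is roughly how that one-liner might look when applied to the data from the question. This is a minimal sketch; the lookup and renamed names are illustrative and not from the original post.

```r
library(dplyr)

cnames <- tibble(from = c("hair_color", "banana", "height"),
                 to   = c("HeadCap", "Orange", "VertMetric"))

# rename() wants new_name = old_name, so the names of the vector are the
# new names and the values are the existing column names
lookup <- setNames(cnames$from, cnames$to)

renamed <- starwars %>%
  select(name, height, mass, hair_color, skin_color) %>%
  rename(any_of(lookup))   # "banana" is absent and is skipped

names(renamed)
# "name" "VertMetric" "mass" "HeadCap" "skin_color"
```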

I do wish there were a more expressive interface. In data.table, for example, you can write setnames(data, old = from, new = to, skip_absent = TRUE). We could roll our own, but that defeats the purpose of "not needing my own function". Still, if I did, I would imitate data.table's syntax:

rename_from_vec = function(data, old, new, skip_absent = FALSE) {
  if(skip_absent) {
    rename(data, any_of(setNames(old, new)))
  } else {
    rename(data, all_of(setNames(old, new)))
  }
}
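
A short usage sketch of the helper above, to show the difference between the two modes. The function is repeated so the snippet runs on its own; sw, old, new, lenient and strict are illustrative names.

```r
library(dplyr)

rename_from_vec <- function(data, old, new, skip_absent = FALSE) {
  if (skip_absent) {
    rename(data, any_of(setNames(old, new)))
  } else {
    rename(data, all_of(setNames(old, new)))
  }
}

sw  <- starwars %>% select(name, height, mass, hair_color, skin_color)
old <- c("hair_color", "banana", "height")
new <- c("HeadCap", "Orange", "VertMetric")

# skip_absent = TRUE: "banana" is not a column, so that pair is ignored
lenient <- rename_from_vec(sw, old, new, skip_absent = TRUE)
names(lenient)

# Default (skip_absent = FALSE): all_of() is strict, so the missing
# "banana" column raises an error, captured here instead of stopping
strict <- tryCatch(rename_from_vec(sw, old, new), error = function(e) e)
inherits(strict, "error")
```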

CodePudding user response:

One option is to combine rename_with() with recode(), passing a named vector (old names as names, new names as values) to recode() by unquote-splicing it with !!!.

starwars %>%
  select(name, height, mass, hair_color, skin_color) %>% 
  rename_with(~ recode(.x, !!!setNames(cnames$to, cnames$from)))

# # A tibble: 87 × 5
#    name               VertMetric  mass HeadCap       skin_color 
#    <chr>                   <int> <dbl> <chr>         <chr>      
#  1 Luke Skywalker            172    77 blond         fair       
#  2 C-3PO                     167    75 NA            gold       
#  3 R2-D2                      96    32 NA            white, blue
#  4 Darth Vader               202   136 none          white      
#  5 Leia Organa               150    49 brown         light
#  ...
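
To see why this works, it may help to run recode() on a plain character vector of column names. The sketch below uses illustrative names (lookup, old_names, new_names): values that match a name in the lookup are replaced, and everything else is left as it is, so the absent "banana" entry simply never matches.

```r
library(dplyr)

# Names are the old column names, values are the replacements
lookup <- setNames(c("HeadCap", "Orange", "VertMetric"),
                   c("hair_color", "banana", "height"))

old_names <- c("name", "height", "mass", "hair_color", "skin_color")

# !!! splices the named vector into recode() as old = "new" arguments
new_names <- recode(old_names, !!!lookup)

new_names
# "name" "VertMetric" "mass" "HeadCap" "skin_color"
```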