Home > Software engineering >  Using a vector of new and old names in tidyverse to select and rename columns
Using a vector of new and old names in tidyverse to select and rename columns

Time:10-20

I want to select columns and rename them based on the names of the columns which I keep in a separate dataframe.

This is the original dataset:

df <- tribble(
  ~year, ~country, ~series1, ~series2, 
  2003, "USA", 8, 5,
  2004, "USA", 9, 6, 
  2005, "USA", 11, 7, 
  2006, "USA", 10, 8
)

I want to select and rename two columns and I want to specify that like this:

specs <- tribble(
  ~old_name, ~new_name, 
  "country", "region",
  "series1", "gdp_growth"
)

I want this result:

expected_df <- tribble(
  ~region, ~gdp_growth,
  "USA", 8, 
  "USA", 9, 
  "USA", 11, 
  "USA", 10
)

This does not work:

df %>% 
  select(specs$new_name = specs$old_name)

Error: unexpected '=' in: "df %>% select(specs$new_name ="

CodePudding user response:

df %>%  
  select(specs$old_name) %>% 
  rename_with(~specs$new_name, specs$old_name)

#> # A tibble: 4 x 2
#>   region gdp_growth
#>   <chr>       <dbl>
#> 1 USA             8
#> 2 USA             9
#> 3 USA            11
#> 4 USA            10

CodePudding user response:

How about this?

df %>% 
  select(specs$old_name) %>% set_names(specs$new_name)

CodePudding user response:

We could do

library(dplyr)
library(tibble)
df %>% 
   select(!!! deframe(specs[2:1]))

-output

# A tibble: 4 × 2
  region gdp_growth
  <chr>       <dbl>
1 USA             8
2 USA             9
3 USA            11
4 USA            10

CodePudding user response:

How about:

library(tidyverse)


df |>
  rename_with(.cols = specs$old_name, .fn = \(x) specs$new_name) |>
  select(!!!syms(specs$new_name))
#> # A tibble: 4 x 2
#>   region gdp_growth
#>   <chr>       <dbl>
#> 1 USA             8
#> 2 USA             9
#> 3 USA            11
#> 4 USA            10

EDIT:

or another option:

map2_dfc(specs$new_name, 
         specs$old_name, 
         \(new, old) select(df, !!sym(new) := !!sym(old)))
#> # A tibble: 4 x 2
#>   region gdp_growth
#>   <chr>       <dbl>
#> 1 USA             8
#> 2 USA             9
#> 3 USA            11
#> 4 USA            10

CodePudding user response:

Two more similar but slightly different ways to those already proposed.

df %>% select(specs$old_name) %>% setNames(., specs$new_name)

OR

df %>% select(specs$old_name) %>% `colnames<-` (., specs$new_name)

CodePudding user response:

A base R option by matching the columns with the values from your dataframe like this:

# Select columns
df = df[,2:3]
# Convert to dataframe isntead of tibble
df = as.data.frame(df)
specs = as.data.frame(specs)

names(df)[match(specs[,"old_name"], names(df))] = specs[,"new_name"]
df
#>   region gdp_growth
#> 1    USA          8
#> 2    USA          9
#> 3    USA         11
#> 4    USA         10

Created on 2022-10-19 with reprex v2.0.2

  • Related