Replacing column names with names from another data frame based on name matching

Hi, I am a bit new, so I am not sure if I am doing this right, but I looked around on Stack Overflow and couldn't find code or advice that worked with my code.

I have a dataframe mainDF that looks like this:

Person	ABG	SEP	CLC	XSP	APP	WED	GSH
SP-1	2.1	3.0	1.3	1.8	1.4	2.5	1.4
SP-2	2.5	2.1	2.0	1.9	1.2	1.2	2.1
SP-3	2.3	3.1	2.5	1.5	1.1	2.6	2.1

I have another dataframe, TranslateDF, that holds the full names for the abbreviated column names, and I want to replace the abbreviated names in mainDF with the full names listed here:

Note that the translating data frame may contain extraneous entries that match no column in mainDF, and it may be missing entries for some columns. If a column in mainDF has no full name in TranslateDF, it should be dropped from the data.

Abbreviated	Full Naming
ABG	All barbecue grill
SEP	shake eel peel
CLC	cold loin cake
XSP	xylophone spear pint
APP	apple pot pie
HUM	hall united meat
LPL	lending porkloin

Ideally, the resulting dataframe would be:

Person	All barbecue grill	shake eel peel	cold loin cake	xylophone spear pint	apple pot pie
SP-1	2.1	3.0	1.3	1.8	1.4
SP-2	2.5	2.1	2.0	1.9	1.2
SP-3	2.3	3.1	2.5	1.5	1.1

I would appreciate any help on this, thank you!

CodePudding user response：

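You can pass a named vector to select(), which renames and selects in one step: the vector's names become the new column names and its values are the existing columns to pick. Wrapping it in any_of() ensures it won't fail when some abbreviations in the translation data frame (here HUM and LPL) don't exist in the main data frame, and columns without a translation (WED and GSH) are simply not selected: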

library(dplyr)

df1 %>%
  select(Person, any_of(setNames(df2$Abbreviated, df2$Full_Naming))) 

# A tibble: 3 x 6
  Person `All barbecue grill` `shake eel peel` `cold loin cake` `xylophone spear pint` `apple pot pie`
  <chr>                 <dbl>            <dbl>            <dbl>                  <dbl>           <dbl>
1 SP-1                    2.1              3                1.3                    1.8             1.4
2 SP-2                    2.5              2.1              2                      1.9             1.2
3 SP-3                    2.3              3.1              2.5                    1.5             1.1

Data:

df1 <- structure(list(Person = c("SP-1", "SP-2", "SP-3"), ABG = c(2.1, 
2.5, 2.3), SEP = c(3, 2.1, 3.1), CLC = c(1.3, 2, 2.5), XSP = c(1.8, 
1.9, 1.5), APP = c(1.4, 1.2, 1.1), WED = c(2.5, 1.2, 2.6), GSH = c(1.4, 
2.1, 2.1)), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -3L), spec = structure(list(cols = list(
    Person = structure(list(), class = c("collector_character", 
    "collector")), ABG = structure(list(), class = c("collector_double", 
    "collector")), SEP = structure(list(), class = c("collector_double", 
    "collector")), CLC = structure(list(), class = c("collector_double", 
    "collector")), XSP = structure(list(), class = c("collector_double", 
    "collector")), APP = structure(list(), class = c("collector_double", 
    "collector")), WED = structure(list(), class = c("collector_double", 
    "collector")), GSH = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
"collector")), skip = 1L), class = "col_spec"))

df2 <- structure(list(Abbreviated = c("ABG", "SEP", "CLC", "XSP", "APP", 
"HUM", "LPL"), Full_Naming = c("All barbecue grill", "shake eel peel", 
"cold loin cake", "xylophone spear pint", "apple pot pie", "hall united meat", 
"lending porkloin")), class = "data.frame", row.names = c(NA, 
-7L))
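
For comparison, the same rename-and-drop idea can be sketched in base R without dplyr. This is only a sketch under assumptions: it rebuilds the question's data with data.frame(), it uses a Full_Naming column name like the data above (the question's header reads "Full Naming"), and the names keep and result are made up for illustration.

```r
# Rebuild the question's example data (assumed layout, see note above)
mainDF <- data.frame(
  Person = c("SP-1", "SP-2", "SP-3"),
  ABG = c(2.1, 2.5, 2.3), SEP = c(3.0, 2.1, 3.1), CLC = c(1.3, 2.0, 2.5),
  XSP = c(1.8, 1.9, 1.5), APP = c(1.4, 1.2, 1.1), WED = c(2.5, 1.2, 2.6),
  GSH = c(1.4, 2.1, 2.1)
)
translateDF <- data.frame(
  Abbreviated = c("ABG", "SEP", "CLC", "XSP", "APP", "HUM", "LPL"),
  Full_Naming = c("All barbecue grill", "shake eel peel", "cold loin cake",
                  "xylophone spear pint", "apple pot pie", "hall united meat",
                  "lending porkloin")
)

# Abbreviations that are both a column of mainDF and listed in translateDF;
# intersect() keeps the order of its first argument, i.e. mainDF's column order
keep <- intersect(names(mainDF), translateDF$Abbreviated)

# Keep Person plus the translatable columns; everything else is dropped
result <- mainDF[, c("Person", keep), drop = FALSE]

# Exact-match lookup of each kept abbreviation's full name
names(result)[-1] <- translateDF$Full_Naming[match(keep, translateDF$Abbreviated)]

result
```

Because match() compares whole strings, an abbreviation that happens to be a substring of another column name would not be touched here.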

CodePudding user response：

How about this:

mainDF <- structure(list(Person = c("SP-1", "SP-2", "SP-3"), ABG = c(2.1, 
2.5, 2.3), SEP = c(3, 2.1, 3.1), CLC = c(1.3, 2, 2.5), XSP = c(1.8, 
1.9, 1.5), APP = c(1.4, 1.2, 1.1), WED = c(2.5, 1.2, 2.6), GSH = c(1.4, 
2.1, 2.1)), row.names = c(NA, 3L), class = "data.frame")

translateDF <- structure(list(Abbreviated = c("ABG", "SEP", "CLC", "XSP", "APP", 
"HUM", "LPL"), `Full Naming` = c("All barbecue grill", "shake eel peel", 
"cold loin cake", "xylophone spear pint", "apple pot pie", "hall united meat", 
"lending porkloin")), row.names = c(NA, 7L), class = "data.frame")

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)

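mainDF %>% 
  # reshape to long form: one row per Person/abbreviation pair
  pivot_longer(-Person, 
               names_to="Abbreviated", 
               values_to = "vals") %>% 
  # attach the full name for each abbreviation (NA where none exists)
  left_join(translateDF) %>% 
  select(-Abbreviated) %>% 
  # drop rows without a full name, i.e. the untranslated columns WED and GSH
  na.omit() %>% 
  # spread back to wide form with the full names as column headers
  pivot_wider(names_from=`Full Naming`, values_from="vals")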
#> Joining, by = "Abbreviated"
#> # A tibble: 3 × 6
#>   Person `All barbecue grill` `shake eel peel` `cold loin cake` `xylophone spe…`
#>   <chr>                 <dbl>            <dbl>            <dbl>            <dbl>
#> 1 SP-1                    2.1              3                1.3              1.8
#> 2 SP-2                    2.5              2.1              2                1.9
#> 3 SP-3                    2.3              3.1              2.5              1.5
#> # … with 1 more variable: `apple pot pie` <dbl>

Created on 2022-04-24 by the reprex package (v2.0.1)

CodePudding user response：

library(tidyverse)

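mainDF %>%
  # replace each abbreviation (pattern) in the column names with its full name;
  # str_replace_all() treats the patterns as regexes and matches substrings
  rename_with(~str_replace_all(., set_names(TranslateDF[, 2], TranslateDF[, 1]))) %>%
  # keep Person plus only the columns whose names changed, i.e. were translated
  select(Person, which(!(names(.) %in% names(mainDF))))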

  Person All barbecue grill shake eel peel cold loin cake xylophone spear pint apple pot pie
1   SP-1                2.1            3.0            1.3                  1.8           1.4
2   SP-2                2.5            2.1            2.0                  1.9           1.2
3   SP-3                2.3            3.1            2.5                  1.5           1.1