Replacing column with another data frame based on name matching

Time:04-25

Hi, I'm fairly new, so I'm not sure I'm doing this right, but I looked around on Stack Overflow and couldn't find any code or advice that worked with my code.

I have a dataframe mainDF that looks like this:

Person  ABG  SEP  CLC  XSP  APP  WED  GSH
SP-1    2.1  3.0  1.3  1.8  1.4  2.5  1.4
SP-2    2.5  2.1  2.0  1.9  1.2  1.2  2.1
SP-3    2.3  3.1  2.5  1.5  1.1  2.6  2.1

I have another dataframe, TranslateDF, that maps the abbreviated column names to their full names, and I want to replace the abbreviated names in mainDF with the full names.

Note that TranslateDF may contain extra entries that don't appear in mainDF, and it may be missing entries for some mainDF columns. Any mainDF column that has no full name should be dropped from the result. TranslateDF looks like this:

Abbreviated  Full Naming
ABG          All barbecue grill
SEP          shake eel peel
CLC          cold loin cake
XSP          xylophone spear pint
APP          apple pot pie
HUM          hall united meat
LPL          lending porkloin

Ideally, the resulting dataframe would be:

Person  All barbecue grill  shake eel peel  cold loin cake  xylophone spear pint  apple pot pie
SP-1    2.1                 3.0             1.3             1.8                   1.4
SP-2    2.5                 2.1             2.0             1.9                   1.2
SP-3    2.3                 3.1             2.5             1.5                   1.1

I would appreciate any help on this. Thank you!

CodePudding user response:

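You can pass a named vector to select(), which renames and selects in one step: the names of the vector become the new column names and its values are the existing columns to pick. Wrapping it in any_of() ensures it won't fail if some of those columns don't exist in the main data frame. Here df1 is the main data frame and df2 is the translation data frame (with its second column named Full_Naming):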

library(dplyr)

df1 %>%
  select(Person, any_of(setNames(df2$Abbreviated, df2$Full_Naming))) 

# A tibble: 3 x 6
  Person `All barbecue grill` `shake eel peel` `cold loin cake` `xylophone spear pint` `apple pot pie`
  <chr>                 <dbl>            <dbl>            <dbl>                  <dbl>           <dbl>
1 SP-1                    2.1              3                1.3                    1.8             1.4
2 SP-2                    2.5              2.1              2                      1.9             1.2
3 SP-3                    2.3              3.1              2.5                    1.5             1.1
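
For comparison, the same rename-and-drop can be sketched in base R with match(). This is only an illustrative alternative, not part of the dplyr approach above; it assumes the ID column is called Person and that the translation table has columns Abbreviated and Full_Naming, as in the data below. The helper names idx, keep, translated and result are made up for this sketch.

```r
# Example data copied from the question
df1 <- data.frame(
  Person = c("SP-1", "SP-2", "SP-3"),
  ABG = c(2.1, 2.5, 2.3), SEP = c(3.0, 2.1, 3.1), CLC = c(1.3, 2.0, 2.5),
  XSP = c(1.8, 1.9, 1.5), APP = c(1.4, 1.2, 1.1), WED = c(2.5, 1.2, 2.6),
  GSH = c(1.4, 2.1, 2.1),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Abbreviated = c("ABG", "SEP", "CLC", "XSP", "APP", "HUM", "LPL"),
  Full_Naming = c("All barbecue grill", "shake eel peel", "cold loin cake",
                  "xylophone spear pint", "apple pot pie", "hall united meat",
                  "lending porkloin"),
  stringsAsFactors = FALSE
)

# Row of df2 that each df1 column name matches (NA when there is no translation)
idx <- match(names(df1), df2$Abbreviated)

# Keep the ID column plus every column that has a translation
keep <- names(df1) == "Person" | !is.na(idx)
result <- df1[, keep, drop = FALSE]

# Rename only the translated columns; Person keeps its name
translated <- !is.na(idx[keep])
names(result)[translated] <- df2$Full_Naming[idx[keep][translated]]

print(result)
```

Because match() compares whole names exactly, extra rows in the translation table (HUM, LPL) are simply never matched, and untranslated columns (WED, GSH) are dropped by the keep filter.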

Data:

df1 <- structure(list(Person = c("SP-1", "SP-2", "SP-3"), ABG = c(2.1, 
2.5, 2.3), SEP = c(3, 2.1, 3.1), CLC = c(1.3, 2, 2.5), XSP = c(1.8, 
1.9, 1.5), APP = c(1.4, 1.2, 1.1), WED = c(2.5, 1.2, 2.6), GSH = c(1.4, 
2.1, 2.1)), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -3L), spec = structure(list(cols = list(
    Person = structure(list(), class = c("collector_character", 
    "collector")), ABG = structure(list(), class = c("collector_double", 
    "collector")), SEP = structure(list(), class = c("collector_double", 
    "collector")), CLC = structure(list(), class = c("collector_double", 
    "collector")), XSP = structure(list(), class = c("collector_double", 
    "collector")), APP = structure(list(), class = c("collector_double", 
    "collector")), WED = structure(list(), class = c("collector_double", 
    "collector")), GSH = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
"collector")), skip = 1L), class = "col_spec"))

df2 <- structure(list(Abbreviated = c("ABG", "SEP", "CLC", "XSP", "APP", 
"HUM", "LPL"), Full_Naming = c("All barbecue grill", "shake eel peel", 
"cold loin cake", "xylophone spear pint", "apple pot pie", "hall united meat", 
"lending porkloin")), class = "data.frame", row.names = c(NA, 
-7L))

CodePudding user response:

How about this:

mainDF <- structure(list(Person = c("SP-1", "SP-2", "SP-3"), ABG = c(2.1, 
2.5, 2.3), SEP = c(3, 2.1, 3.1), CLC = c(1.3, 2, 2.5), XSP = c(1.8, 
1.9, 1.5), APP = c(1.4, 1.2, 1.1), WED = c(2.5, 1.2, 2.6), GSH = c(1.4, 
2.1, 2.1)), row.names = c(NA, 3L), class = "data.frame")

translateDF <- structure(list(Abbreviated = c("ABG", "SEP", "CLC", "XSP", "APP", 
"HUM", "LPL"), `Full Naming` = c("All barbecue grill", "shake eel peel", 
"cold loin cake", "xylophone spear pint", "apple pot pie", "hall united meat", 
"lending porkloin")), row.names = c(NA, 7L), class = "data.frame")

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)

mainDF %>% 
  pivot_longer(-Person, 
               names_to="Abbreviated", 
               values_to = "vals") %>% 
  left_join(translateDF) %>% 
  select(-Abbreviated) %>% 
  na.omit() %>% 
  pivot_wider(names_from=`Full Naming`, values_from="vals")
#> Joining, by = "Abbreviated"
#> # A tibble: 3 × 6
#>   Person `All barbecue grill` `shake eel peel` `cold loin cake` `xylophone spe…`
#>   <chr>                 <dbl>            <dbl>            <dbl>            <dbl>
#> 1 SP-1                    2.1              3                1.3              1.8
#> 2 SP-2                    2.5              2.1              2                1.9
#> 3 SP-3                    2.3              3.1              2.5              1.5
#> # … with 1 more variable: `apple pot pie` <dbl>

Created on 2022-04-24 by the reprex package (v2.0.1)

CodePudding user response:

library(tidyverse)

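mainDF %>%
  # Named vector: names are the abbreviations (used as regex patterns),
  # values are the full names that replace them
  rename_with(~str_replace_all(., set_names(TranslateDF[, 2], TranslateDF[, 1]))) %>%
  # Keep Person plus only the columns whose names were changed by the rename
  select(Person, which(!(names(.) %in% names(mainDF))))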

  Person All barbecue grill shake eel peel cold loin cake xylophone spear pint apple pot pie
1   SP-1                2.1            3.0            1.3                  1.8           1.4
2   SP-2                2.5            2.1            2.0                  1.9           1.2
3   SP-3                2.3            3.1            2.5                  1.5           1.1