Home > Back-end >  adding flags from one data frame to another in R
adding flags from one data frame to another in R

Time:02-23

Suppose you have two Tables: Table A and Table B

Table A has the following columns:

ID      Flag   Color     SpecialVar
M112    X      Green     2344
M134           Blue      1923
M542    X      Black     4452
M698           Purple    8857

Table B has the following columns:

ID      Flag   Color
M112           Green
M134           Blue
M784           Orange
M542           Black

For every ID in Table B, I want to add the "X" under the flag column, if the flag exists for that ID in Table A.

so the result would look like this, NEW TABLE B:

ID      Flag   Color
M112    X       Green
M134            Yellow
M784            Orange
M542    X       Black

Providing the data frames here:

 TableA <- data.frame(ID = c("M112", "M134", "M542", "M698"), Flag = c("X", "", "X", ""), Color = c("Green", "Blue", "Black", "Purple"), SpecialVar = c(2344, 1923, 4452, 8857))
 TableB <- data.frame(ID = c("M112", "M134", "M784", "M542"), Flag = c("", "", "", ""), Color = c("Green", "Blue", "Orange", "Black"))

CodePudding user response:

You can do it as follows using match from base R.

TableC <- TableB
TableC$Flag <- TableA$Flag[match(TableC$ID, TableA$ID)]
TableC

Output:

    ID Flag  Color
1 M112    X  Green
2 M134        Blue
3 M784 <NA> Orange
4 M542    X  Black

You can get rid of <NA> by using na.omit(TableC).

CodePudding user response:

Please find below one possible solution using data.table

Reprex

  • Code
library(data.table)

setDT(TableB)[chmatch(TableA$ID, TableB$ID), Flag := TableA$Flag][]
  • Output
#>      ID Flag  Color
#> 1: M112    X  Green
#> 2: M134        Blue
#> 3: M784      Orange
#> 4: M542    X  Black

Created on 2022-02-23 by the reprex package (v2.0.1)

CodePudding user response:

If flag is empty in Tableb you can also drop it and use a left join:

TableB  %>%  
select(-Flag) %>%  
left_join(TableA, by=c("ID","Color"))

CodePudding user response:

If TableB's Flag column is empty, then you could join the two tables, removing the "Flag" column in TableB first:

TableB %>% 
  dplyr::select(-Flag) %>% 
  dplyr::left_join(TableA, ., by = c("ID", "Color")) 

Giving:

    ID Flag  Color SpecialVar
1 M112    X  Green       2344
2 M134        Blue       1923
3 M542    X  Black       4452
4 M698      Purple       8857

If the Flag market is distributed across the two columns, and you need to combine them, then join will work with some additional steps:

TableC <- data.frame(ID = c("M112", "M134", "M542", "M698"), Flag = c("", "", "X", ""), Color = c("Green", "Blue", "Black", "Purple"), SpecialVar = c(2344, 1923, 4452, 8857))
TableD <- data.frame(ID = c("M112", "M134", "M784", "M542"), Flag = c("X", "", "", ""), Color = c("Green", "Blue", "Orange", "Black"))

TableD %>% 
  dplyr::left_join(TableC, ., by = c("ID", "Color")) %>% 
  dplyr::mutate(Flag = ifelse(Flag.x == "X" | Flag.y == "X", "X", "")) %>% 
  dplyr::select(-c(Flag.x, Flag.y)) %>% 
  tidyr::replace_na(list(Flag = ""))

 > TableC
    ID Flag  Color SpecialVar
1 M112       Green       2344
2 M134        Blue       1923
3 M542    X  Black       4452
4 M698      Purple       8857
 > TableD
    ID Flag  Color
1 M112    X  Green
2 M134        Blue
3 M784      Orange
4 M542       Black

Giving:

    ID  Color SpecialVar Flag
1 M112  Green       2344    X
2 M134   Blue       1923     
3 M542  Black       4452    X
4 M698 Purple       8857     
  • Related