Home > Enterprise >  get non-null values from another data fram with variables of the same name
get non-null values from another data fram with variables of the same name

Time:07-15

I want to match the non-null values from Data_Frame_2

For example:

Data_Frame

      Training Pulse Duration
    1 Strength   100       60
    2  Stamina   150       30
    3    Other    NA       45

Data_Frame_2

       Training Pulse Duration
    1  Strength    NA       90
    2     Other    30       NA
    3 Dimension     2        1

Data_Frame %>% 
  full_join(Data_Frame_2, by= base::c("Training"="Training"))

       Training Pulse.x Duration.x Pulse.y Duration.y
    1  Strength     100         60      NA         90
    2   Stamina     150         30      NA         NA
    3     Other      NA         45      30         NA
    4 Dimension      NA         NA       2          1

Data_Frame %>% 
  full_join(Data_Frame_2, by= base::c("Training"="Training")) %>% 
  mutate(Pulse = if_else(is.na(Pulse.x), Pulse.y, Pulse.x),
         Duration = if_else(is.na(Duration.x), Duration.y, Duration.x)) %>% 
  select(Training, Pulse, Duration)

       Training Pulse Duration
    1  Strength   100       60
    2   Stamina   150       30
    3     Other    30       45
    4 Dimension     2        1


I would like to get the last data frame by avoiding the last step, that is, just with the join, but I have not been able to find another shorter way to solve it. I need to do it for two data frames that have the same variables but there are many variables, so the mutate would be enormously enlarged.

Thank you.

Created on 2022-07-14 by the reprex package (v2.0.1)

CodePudding user response:

A possible solution, based on power_full_join:

library(powerjoin)
library(dplyr)

power_full_join(df1, df2, by = "Training", conflict = coalesce)

#>    Training Pulse Duration
#> 1  Strength   100       60
#> 2   Stamina   150       30
#> 3     Other    30       45
#> 4 Dimension     2        1

CodePudding user response:

If you do not care about the ordering, you could do:

rows_patch(rows_upsert(df2, df1), df2)

       Training Pulse Duration
1      Strength   100       60
2         Other    30       45
3     Dimension     2        1
4       Stamina   150       30

Specify the columns used:

rows_patch(rows_upsert(df2, df1, 'Training'), df2, 'Training')

if you want to maintain the order:

bind_rows(df1,df2) %>%
  filter(!duplicated(Training)) %>%
  rows_patch(df2)

      Training Pulse Duration
1      Strength   100       60
2       Stamina   150       30
3         Other    30       45
4     Dimension     2        1
  • Related