Home > Back-end >  Replace a value in df based on another df in R
Replace a value in df based on another df in R

Time:10-07

I'm using R. I have 2 dataframes:

column_a <- c("20", "30", "40", "60")
column_b <- c("1.2", "1.3", "1.4", "1.5")
column_c <- c("30", "50", "60", "70")
column_d <- c("2.3", "4.5", "6.7", "8.9")

df1 <- data.frame(column_a, column_b)
df2 <- data.frame (column_c, column_d)

What I need to do is this: if a value in column_a is present in column_c, the corresponding value in column_b must replace the value in column_d.

The new df2 should look like this:

  column_c column_d
1       30      1.3
2       50      4.5
3       60      1.5
4       70      8.9

As you can see, the values "2.3" and "6.7" have been replaced by "1.3" and "1.5", respectively, as "30" and "60" are the only values that are present both in the column_c of df_2 and in column_a of df_1.

CodePudding user response:

You may join the dataframe and replace the values with coalesce.

library(dplyr)

df2 %>%
  left_join(df1, by = c('column_c' = 'column_a')) %>%
  transmute(column_c, column_d = coalesce(column_b, column_d))

#  column_c column_d
#1       30      1.3
#2       50      4.5
#3       60      1.5
#4       70      8.9

In base R -

transform(merge(df2, df1, all.x = TRUE, by.x = 'column_c', by.y = 'column_a'), 
          column_d = ifelse(is.na(column_b), column_d, column_b))

CodePudding user response:

library(tidyverse)

column_a <- c("20", "30", "40", "60")
column_b <- c("1.2", "1.3", "1.4", "1.5")
column_c <- c("30", "50", "60", "70")
column_d <- c("2.3", "4.5", "6.7", "8.9")

df1 <- data.frame(column_a, column_b)
df2 <- data.frame(column_c, column_d)

df1
#>   column_a column_b
#> 1       20      1.2
#> 2       30      1.3
#> 3       40      1.4
#> 4       60      1.5
df2
#>   column_c column_d
#> 1       30      2.3
#> 2       50      4.5
#> 3       60      6.7
#> 4       70      8.9

bind_cols(df1, df2) %>%
  mutate(
    column_b = ifelse(column_a == column_c, column_d, column_b)
  ) %>%
  select(column_c, column_d)
#>   column_c column_d
#> 1       30      2.3
#> 2       50      4.5
#> 3       60      6.7
#> 4       70      8.9

Created on 2021-10-07 by the reprex package (v2.0.1)

CodePudding user response:

With basic R ifelse command:

df2$column_d = apply(df2, 1, function(x) {ifelse(x['column_c'] %in% df1$column_a,
               df1[df1$column_a == x['column_c'], 'column_b'], x['column_d'])})
  • Related