I recently discovered "rows_upsert" in R's dplyr. If dt1 and dt2 are tibbles, I want to bring in new rows and updated cells from dt2, but I don't want to overwrite a meaningful value from dt1 with an "NA" value from dt2 for a matching row. Ideas?
key<-c("a", "b", "c")
values<-c(1, 2, 3)
dt1<-as_tibble(cbind(key, values))
key2<-c("a", "b", "c", "d")
values2<-c(1, NA, 4, 5)
dt2<-as_tibble(cbind(key = key2, values=values2))
dt1
dt2
rows_upsert(dt1, dt2, by="key")
The goal:
key | values |
---|---|
a | 1 |
b | 2 |
c | 4 |
d | 5 |
CodePudding user response:
There are several ways to do this:
require(dplyr)
dt1 <- tibble(key = c("a", "b", "c"),
values = c(1, 2, 3))
dt2 <- tibble(key = c("a", "b", "c", "d"),
values = c(1, NA, 4, 5))
## Using dplyr::filter
rows_upsert(dt1, dt2 |> filter(!is.na(values)), by = "key")
#> # A tibble: 4 x 2
#> key values
#> <chr> <dbl>
#> 1 a 1
#> 2 b 2
#> 3 c 4
#> 4 d 5
## or as Ritchie has suggested in comments:
rows_upsert(dt1, na.omit(dt2), by = "key")
#> # A tibble: 4 x 2
#> key values
#> <chr> <dbl>
#> 1 a 1
#> 2 b 2
#> 3 c 4
#> 4 d 5
## or piping rows_upsert output into rows_patch:
rows_upsert(dt1, dt2, by = "key") |> rows_patch(dt1, by = "key")
#> # A tibble: 4 x 2
#> key values
#> <chr> <dbl>
#> 1 a 1
#> 2 b 2
#> 3 c 4
#> 4 d 5
Created on 2022-02-16 by the reprex package (v2.0.1)