Home > database >  R function rows_upsert without NAs
R function rows_upsert without NAs

Time:02-17

I recently discovered "rows_upsert" in R's dplyr. If dt1 and dt2 are tibbles, I want to bring in new rows and updated cells from dt2, but I don't want to overwrite a meaningful value from dt1 with an "NA" value from dt2 for a matching row. Ideas?

key<-c("a", "b", "c")
values<-c(1, 2, 3)
dt1<-as_tibble(cbind(key, values))

key2<-c("a", "b", "c", "d")
values2<-c(1, NA, 4, 5)
dt2<-as_tibble(cbind(key = key2, values=values2))

dt1
dt2
rows_upsert(dt1, dt2, by="key")

The goal:

key values
a 1
b 2
c 4
d 5

CodePudding user response:

There are several ways to do this:

require(dplyr)

dt1 <- tibble(key = c("a", "b", "c"),
                  values = c(1, 2, 3))
dt2 <- tibble(key = c("a", "b", "c", "d"),
                  values = c(1, NA, 4, 5))

## Using dplyr::filter
rows_upsert(dt1, dt2 |> filter(!is.na(values)), by = "key")
#> # A tibble: 4 x 2
#>   key   values
#>   <chr>  <dbl>
#> 1 a          1
#> 2 b          2
#> 3 c          4
#> 4 d          5

## or as Ritchie has suggested in comments:
rows_upsert(dt1, na.omit(dt2), by = "key")
#> # A tibble: 4 x 2
#>   key   values
#>   <chr>  <dbl>
#> 1 a          1
#> 2 b          2
#> 3 c          4
#> 4 d          5

## or piping rows_upsert output into rows_patch:
rows_upsert(dt1, dt2, by = "key") |> rows_patch(dt1, by = "key")
#> # A tibble: 4 x 2
#>   key   values
#>   <chr>  <dbl>
#> 1 a          1
#> 2 b          2
#> 3 c          4
#> 4 d          5

Created on 2022-02-16 by the reprex package (v2.0.1)

  • Related