Home > database >  Create new column in dataframe based on another and matching to another dataset in R
Create new column in dataframe based on another and matching to another dataset in R

Time:07-26

I've the following data frame and I would like to create a new column based on the following conditions:

  1. If Price.of.Books is BLANK, R will look up to the matching data frame based on the value in Count.of.Books..Match.
  2. If Price.of.Books is not BLANK, it will retain the value

In Excel, it can be done with ifelse() with a vlookup() within the ifelse() function. Is there something similar that can be done in R? So that it will work for larger datasets?

Dataframe

Index Grouping Count.of.Books..Match. Price.of.Books
i1 A 1 BLANK
i2 A 2 BLANK
i3 B 2 12
i4 B 6 BLANK
i5 C 4 10
i6 C 1 11.5
i7 D 3 8.5
i8 D 6 BLANK
i9 E 4 BLANK
df = structure(list(Index = structure(1:9, .Label = c("I1", "I2", 
                    "I3", "I4", "I5", "I6", "I7", "I8", "I9"), class = "factor"), 
                 Grouping = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L
                 ), .Label = c("A", "B", "C", "D", "E"), class = "factor"), 
                 Count.of.Books..Match. = c(1L, 2L, 2L, 6L, 4L, 1L, 3L, 6L, 
                  4L), Price.of.Books = structure(c(5L, 5L, 3L, 5L, 1L, 2L, 
                   4L, 5L, 5L), .Label = c("10", "11.5", "12", "8.5", "BLANK"
                   ), class = "factor")), row.names = c(NA, -9L), class = "data.frame")

Matching Data Frame

Match Price.of.Books
1 6
2 7
3 8
4 9.5
5 12
6 13
match = structure(list(Match = 1:6, Price.of.Books = c(6, 7, 8, 9.5, 12, 13)), class = "data.frame", row.names = c(NA, -6L))

The outcome should be the following:

Index Grouping Count.of.Books..Match. Price.of.Books New Column
i1 A 1 BLANK 6
i2 A 2 BLANK 7
i3 B 2 12 12
i4 B 6 BLANK 13
i5 C 4 10 10
i6 C 1 11.5 11.5
i7 D 3 8.5 8.5
i8 D 6 BLANK 13
i9 E 4 BLANK 9.5

Thanks in advance!

CodePudding user response:

A possible solution:

library(dplyr)

df %>% 
  inner_join(match, by = c("Count.of.Books..Match." = "Match")) %>% 
  mutate(Price.of.Books = Price.of.Books.x, 
    new = ifelse(Price.of.Books == "BLANK", Price.of.Books.y, Price.of.Books.x),
    Price.of.Books.x = NULL, Price.of.Books.y = NULL) 

#>   Index Grouping Count.of.Books..Match. Price.of.Books  new
#> 1    I1        A                      1          BLANK  6.0
#> 2    I2        A                      2          BLANK  7.0
#> 3    I3        B                      2             12  3.0
#> 4    I4        B                      6          BLANK 13.0
#> 5    I5        C                      4             10  1.0
#> 6    I6        C                      1           11.5  2.0
#> 7    I7        D                      3            8.5  4.0
#> 8    I8        D                      6          BLANK 13.0
#> 9    I9        E                      4          BLANK  9.5
  • Related