Home > other >  Adding rows in R based on columns values
Adding rows in R based on columns values

Time:10-26

I'm struggling with the following task. Here is a dummy example of the dataset I'm working with (dput output):

structure(list(SHOPID = c("11T", "2T1", "33A", "14B", "29O"), PPCODE = c(4412, 
                      3567, 4412, 3567, 5123), PRICE = c(1.45, 1.21, 1.45, 1.21, 1.01
                       ), TYPE = c("LAR", "SMA", "SMA", "SMA", "LAR"
                       )), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"
                       ))

The task is to write such a script that if some ppcode is in the type SMA (small), but not in LAR (large) additional row should be added where the data related to that ppcode will be the same, but with type "LAR". So, in this provided example the line where SHOPID=NA, PPCODE=3567, PRICE= 1.21, TYPE=LAR should be added at the bottom. This should work not only for this particular case but and in general on the bigger dataset. I prefer dplyr library and tried with it, but other solutions are also welcome.

CodePudding user response:

Since you only specify SMA to LAR duplication, it's difficult to use tidyr::complete. Here's an alternative:

library(dplyr)
dat %>%
  group_by(PPCODE, PRICE) %>%
  filter(!"LAR" %in% TYPE) %>%
  ungroup() %>%
  mutate(SHOPID = NA, TYPE = "LAR") %>%
  distinct() %>%
  bind_rows(dat, .)
# # A tibble: 6 x 4
#   SHOPID PPCODE PRICE TYPE 
#   <chr>   <dbl> <dbl> <chr>
# 1 11T      4412  1.45 LAR  
# 2 2T1      3567  1.21 SMA  
# 3 33A      4412  1.45 SMA  
# 4 14B      3567  1.21 SMA  
# 5 29O      5123  1.01 LAR  
# 6 <NA>     3567  1.21 LAR  
  • Related