Home > Software design >  how to duplicate rows with certain condition and create anew variable at the same time
how to duplicate rows with certain condition and create anew variable at the same time

Time:03-22

I have a df like below and I would like to transfer it to sth like the table on the right, how can I duplicate the rows with Type=="N" and add new var Grade?

Basically, if Type==N, then Grade can be S or W, that is why we need to duplicate the rows.

enter image description here

df<-structure(list(Type = c("N", "N", "S", "W"), Result = c(8, 9, 
7, 6)), row.names = c(NA, -4L), class = c("tbl_df", "tbl", "data.frame"
))

CodePudding user response:

Using some functions from tidyverse, you can use crossing to duplicate rows and add the "Grade" column at the same time, then filter to match your stated rules.

library(tidyverse)

result <- df %>% 
  crossing(data.frame(Grade = c('S', 'W'))) %>% 
  filter(Type == 'N' | Type == Grade)

  Type  Result Grade
  <chr>  <dbl> <chr>
1 N          8 S    
2 N          8 W    
3 N          9 S    
4 N          9 W    
5 S          7 S    
6 W          6 W    

CodePudding user response:

I think this approach is extensible to many more conditions assuming yours is the minimal example and you have a larger more complicated dataset.

library(dplyr)

df<-structure(list(Type = c("N", "N", "S", "W"), Result = c(8, 9, 
7, 6)), row.names = c(NA, -4L), class = c("tbl_df", "tbl", "data.frame"
))

df2 <- data.frame(Type2 = c("N", "N"), Grade = c("S", "W"))

df %>%
    select(Type, Result) %>%
    left_join(df2, by = c("Type" = "Type2")) %>%
    mutate(Grade = case_when(Type == "S" ~ "S", Type == "W" ~ "W", TRUE ~ Grade))
  Type  Result Grade
  <chr>  <dbl> <chr>
1 N          8 S    
2 N          8 W    
3 N          9 S    
4 N          9 W    
5 S          7 S    
6 W          6 W   

CodePudding user response:

A dplyr way: We could use bind_rows after using slice.

library(dplyr)

df %>% 
  slice(1:2) %>% 
  bind_rows(df) %>% 
  group_by(Type) %>% 
  arrange(Result, .by_group = TRUE) %>% 
  ungroup() %>% 
  mutate(Grade = rep(c("S","W"),length.out = n()), .before=2)
   Type  Grade Result
  <chr> <chr>  <dbl>
1 N     S          8
2 N     W          8
3 N     S          9
4 N     W          9
5 S     S          7
6 W     W          6

CodePudding user response:

Here is a possible data.table option:

library(data.table)
dt <- as.data.table(df)

output <- dt[, CJ(.SD$Type, c('S', 'W')), .(Result)][which(V1 == 'N' | V1 == V2), ]
setnames(output, c(names(dt), "Grade"))
setcolorder(output, c("Result", "Grade", "Type"))

Output

   Result Grade Type
1:      N     S    8
2:      N     W    8
3:      N     S    9
4:      N     W    9
5:      S     S    7
6:      W     W    6

CodePudding user response:

Another option is to use if_else() (or case_when() if there are more complex conditions) to return a list column of multiple values and unnest:

library(dplyr)
library(tidyr)

df %>%
  mutate(Grade = if_else(Type == "N", list(c("S", "W")), as.list(Type))) %>%
  unnest(Grade)

# A tibble: 6 x 3
  Type  Result Grade
  <chr>  <dbl> <chr>
1 N          8 S    
2 N          8 W    
3 N          9 S    
4 N          9 W    
5 S          7 S    
6 W          6 W  

Or:

df %>%
  mutate(Grade = case_when(Type == "N" ~ list(c("S", "W")),
                           TRUE ~ as.list(Type))) %>%
  unnest(Grade)
  •  Tags:  
  • r
  • Related