Duplicating rows conditionally in R

Time:04-22

I would like to duplicate each observation based on the count. For example:

  • If count == 3, the observation should appear three times in total, with count set to 1 in each copy.
  • If count == 1, no changes are required.
# Sample data
library(tibble)

df <- tibble(
  x = c("A", "C", "C", "B", "C", "A", "A"),
  y = c("Y", "N", "Y", "N", "N", "N", "Y"),
  count = c(1, 1, 3, 2, 1, 1, 1)
)

# Target output (stored under a separate name so the sample data is not overwritten)
target <- tibble(
  x = c("A", "C", "C", "C", "C", "B", "B", "C", "A", "A"),
  y = c("Y", "N", "Y", "Y", "Y", "N", "N", "N", "N", "Y"),
  count = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
)

CodePudding user response:

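Using dplyr and tidyr, expand the rows with uncount() and then set count to 1 in every row. Testing only for count == 3 would leave the rows with count == 2 unchanged.

library(dplyr)
library(tidyr)
df %>% uncount(count) %>%   # repeat each row `count` times (drops the count column)
  mutate(count = 1)         # add count back as 1 for every row

The output is

   x     y     count
   <chr> <chr> <dbl>
 1 A     Y         1
 2 C     N         1
 3 C     Y         1
 4 C     Y         1
 5 C     Y         1
 6 B     N         1
 7 B     N         1
 8 C     N         1
 9 A     N         1
10 A     Y         1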
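
If you would rather avoid extra packages, the same expansion can be sketched in base R. This is only an illustration, assuming the sample data from the question stored as a plain data.frame; the names idx and expanded are made up for this example. The idea is to repeat each row index count times with rep() and then overwrite count with 1.

```r
# Sample data from the question, as a base R data.frame
df <- data.frame(
  x = c("A", "C", "C", "B", "C", "A", "A"),
  y = c("Y", "N", "Y", "N", "N", "N", "Y"),
  count = c(1, 1, 3, 2, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Repeat each row index as many times as its count says
idx <- rep(seq_len(nrow(df)), times = df$count)

# Select the repeated rows, then reset count to 1 everywhere
expanded <- df[idx, ]
expanded$count <- 1

# Drop the duplicated row names left over from the indexing
rownames(expanded) <- NULL

expanded
```

This relies on every count being a non-negative whole number; rows with count 0 would be dropped, which may or may not be what you want.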