I have this dataframe:
df <- data.frame(
  ID = 1:5,
  Subject = c("A", "A", "B", "B", "C"),
  Duration = c(3, 2, 2, 4, 5)
)
The task is straightforward: I need to repeat each row as many times as the value in its Duration column. For example, Duration in row #1 is 3, so this row should appear three times. Duration in row #2 is 2, so this row should appear twice, and so on. How can this be done?
Expected:
ID Subject Duration
1 1 A 3
2 1 A 3
3 1 A 3
4 2 A 2
5 2 A 2
6 3 B 2
7 3 B 2
8 4 B 4
9 4 B 4
10 4 B 4
11 4 B 4
12 5 C 5
13 5 C 5
14 5 C 5
15 5 C 5
16 5 C 5
I'm grateful for any solution, particularly a dplyr one.
CodePudding user response:
We could use slice:
library(dplyr)
df %>%
  slice(rep(row_number(), Duration))
ID Subject Duration
1 1 A 3
2 1 A 3
3 1 A 3
4 2 A 2
5 2 A 2
6 3 B 2
7 3 B 2
8 4 B 4
9 4 B 4
10 4 B 4
11 4 B 4
12 5 C 5
13 5 C 5
14 5 C 5
15 5 C 5
16 5 C 5
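For anyone wondering why the slice() call works: rep() repeats each row index as many times as the matching Duration value, and slice() then picks the rows in that order. Here is a minimal base R sketch of the same idea, using the df from the question. The names idx and expanded are only illustrative.

```r
# Sample data from the question
df <- data.frame(
  ID = 1:5,
  Subject = c("A", "A", "B", "B", "C"),
  Duration = c(3, 2, 2, 4, 5)
)

# Repeat each row index Duration times: 1 1 1 2 2 3 3 4 4 4 4 5 5 5 5 5
idx <- rep(seq_len(nrow(df)), times = df$Duration)

# Index the data frame with the repeated row numbers
expanded <- df[idx, ]

# Reset row names, which would otherwise look like 1, 1.1, 1.2, ...
rownames(expanded) <- NULL
expanded
```

This should give the same rows as the dplyr version, just without the extra dependency.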
CodePudding user response:
The function you need is tidyr::uncount. Setting .remove = FALSE keeps the Duration column, which uncount drops by default.
library(tidyr)
uncount(df, Duration, .remove = FALSE)
ID Subject Duration
1 1 A 3
2 1 A 3
3 1 A 3
4 2 A 2
5 2 A 2
6 3 B 2
7 3 B 2
8 4 B 4
9 4 B 4
10 4 B 4
11 4 B 4
12 5 C 5
13 5 C 5
14 5 C 5
15 5 C 5
16 5 C 5
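If you also need to tell the copies of a row apart, uncount() takes an .id argument that adds a counter within each original row. This goes a bit beyond what was asked, so treat it as an optional sketch. The column name "copy" and the variable expanded are just names picked for this example.

```r
library(tidyr)

# Sample data from the question
df <- data.frame(
  ID = 1:5,
  Subject = c("A", "A", "B", "B", "C"),
  Duration = c(3, 2, 2, 4, 5)
)

# .remove = FALSE keeps Duration; .id = "copy" numbers the copies of each row
expanded <- uncount(df, Duration, .remove = FALSE, .id = "copy")
expanded
```

The copy column restarts at 1 for each original row, which can be handy if the repeated rows later stand for separate time steps.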