I have the following dataframe df (dput below):
> df
group seq id
1 A -5 NA
2 A -4 NA
3 A -3 NA
4 A -2 1
5 A -1 1
6 A 0 NA
7 A 1 NA
8 A 2 NA
9 A 3 NA
10 A 4 NA
11 A 5 NA
12 A -5 NA
13 A -4 NA
14 A -3 NA
15 A -2 NA
16 A -1 NA
17 A 0 NA
18 A 1 5
19 A 2 NA
20 A 3 NA
21 A 4 NA
22 A 5 NA
23 B -5 NA
24 B -4 8
25 B -3 8
26 B -2 8
27 B -1 NA
28 B 0 NA
29 B 1 NA
30 B 2 NA
31 B 3 NA
32 B 4 NA
33 B 5 NA
34 B -5 NA
35 B -4 NA
36 B -3 NA
37 B -2 NA
38 B -1 NA
39 B 0 NA
40 B 1 NA
41 B 2 NA
42 B 3 4
43 B 4 NA
44 B 5 NA
I would like to fill the NA values in the id column with the existing id value within each run of the seq column. Each run in this case goes from -5 to 5. So for the first run, the NAs should be filled with 1 for every seq from -5 to 5, and for the next run with 5. Here is the desired output:
group seq id
1 A -5 1
2 A -4 1
3 A -3 1
4 A -2 1
5 A -1 1
6 A 0 1
7 A 1 1
8 A 2 1
9 A 3 1
10 A 4 1
11 A 5 1
12 A -5 5
13 A -4 5
14 A -3 5
15 A -2 5
16 A -1 5
17 A 0 5
18 A 1 5
19 A 2 5
20 A 3 5
21 A 4 5
22 A 5 5
23 B -5 8
24 B -4 8
25 B -3 8
26 B -2 8
27 B -1 8
28 B 0 8
29 B 1 8
30 B 2 8
31 B 3 8
32 B 4 8
33 B 5 8
34 B -5 4
35 B -4 4
36 B -3 4
37 B -2 4
38 B -1 4
39 B 0 4
40 B 1 4
41 B 2 4
42 B 3 4
43 B 4 4
44 B 5 4
As you can see, the first run is filled with 1 and the second with 5 across the whole seq range. So I was wondering if anyone knows how to fill in the ids with the available value within each sequence range?
dput of df:
df<-structure(list(group = c("A", "A", "A", "A", "A", "A", "A", "A",
"A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A",
"A", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B",
"B", "B", "B", "B", "B", "B", "B", "B", "B", "B"), seq = c(-5,
-4, -3, -2, -1, 0, 1, 2, 3, 4, 5, -5, -4, -3, -2, -1, 0, 1, 2,
3, 4, 5, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, -5, -4, -3, -2,
-1, 0, 1, 2, 3, 4, 5), id = c(NA, NA, NA, 1, 1, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, 5, NA, NA, NA, NA, NA, 8, 8,
8, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
4, NA, NA)), class = "data.frame", row.names = c(NA, -44L))
CodePudding user response:
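library(dplyr)
# unique (group, id) pairs from the rows where id is not NA; column 2 (seq) is dropped
df2_left <- df[complete.cases(df), -2] %>% unique()
# unique (group, seq) combinations
df2_right <- df %>%
  select(group, seq) %>%
  unique()
# join on group so that every id is paired with every seq value of its group
merge(df2_left, df2_right, by = "group")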
group id seq
1 A 1 -5
2 A 1 -4
3 A 1 -3
4 A 1 -2
5 A 1 -1
6 A 1 0
7 A 1 1
8 A 1 2
9 A 1 3
10 A 1 4
11 A 1 5
12 A 5 -5
13 A 5 -4
14 A 5 -3
15 A 5 -2
16 A 5 -1
17 A 5 0
18 A 5 1
19 A 5 2
20 A 5 3
21 A 5 4
22 A 5 5
23 B 8 -5
24 B 8 -4
25 B 8 -3
26 B 8 -2
27 B 8 -1
28 B 8 0
29 B 8 1
30 B 8 2
31 B 8 3
32 B 8 4
33 B 8 5
34 B 4 -5
35 B 4 -4
36 B 4 -3
37 B 4 -2
38 B 4 -1
39 B 4 0
40 B 4 1
41 B 4 2
42 B 4 3
43 B 4 4
44 B 4 5
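Note that the merge above pairs every id with every seq value of its group, so it relies on all runs in a group covering the same seq range, and it returns the columns as group, id, seq. If you would rather fill the NAs in place and keep the original row and column order, here is a minimal base R sketch. It assumes that seq increases within a run, so a new run starts wherever seq drops, and that each run holds exactly one distinct non-NA id. The helper name `block` is only illustrative, and the data frame is rebuilt here with the same values as the question so the snippet runs on its own.

```r
# Rebuild the example data with the same values as in the question
df <- data.frame(
  group = rep(c("A", "B"), each = 22),
  seq   = rep(-5:5, times = 4),
  id    = NA_real_
)
df$id[c(4, 5)] <- 1
df$id[18]      <- 5
df$id[24:26]   <- 8
df$id[42]      <- 4

# Assumption: seq ascends within a run, so a drop in seq marks the start of a new run
block <- cumsum(c(TRUE, diff(df$seq) < 0))

# Within each group/run, spread the single known id over the whole run.
# Runs with zero or several distinct ids are left untouched.
df$id <- ave(df$id, df$group, block, FUN = function(x) {
  known <- unique(x[!is.na(x)])
  if (length(known) == 1) rep(known, length(x)) else x
})

df
```

Leaving ambiguous runs unchanged is a deliberate choice in this sketch: if a run ever contained two different ids, it seems safer to keep the NAs than to pick one of the values silently.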