I have created the following dataframe in R:
library(tidyr)
library(dplyr)
DF11<- data.frame("ID"= c("A", "A", "A", "B", "B", "B", "B", "B"))
DF11$X_F<-c(5, 7,9,6,7,8,9,10)
DF11$X_A<-c(7, 8,9,3,6,7,9,10)
The dataframe looks as follows:
ID X_F X_A
A 5 7
A 7 8
A 9 9
B 6 3
B 7 6
B 8 7
B 9 9
B 10 10
ID is the grouping variable. I would like to use dplyr to create the following dataframe.
ID X_F X_A
A 0 NA
A 1 NA
A 2 NA
A 3 NA
A 4 NA
A 5 7
A 7 8
A 9 9
A 10 NA
A 11 NA
A 12 NA
B 0 NA
B 1 NA
B 2 NA
B 3 NA
B 4 NA
B 5 NA
B 6 3
B 7 6
B 8 7
B 9 9
B 10 10
B 11 NA
B 12 NA
B 13 NA
The result should take DF11, group it by ID, and complete X_F within each group over two ranges: from 0 up to the group's minimum X_F, and from the group's maximum X_F up to that maximum + 3. Values missing between the minimum and the maximum (such as 6 and 8 for group A) should not be filled in.
I tried the following code, which solves it only partially: the output below also contains the unwanted rows X_F = 6 and X_F = 8 for group A.
DF112 <- DF11 %>% group_by(ID) %>% complete(X_F = seq(0, max(X_F) + 3, by = 1))
ID X_F X_A
A 0 NA
A 1 NA
A 2 NA
A 3 NA
A 4 NA
A 5 7
A 6 NA
A 7 8
A 8 NA
A 9 9
A 10 NA
A 11 NA
A 12 NA
B 0 NA
B 1 NA
B 2 NA
B 3 NA
B 4 NA
B 5 NA
B 6 3
B 7 6
B 8 7
B 9 9
B 10 10
B 11 NA
B 12 NA
B 13 NA
How do I get the desired output shown above? Any guidance would be appreciated.
Answer:
You can pass complete() a single vector built from two sequences, one covering the values below each group's minimum and one covering the values above its maximum:
library(tidyr)
library(dplyr)
DF11 <- data.frame("ID" = c("A", "A", "A", "B", "B", "B", "B", "B"))
DF11$X_F <- c(5, 7, 9, 6, 7, 8, 9, 10)
DF11$X_A <- c(7, 8, 9, 3, 6, 7, 9, 10)
DF11 %>%
  group_by(ID) %>%
  # lower range: 0 .. min - 1; upper range: max + 1 .. max + 3
  complete(X_F = c(seq(0, min(X_F) - 1, by = 1), seq(max(X_F) + 1, max(X_F) + 3, by = 1))) %>%
  arrange(ID, X_F)
# A tibble: 25 × 3
# Groups: ID [2]
ID X_F X_A
<chr> <dbl> <dbl>
1 A 0 NA
2 A 1 NA
3 A 2 NA
4 A 3 NA
5 A 4 NA
6 A 5 7
7 A 7 8
8 A 9 9
9 A 10 NA
10 A 11 NA
11 A 12 NA
12 B 0 NA
13 B 1 NA
14 B 2 NA
15 B 3 NA
16 B 4 NA
17 B 5 NA
18 B 6 3
19 B 7 6
20 B 8 7
21 B 9 9
22 B 10 10
23 B 11 NA
24 B 12 NA
25 B 13 NA
Created on 2022-11-01 with reprex v2.0.2
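One caveat with the approach above: seq(0, min(X_F) - 1, by = 1) raises an error if a group's minimum is already 0, because the sequence would have to run backwards. The following is a minimal sketch of one way around that, assuming X_F only holds non-negative whole numbers. pad_values() is a hypothetical helper introduced here for illustration, not part of tidyr or dplyr.

```r
library(tidyr)
library(dplyr)

DF11 <- data.frame(
  ID  = c("A", "A", "A", "B", "B", "B", "B", "B"),
  X_F = c(5, 7, 9, 6, 7, 8, 9, 10),
  X_A = c(7, 8, 9, 3, 6, 7, 9, 10)
)

# Hypothetical helper: values to add below the minimum and above the maximum.
# Assumes x contains non-negative whole numbers.
pad_values <- function(x, upper = 3) {
  lower <- seq_len(min(x)) - 1          # 0 .. min(x) - 1; empty when min(x) is 0
  c(lower, max(x) + seq_len(upper))     # then max(x) + 1 .. max(x) + upper
}

result <- DF11 %>%
  group_by(ID) %>%
  complete(X_F = pad_values(X_F)) %>%   # evaluated separately for each ID
  arrange(ID, X_F) %>%
  ungroup()

print(result, n = Inf)
```

seq_len() returns an empty vector for 0 rather than counting down, so a group that already starts at 0 simply gets no lower padding. The upper argument makes the "+ 3" from the question adjustable.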