Complete a dataframe in R By ID upto selected values of the dataframe only

Time:11-01

I have created the following dataframe in R:

 library(tidyr)
 library(dplyr)
 DF11<- data.frame("ID"= c("A", "A", "A", "B", "B", "B", "B", "B"))
 DF11$X_F<-c(5, 7,9,6,7,8,9,10)
 DF11$X_A<-c(7, 8,9,3,6,7,9,10)

The dataframe looks as follows:

  ID X_F X_A
   A   5   7
   A   7   8
   A   9   9
   B   6   3
   B   7   6
   B   8   7
   B   9   9
   B  10  10

ID is the grouping variable. I would like to use dplyr to create the following dataframe.

  ID X_F X_A
  A   0  NA
  A   1  NA
  A   2  NA
  A   3  NA
  A   4  NA
  A   5   7
  A   7   8
  A   9   9
  A  10  NA
  A  11  NA
  A  12  NA
  B   0  NA
  B   1  NA
  B   2  NA
  B   3  NA
  B   4  NA
  B   5  NA
  B   6   3
  B   7   6
  B   8   7
  B   9   9
  B  10  10
  B  11  NA
  B  12  NA
  B  13  NA

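The resulting dataframe should take DF11 and group it by the ID column. Within each group, X_F should be completed from 0 up to that group's minimum value of X_F, and then from the group's maximum value of X_F up to max(X_F) + 3. Values between the minimum and the maximum that are missing from the original data (such as 6 and 8 for group A) should not be added.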

I tried the following code and was able to solve it partially.

 DF112 <- DF11 %>% group_by(ID) %>% complete(X_F = seq(0, max(X_F) + 3, by = 1))

  ID X_F X_A
   A   0  NA
   A   1  NA
   A   2  NA
   A   3  NA
   A   4  NA
   A   5   7
   A   6  NA
   A   7   8
   A   8  NA
   A   9   9
   A  10  NA
   A  11  NA
   A  12  NA
   B   0  NA
   B   1  NA
   B   2  NA
   B   3  NA
   B   4  NA
   B   5  NA
   B   6   3
   B   7   6
   B   8   7
   B   9   9
   B  10  10
   B  11  NA
   B  12  NA
   B  13  NA

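This also fills in the gaps between the minimum and the maximum (rows 6 and 8 for group A), which I do not want. How do I get the desired output shown above? I would appreciate any guidance.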

CodePudding user response:

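You can pass two sequences to complete(), combined with c(): one for the values below each group's minimum and one for the values above its maximum. The rows already in the data are kept, so only the missing values need to be listed: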

library(tidyr)
library(dplyr)

DF11 <- data.frame("ID" = c("A", "A", "A", "B", "B", "B", "B", "B"))
DF11$X_F <- c(5, 7, 9, 6, 7, 8, 9, 10)
DF11$X_A <- c(7, 8, 9, 3, 6, 7, 9, 10)

DF11 %>%
  group_by(ID) %>%
  complete(X_F = c(seq(0, min(X_F) - 1, by = 1), seq(max(X_F) + 1, max(X_F) + 3, by = 1))) %>%
  arrange(ID, X_F)

# A tibble: 25 × 3
# Groups:   ID [2]
   ID      X_F   X_A
   <chr> <dbl> <dbl>
 1 A         0    NA
 2 A         1    NA
 3 A         2    NA
 4 A         3    NA
 5 A         4    NA
 6 A         5     7
 7 A         7     8
 8 A         9     9
 9 A        10    NA
10 A        11    NA
11 A        12    NA
12 B         0    NA
13 B         1    NA
14 B         2    NA
15 B         3    NA
16 B         4    NA
17 B         5    NA
18 B         6     3
19 B         7     6
20 B         8     7
21 B         9     9
22 B        10    10
23 B        11    NA
24 B        12    NA
25 B        13    NA
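
To see what the two seq() calls contribute, here is a small sketch for group A alone, using plain vectors. The names x_f, lower, upper and new_values are only for illustration and are not part of the original code.

```r
# X_F values of group A from DF11
x_f <- c(5, 7, 9)

# values below the group's minimum: 0 up to min - 1
lower <- seq(0, min(x_f) - 1, by = 1)

# values above the group's maximum: max + 1 up to max + 3
upper <- seq(max(x_f) + 1, max(x_f) + 3, by = 1)

# the vector handed to complete() for this group
new_values <- c(lower, upper)
print(new_values)
# [1]  0  1  2  3  4 10 11 12
```

Since 6 and 8 are not in this vector and not in the original data, no rows are created for them. The arrange() call is there because the added rows would otherwise not be interleaved with the original ones in X_F order.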

Created on 2022-11-01 with reprex v2.0.2
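
One caveat: if a group's minimum X_F is already 0, seq(0, min(X_F) - 1, by = 1) becomes seq(0, -1, by = 1), which stops with a "wrong sign in 'by' argument" error. A possible variant, assuming X_F only holds non-negative whole numbers, builds both ranges with seq_len(), which returns an empty vector when there is nothing to add. DF_zero and res are made-up names for this sketch.

```r
library(tidyr)
library(dplyr)

# hypothetical data where the group minimum is already 0
DF_zero <- data.frame(ID = c("C", "C"), X_F = c(0, 2), X_A = c(4, 5))

res <- DF_zero %>%
  group_by(ID) %>%
  complete(X_F = c(seq_len(min(X_F)) - 1,      # 0 .. min - 1, empty if min is 0
                   max(X_F) + seq_len(3))) %>%  # max + 1 .. max + 3
  arrange(ID, X_F) %>%
  ungroup()

print(res)
```

For DF11 from the question this should give the same 25 rows as the seq() version above.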
