Home > Enterprise >  Impute missing records based on week and year in r
Impute missing records based on week and year in r

Time:08-25

I want to impute missing weeks record with 0 values in duration column for each household, individual combination.

enter image description here

The minimum week here is w51 of 2021 and goes upto w4 of 2022

For Household 1001 - individual 1 combination, week 3 is missing in the sequence.

Household 1002 - individual 2, week 52,week 2 and week 4 is missing

Final dataset would be:

enter image description here

what I tried is using complete function from tidyr after group by with household and individual but its not working.

In actual dataset minimium and maximum weeks will be changing.

Here is the sample dataset

 data <- data.frame(household=c(1001,1001,1001,1001,1001,1002,1002,1002,1003,1003,1003),
               individual = c(1,1,1,1,1,2,2,2,1,1,1),
               year = c(2021,2021,2022,2022,2022,2021,2022,2022,2022,2022,2022),
               week =c("w51","w52","w1","w2","w4","w51","w1","w3","w1","w2","w3"),
               duration =c(20,23,24,56,78,12,34,67,87,89,90))

CodePudding user response:

Using the examples on the ?complete help page, you can use nesting() to give you what you want

data %>% 
  complete(nesting(household, individual), nesting(year, week), fill=list(duration=0))
#    household individual  year week  duration
#        <dbl>      <dbl> <dbl> <chr>    <dbl>
#  1      1001          1  2021 w51         20
#  2      1001          1  2021 w52         23
#  3      1001          1  2022 w1          24
#  4      1001          1  2022 w2          56
#  5      1001          1  2022 w3           0
#  6      1001          1  2022 w4          78
#  7      1002          2  2021 w51         12
#  8      1002          2  2021 w52          0
#  9      1002          2  2022 w1          34
# 10      1002          2  2022 w2           0
# 11      1002          2  2022 w3          67
# 12      1002          2  2022 w4           0
# 13      1003          1  2021 w51          0
# 14      1003          1  2021 w52          0
# 15      1003          1  2022 w1          87
# 16      1003          1  2022 w2          89
# 17      1003          1  2022 w3          90
# 18      1003          1  2022 w4           0
  •  Tags:  
  • r
  • Related