Home > Software engineering >  R studio: How to extend time series and fill in 0 for missing values?
R studio: How to extend time series and fill in 0 for missing values?

Time:10-17

I have two tables:

one is the "visits" table:

 -------- --------- -------- 
|  date  | user_id | visits |
 -------- --------- -------- 
| 1/1/18 |    A    |    2   |
 -------- --------- -------- 
| 1/2/18 |    A    |    4   |
 -------- --------- -------- 
| 1/3/18 |    A    |   10   |
 -------- --------- -------- 
| 1/4/18 |    A    |   34   |
 -------- --------- -------- 
| 1/5/18 |    A    |   23   |
 -------- --------- -------- 
| 1/1/18 |    B    |   15   |
 -------- --------- -------- 
| 1/2/18 |    B    |   12   |
 -------- --------- -------- 
| 1/1/18 |    C    |   10   |
 -------- --------- -------- 
| 1/1/18 |    D    |    5   |
 -------- --------- -------- 
| 1/2/18 |    D    |   12   |
 -------- --------- -------- 
| 1/3/18 |    D    |   15   |
 -------- --------- -------- 
| 1/4/18 |    D    |   25   |
 -------- --------- -------- 
| 1/1/18 |    E    |   18   |
 -------- --------- -------- 
| 1/1/18 |    G    |   21   |
 -------- --------- -------- 
| 1/2/18 |    G    |   10   |
 -------- --------- -------- 

Another one is the "location" table:

 --------- ---------- 
| user_id | location |
 --------- ---------- 
|    A    |     1    |
 --------- ---------- 
|    B    |     1    |
 --------- ---------- 
|    C    |     1    |
 --------- ---------- 
|    D    |     2    |
 --------- ---------- 
|    E    |     3    |
 --------- ---------- 
|    F    |     3    |
 --------- ---------- 
|    G    |     3    |
 --------- ---------- 

Note:

    1. If a user does not visit, he/she will not show up in the "visits" table. His/her visit is 0 that day.
    1. The "location" table has the complete population of users.

Question: I would like to extend the "visits" table, such that it looks like this:

 -------- --------- -------- 
| date   | user_id | visits |
 -------- --------- -------- 
| 1/1/18 |    A    | 2      |
 -------- --------- -------- 
| 1/2/18 |    A    | 4      |
 -------- --------- -------- 
| 1/3/18 |    A    | 10     |
 -------- --------- -------- 
| 1/4/18 |    A    | 34     |
 -------- --------- -------- 
| 1/5/18 |    A    | 23     |
 -------- --------- -------- 
| 1/1/18 |    B    | 15     |
 -------- --------- -------- 
| 1/2/18 |    B    | 12     |
 -------- --------- -------- 
| 1/3/18 |    B    | 0      |
 -------- --------- -------- 
| 1/4/18 |    B    | 0      |
 -------- --------- -------- 
| 1/5/18 |    B    | 0      |
 -------- --------- -------- 
| 1/1/18 |    C    | 10     |
 -------- --------- -------- 
| 1/2/18 |    C    | 0      |
 -------- --------- -------- 
| 1/3/18 |    C    | 0      |
 -------- --------- -------- 
| 1/4/18 |    C    | 0      |
 -------- --------- -------- 
| 1/5/18 |    C    | 0      |
 -------- --------- -------- 
| 1/1/18 |    D    | 5      |
 -------- --------- -------- 
| 1/2/18 |    D    | 12     |
 -------- --------- -------- 
| 1/3/18 |    D    | 15     |
 -------- --------- -------- 
| 1/4/18 |    D    | 25     |
 -------- --------- -------- 
| 1/5/18 |    D    | 0      |
 -------- --------- -------- 
| 1/1/18 |    E    | 18     |
 -------- --------- -------- 
| 1/2/18 |    E    | 0      |
 -------- --------- -------- 
| 1/3/18 |    E    | 0      |
 -------- --------- -------- 
| 1/4/18 |    E    | 0      |
 -------- --------- -------- 
| 1/5/18 |    E    | 0      |
 -------- --------- -------- 
| 1/1/18 |    F    | 0      |
 -------- --------- -------- 
| 1/2/18 |    F    | 0      |
 -------- --------- -------- 
| 1/3/18 |    F    | 0      |
 -------- --------- -------- 
| 1/4/18 |    F    | 0      |
 -------- --------- -------- 
| 1/5/18 |    F    | 0      |
 -------- --------- -------- 
| 1/1/18 |    G    | 21     |
 -------- --------- -------- 
| 1/2/18 |    G    | 10     |
 -------- --------- -------- 
| 1/3/18 |    G    | 0      |
 -------- --------- -------- 
| 1/4/18 |    G    | 0      |
 -------- --------- -------- 
| 1/5/18 |    G    | 0      |
 -------- --------- -------- 

A table in this format is easier for me to do further analysis with the whole population in one table.

I would like to code this in R, ideally using tidyverse. I can't wrap my head around how to achieve this. Appreciate any insights and help into this. Thanks so much!

CodePudding user response:

We may need complete here

library(dplyr)
library(tidyr)
visits %>% 
   complete(date, user_id = location$user_id, fill = list(visits = 0))
  • Related