How do you give a date range then have that daterange be appended to the dataframe?

Time:10-21

I know how to generate a date range using this code:

pd.date_range(start='2022-10-16', end='2022-10-19')

How do I take the date range above and repeat it for every location in the dataframe below?

+----------+
| Location |
+----------+
| A        |
| B        |
| C        |
+----------+

This is the result I want.

+----------+------------+
| Location |    Date    |
+----------+------------+
| A        | 2022/10/16 |
| A        | 2022/10/17 |
| A        | 2022/10/18 |
| A        | 2022/10/19 |
| B        | 2022/10/16 |
| B        | 2022/10/17 |
| B        | 2022/10/18 |
| B        | 2022/10/19 |
| C        | 2022/10/16 |
| C        | 2022/10/17 |
| C        | 2022/10/18 |
| C        | 2022/10/19 |
+----------+------------+

I have spent the whole day figuring this out. Any help would be appreciated!

CodePudding user response:

You can cross join your date range and dataframe to get your desired result:

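import pandas as pd

# df is the Location dataframe from the question
date_range = (pd.date_range(start='2022-10-16', end='2022-10-19')
                .rename('Date')
                .to_series())

# how='cross' (pandas 1.2 or newer) pairs every Location with every Date
df = df.merge(date_range, how='cross')
print(df)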

Output:

   Location       Date
0         A 2022-10-16
1         A 2022-10-17
2         A 2022-10-18
3         A 2022-10-19
4         B 2022-10-16
5         B 2022-10-17
6         B 2022-10-18
7         B 2022-10-19
8         C 2022-10-16
9         C 2022-10-17
10        C 2022-10-18
11        C 2022-10-19
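
Note that the Date column above holds datetime values, while the question shows the dates as 2022/10/16. If that exact text format matters, one option is to format the column with Series.dt.strftime after the join. A minimal sketch, assuming the sample dataframe from the question; the names dates and result are only for illustration, and the dates are wrapped in a one-column DataFrame here purely for simplicity:

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'Location': ['A', 'B', 'C']})
dates = pd.DataFrame({'Date': pd.date_range(start='2022-10-16', end='2022-10-19')})

# Cross join (needs pandas 1.2 or newer): every Location paired with every Date
result = df.merge(dates, how='cross')

# Render the dates as text in the YYYY/MM/DD layout used in the question
result['Date'] = result['Date'].dt.strftime('%Y/%m/%d')
print(result)
```

After strftime the column holds strings rather than datetimes, so any date arithmetic or filtering is probably best done before this step.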

CodePudding user response:

You seem to be looking for the Cartesian product of two iterables, which is something itertools.product can do. Take a look at this article.

In your case, you can try:

import pandas as pd
from itertools import product

# Test data:
df = pd.DataFrame(['A', 'B', 'C'], columns=['Location'])
dr = pd.date_range(start='2022-10-16', end='2022-10-19')

# Create the cartesian product:
res_df = pd.DataFrame(product(df['Location'], dr), columns=['Location', 'Date'])
print(res_df)
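
As a quick sanity check, the itertools.product result should contain one row per (Location, Date) pair, in the same order as a cross join. The sketch below compares the two approaches on the test data; the names merged, prod and same are only for illustration, and the cross join assumes pandas 1.2 or newer:

```python
import pandas as pd
from itertools import product

# Test data from the question
df = pd.DataFrame({'Location': ['A', 'B', 'C']})
dr = pd.date_range(start='2022-10-16', end='2022-10-19')

# Approach 1: cross join against a one-column frame of dates
merged = df.merge(pd.DataFrame({'Date': dr}), how='cross')

# Approach 2: Cartesian product of the two iterables
prod = pd.DataFrame(product(df['Location'], dr), columns=['Location', 'Date'])

# Compare row counts and the values column by column
same = (merged['Location'].tolist() == prod['Location'].tolist()
        and merged['Date'].tolist() == prod['Date'].tolist())
print(len(prod) == len(df) * len(dr), same)
```

Both approaches build the full product in memory, so the row count grows as the number of locations times the number of dates.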