Home > database >  How do i iterate over rows in a data frame then append date range from 2020 to 2023 in pandas to eac
How do i iterate over rows in a data frame then append date range from 2020 to 2023 in pandas to eac

Time:04-14

Hi i have a simple dataframe consisting of the employee, and how much they have to hit for their sales. The column, dailytarget.

df = pd.DataFrame({'EmployeeNo':[11111,11112,11113,11115,11116,11128],
                   'OutletName':['Outlet1', 'Outlet2', 'Outlet3','Outlet4', 'Outlet5','Outlet6'],
                   'EmployeeName':['John','Tom','Bob','Sam', 'Sean', 'Zac'],
                   'TargetAmount':[1000,500,400,500,300,800]})
df

So i would need to add the daterange 2020 to 2023, then for each date, append the dataframe. For example,

(2020-01-01,11111,outlet1,Tom,$1,000)
(2020-01-02,11111,outlet1,Tom,$1,000)
(2020-01-03,11111,outlet1,Tom,$1,000)
(2020-01-04,11111,outlet1,Tom,$1,000)
.....
(2023-12-31,11111,outlet1,Tom,$1,000)
****Then move on to the next employee****
(2020-01-01,11112,outlet2,John,$500)
(2020-01-02,11112,outlet2,John,$500)
(2020-01-03,11112,outlet2,John,$500)
...
(2023-12-31,11112,outlet2,John,$500)

and so on for all employees. How do i do this?

I thought about getting pd.date_range('2020-01-01', '2023-12-31')

but i couldnt figure out how to concat or append them. Please advise thank you!

CodePudding user response:

First off, thank you for formatting the question so nicely!

My guess is that you're close, but perhaps running into an error like:

ValueError: Length of values (1461) does not match length of index (6)

To get past this, and get your desired result I did:

import pandas as pd

df = pd.DataFrame({'EmployeeNo':[11111,11112,11113,11115,11116,11128],
                   'OutletName':['Outlet1', 'Outlet2', 'Outlet3','Outlet4', 'Outlet5','Outlet6'],
                   'EmployeeName':['John','Tom','Bob','Sam', 'Sean', 'Zac'],
                   'TargetAmount':[1000,500,400,500,300,800]})

df['date'] = pd.Series([pd.date_range('2020-01-01', '2023-12-31') for x in range(df.shape[0])])
df = df.explode('date')
print(df)

Output:

    EmployeeNo OutletName EmployeeName  TargetAmount       date
0        11111    Outlet1         John          1000 2020-01-01
0        11111    Outlet1         John          1000 2020-01-02
0        11111    Outlet1         John          1000 2020-01-03
0        11111    Outlet1         John          1000 2020-01-04
0        11111    Outlet1         John          1000 2020-01-05
..         ...        ...          ...           ...        ...
5        11128    Outlet6          Zac           800 2023-12-27
5        11128    Outlet6          Zac           800 2023-12-28
5        11128    Outlet6          Zac           800 2023-12-29
5        11128    Outlet6          Zac           800 2023-12-30
5        11128    Outlet6          Zac           800 2023-12-31

[8766 rows x 5 columns]
  • Related