Home > Enterprise >  Create a data frame that includes all dates between a range for each sku
Create a data frame that includes all dates between a range for each sku

Time:04-11

I am trying to create a data frame that includes all the dates between '1/1/2019' and '28/02/2022', for each of 3 countries and each of 9 SKUs. I am following this approach for days that is working fine:

days = pd.DataFrame(pd.date_range(start='1/1/2019', end='28/02/2022',freq='D'))
for i in range(26):
    days=days.append(pd.DataFrame(pd.date_range(start='1/1/2019', end='28/02/2022',freq='D')))
days.rename(columns={0:'Date'},inplace=True)
days.reset_index(inplace=True,drop=True)

This data frame gives a column with 31185 rows that correspond to the 1155 days between that range multiplied by 27 (3x9). Now I would like to assign the country name and sku name. To populate the countries I am following this manual approach, but it takes a while to load. Is there a smarter way to do it? I would like to do the same for the 9 SKUs.

for i in range(0,10395):
    days.loc[i,'country']= 'Austria'
for i in range(10395,20790):
    days.loc[i,'country']= 'Germany'
for i in range(20790,31185):
    days.loc[i,'country']= 'Switzerland'

Thanks in advance!

CodePudding user response:

Try days.loc[range(0,10395),'country']= 'Austria'

CodePudding user response:

A more efficient apporach is cross-multiply the 3 arrays representing Date, Country and SKU together:

df = pd.MultiIndex.from_product([
    pd.date_range(start='1/1/2019', end='28/02/2022', freq='D'),
    ['Austria', 'Germany', 'Switzerland'],
    [f'SKU {i}' for i in range(9)]
], names=['Date', 'Country', 'SKU']).to_frame(index=False)
  • Related