My dataframe looks like this:
data1 = {'date': '2018-09-14',
'base_currency': 'EUR',
'target_currency': ['NZD', 'AUD', 'HKD'],
'exchange_rate': [11.7778,1.6244,9.1755] }
data2 = {'date': '2018-09-17',
'base_currency': 'EUR',
'target_currency': ['NZD', 'AUD', 'HKD'],
'exchange_rate': [1.7742,1.6258,9.1579] }
data3 = {'date': '2018-09-18',
'base_currency': 'EUR',
'target_currency': ['NZD', 'AUD', 'HKD'],
'exchange_rate': [1.7752,1.6260,9.1725] }
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
df3 = pd.DataFrame(data3)
df = pd.concat([df1,df2, df3], ignore_index=True)
df
date base target exchange_rate
0 2018-09-14 EUR NZD 11.7778
1 2018-09-14 EUR AUD 1.6244
2 2018-09-14 EUR HKD 9.1755
3 2018-09-17 EUR NZD 1.7742
4 2018-09-17 EUR AUD 1.6258
5 2018-09-17 EUR HKD 9.1579
6 2018-09-18 EUR NZD 1.7752
7 2018-09-18 EUR AUD 1.6260
8 2018-09-18 EUR HKD 9.1725
I want to fill in 2018-09-15 and 2018-09-16 with the values for the previous date, 2018-09-14 for each currency.
I have converted the dates to the datetime type, created an index and generated a currency list.
df['date'] = pd.to_datetime(df['date'], format='%Y/%m/%d')
idx = pd.date_range('09-14-2018', '09-18-2018')
curr_lst = df.target_currency.unique().tolist()
Can you help me with a loop from here? I probably need to loop through each currency, and do a ffill().
I can do it with one currency only:
df_nzd = df.loc[df.target_currency == 'NZD']
df_nzd.set_index('date', inplace=True)
df_nzd = df_nzd.resample('D').ffill().reset_index()
df_nzd
date base target exchange_rate
0 2018-09-14 EUR NZD 11.7778
1 2018-09-15 EUR NZD 11.7778
2 2018-09-16 EUR NZD 11.7778
3 2018-09-17 EUR NZD 1.7742
4 2018-09-18 EUR NZD 1.7752
Of course, there may be a better way to do it.
CodePudding user response:
You've already found almost all methods you can use to achieve what you want. You can use unstack/stack to apply on all currencies:
data1 = {'date': '2018-09-14',
'base_currency': 'EUR',
'target_currency': ['NZD', 'AUD', 'HKD'],
'exchange_rate': [11.7778,1.6244,9.1755] }
data2 = {'date': '2018-09-17',
'base_currency': 'EUR',
'target_currency': ['NZD', 'AUD', 'HKD'],
'exchange_rate': [1.7742,1.6258,9.1579] }
data3 = {'date': '2018-09-18',
'base_currency': 'EUR',
'target_currency': ['NZD', 'AUD', 'HKD'],
'exchange_rate': [1.7752,1.6260,9.1725] }
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
df3 = pd.DataFrame(data3)
df = pd.concat([df1,df2, df3], ignore_index=True)
df['date'] = pd.to_datetime(df['date'])
df.set_index(['date', 'target_currency'], inplace=True)
df = df.unstack(level=-1)
df = df.resample('D').ffill()
df = df.stack()
print(df)
Output:
base_currency exchange_rate
date target_currency
2018-09-14 AUD EUR 1.6244
HKD EUR 9.1755
NZD EUR 11.7778
2018-09-15 AUD EUR 1.6244
HKD EUR 9.1755
NZD EUR 11.7778
2018-09-16 AUD EUR 1.6244
HKD EUR 9.1755
NZD EUR 11.7778
2018-09-17 AUD EUR 1.6258
HKD EUR 9.1579
NZD EUR 1.7742
2018-09-18 AUD EUR 1.6260
HKD EUR 9.1725
NZD EUR 1.7752
CodePudding user response:
You can craft a MultiIndex and reindex
, then ffill
per target:
date_lst = pd.date_range('09-14-2018', '09-18-2018')
curr_lst = df.target_currency.unique().tolist()
idx = pd.MultiIndex.from_product([date_lst, curr_lst], names=['date', 'target_currency'])
df2 = (df
.set_index(['date', 'target_currency'])
.reindex(idx)
.groupby(level='target_currency').ffill()
.reset_index()
)
Output:
date target_currency base_currency exchange_rate
0 2018-09-14 NZD EUR 11.7778
1 2018-09-14 AUD EUR 1.6244
2 2018-09-14 HKD EUR 9.1755
3 2018-09-15 NZD EUR 11.7778
4 2018-09-15 AUD EUR 1.6244
5 2018-09-15 HKD EUR 9.1755
6 2018-09-16 NZD EUR 11.7778
7 2018-09-16 AUD EUR 1.6244
8 2018-09-16 HKD EUR 9.1755
9 2018-09-17 NZD EUR 1.7742
10 2018-09-17 AUD EUR 1.6258
11 2018-09-17 HKD EUR 9.1579
12 2018-09-18 NZD EUR 1.7752
13 2018-09-18 AUD EUR 1.6260
14 2018-09-18 HKD EUR 9.1725