I have two dataframes, df_rates and df_profit as shown below. df_rates has a time-date value its column name with values as certain rates and the index values denotes the minutes before the time-date value of the column. (i.e. row 1 denotes 0 mins before 2012-03-31 23:45:00, row 2 denotes 5 mins before 2012-03-31 23:45:00 and so on). Whereas df_profit has timestamps as its index and there is a Profit column.
I want to add the Profit column from df_profit as a new column to df_rates under the following condition:
If the column name of df_rates is '2012-03-31 23:45:00', then find the timestamp that is 30 mins before it i.e. '2012-03-31 23:15:00' in the index of df_profit, then populate the new column with the corresponding profit value (-21.48) at row where 'Mins before time' is 0.
The next value (-8.538) in line from Profit column of df_profit, where the timestamp is ‘2012-03-31 23:00:00’ should be populated in the new column against row where ‘Mins before time’ is 15 and so on.
With some help I implemented the below code but that grabs and populates the value from the matching timestamp on df_profit index. I am unsure on how to grab value from the df_profit index that is 30 mins before the column name of df_rates. Could someone kindly help?
df_rates
Mins before time 2012-03-31 23:45:00
0 113.1
5 112.1
10 113.1
15 113.17
20 103.17
25 133.17
30 101.39
df_profit
Profit
2012-04-01 00:30:00 251.71
2012-04-01 00:15:00 652.782
2012-04-01 00:00:00 458.099
2012-03-31 23:45:00 3504.664
2012-03-31 23:30:00 1215.76
2012-03-31 23:15:00 -21.48
2012-03-31 23:00:00 -8.538
2012-03-31 22:40:00 -5.11
Expected dataframe:
Mins before time 2022-01-31 23:45:00 01:00 New_column
0 113.1 -21.48
5 112.1
10 113.1
15 113.17 -8.538
20 103.17
25 133.17
30 101.39 -5.11
Implemented code :
df_rates['New column'] = df_profit.Profit.reindex(pd.to_datetime(df_rates.columns[-1]) - pd.to_timedelta(df_rates['Mins before time'], unit='min')).to_numpy()
CodePudding user response:
Like this:
anchor_time = df_rates.columns[-1]
lookback_minutes = 30
df_rates = ( df_rates
.set_index(anchor_time - pd.to_timedelta(df_rates['Mins before time'] lookback_minutes, unit='min'))
.join(df_profit).reset_index(drop=True).rename(columns={'Profit':'New_column'}) )
Output:
Mins before time 2012-03-31 23:45:00 New_column
0 0 113.10 -21.480
1 5 112.10 NaN
2 10 113.10 NaN
3 15 113.17 -8.538
4 20 103.17 NaN
5 25 133.17 NaN
6 30 101.39 -5.110