I am trying to create a consolidated report from a larger DataFrame. I am having issues with mapping a MultiIndex series back to a DataFrame. I can get it working for Single Index Series. I want to explicitly use two columns as the criteria for the groupby and the use those two columns to map the series back to the dataFrame so that I know it is being mapped correctly in case a single DealNum has multiple FlowDates for the Interest and Premium FlowTypes.
Any suggestions on how I can accomplish this properly and if this is the best way to approach this? I've tried pivot_table but there are some cases in the larger data set where this doesn't work well.
Thanks.
Full code is below, the specific line I am having issues with is:
report['Interest-FlowDate-Daily_PnL'] = report[['DealNum','FlowDate']].map(interest_flowdate_series)
import pandas as pd
data = {
'DealNum': [1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6],
'FlowType': ['Interest', 'Premium', 'Fees',
'Interest', 'Premium', 'Fees',
'Interest', 'Premium', 'Fees',
'Interest', 'Premium', 'Fees',
'Interest', 'Premium', 'Fees',
'Interest', 'Premium', 'Fees'],
'FlowDate': ['12/31/2021', '12/31/2021', '1/10/2021',
'11/30/2021', '11/30/2021', '2/8/2021',
'1/31/2022', '1/31/2022', '5/10/2021',
'2/28/2022', '2/28/2022', '7/28/2021',
'12/15/2021', '12/15/2021', '12/15/2021',
'6/30/2022', '6/30/2022', '11/10/2021',],
'Daily_PnL': [10,0,-2.25,
30,0,-1.50,
100,0,-3.50,
200,0,-2.50,
300,0,-4.50,
0,-150,-1.50,],
'LTD_PnL': [1000,-150,-102.25,
3000,0,-101.50,
1400,0,-103.50,
2400,0,-102.50,
4000,0,-104.50,
0,-150,-1.50,]
}
df = pd.DataFrame(data)
# Create Base Report using only FlowDates from Interest and Premium FlowTypes
report = df[(df.FlowType == 'Interest') | (df.FlowType == 'Premium')][['DealNum','FlowDate']].drop_duplicates().reset_index(drop=True)
#Create Interest Series to add back to Base Report using DealNum & FlowDate as criteria
interest_flowdate_series = df[df.FlowType == 'Interest'].groupby(['DealNum','FlowDate']).sum()['Daily_PnL']
report['Interest-FlowDate-Daily_PnL'] = report[['DealNum','FlowDate']].map(interest_flowdate_series)
print(interest_flowdate_series)
print(report)
CodePudding user response:
I am not sure what your exact goal is, but it looks like you're using map
as a way to merge
/join
. Why not doing this?
report.set_index(['DealNum','FlowDate']).join(interest_flowdate_series).reset_index()
Or:
report.merge(interest_flowdate_series, left_on=['DealNum','FlowDate'], right_index=True)
Output:
DealNum FlowDate Fees-Daily_PnL Interest-Daily_PnL Daily_PnL
0 1 12/31/2021 -2.25 10.0 10.0
1 2 11/30/2021 -1.50 30.0 30.0
2 3 1/31/2022 -3.50 100.0 100.0
3 4 2/28/2022 -2.50 200.0 200.0
4 5 12/15/2021 -4.50 300.0 300.0
5 6 6/30/2022 -1.50 0.0 0.0