Home > Blockchain >  MultiIndex Series Adding Back to DataFrame
MultiIndex Series Adding Back to DataFrame

Time:11-02

I am trying to create a consolidated report from a larger DataFrame. I am having issues with mapping a MultiIndex series back to a DataFrame. I can get it working for Single Index Series. I want to explicitly use two columns as the criteria for the groupby and the use those two columns to map the series back to the dataFrame so that I know it is being mapped correctly in case a single DealNum has multiple FlowDates for the Interest and Premium FlowTypes.

Any suggestions on how I can accomplish this properly and if this is the best way to approach this? I've tried pivot_table but there are some cases in the larger data set where this doesn't work well.

Thanks.

Full code is below, the specific line I am having issues with is:

report['Interest-FlowDate-Daily_PnL'] = report[['DealNum','FlowDate']].map(interest_flowdate_series)
import pandas as pd

data = {
        'DealNum': [1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6],
        'FlowType': ['Interest', 'Premium', 'Fees',
               'Interest', 'Premium', 'Fees',
               'Interest', 'Premium', 'Fees',
               'Interest', 'Premium', 'Fees',
               'Interest', 'Premium', 'Fees',
               'Interest', 'Premium', 'Fees'],
        'FlowDate': ['12/31/2021', '12/31/2021', '1/10/2021',
              '11/30/2021', '11/30/2021', '2/8/2021',
              '1/31/2022', '1/31/2022', '5/10/2021',
              '2/28/2022', '2/28/2022', '7/28/2021',
              '12/15/2021', '12/15/2021', '12/15/2021',
              '6/30/2022', '6/30/2022', '11/10/2021',],
        'Daily_PnL': [10,0,-2.25,
                      30,0,-1.50,
                      100,0,-3.50,
                      200,0,-2.50,
                      300,0,-4.50,
                      0,-150,-1.50,],
        'LTD_PnL':  [1000,-150,-102.25,
                    3000,0,-101.50,
                    1400,0,-103.50,
                    2400,0,-102.50,
                    4000,0,-104.50,
                    0,-150,-1.50,]
}

df = pd.DataFrame(data)

# Create Base Report using only FlowDates from Interest and Premium FlowTypes
report = df[(df.FlowType == 'Interest') | (df.FlowType == 'Premium')][['DealNum','FlowDate']].drop_duplicates().reset_index(drop=True)

#Create Interest Series to add back to Base Report using DealNum & FlowDate as criteria
interest_flowdate_series = df[df.FlowType == 'Interest'].groupby(['DealNum','FlowDate']).sum()['Daily_PnL']
report['Interest-FlowDate-Daily_PnL'] = report[['DealNum','FlowDate']].map(interest_flowdate_series)

print(interest_flowdate_series)
print(report)

CodePudding user response:

I am not sure what your exact goal is, but it looks like you're using map as a way to merge/join. Why not doing this?

report.set_index(['DealNum','FlowDate']).join(interest_flowdate_series).reset_index()

Or:

report.merge(interest_flowdate_series, left_on=['DealNum','FlowDate'], right_index=True)

Output:

   DealNum    FlowDate  Fees-Daily_PnL  Interest-Daily_PnL  Daily_PnL
0        1  12/31/2021           -2.25                10.0       10.0
1        2  11/30/2021           -1.50                30.0       30.0
2        3   1/31/2022           -3.50               100.0      100.0
3        4   2/28/2022           -2.50               200.0      200.0
4        5  12/15/2021           -4.50               300.0      300.0
5        6   6/30/2022           -1.50                 0.0        0.0
  • Related