I have multiple dataframes with same columns. First one is df1
Name | I |
---|---|
Jack | 1.0 |
Louis | 1.0 |
Jack | 2.0 |
Louis | 5.0 |
Jack | 4.0 |
Mark | 2.0 |
- | - |
Mark | 3.0 |
df_2
Name | I |
---|---|
Jack | 3.0 |
Louis | 3.0 |
Jack | 2.0 |
Louis | 1.0 |
Jack | 6.0 |
Mark | 7.0 |
- | - |
Mark | 3.0 |
I should have a new dataframe ndf as
Name | res_df1 | res_df2 |
---|---|---|
Jack | 7.0 | 11.0 |
Louis | 6.0 | 4.0 |
Mark | 5.0 | 10.0 |
res_df1 and res_df2 are the sum grouped by name from corresponding dataframes. How to get res table. How to match the sum of group results from different dataframes and write the sum result to the corresponding group in new df. I have done like this:
frames =[df1, df2, ...df9]
ndf = pd.concat(frames)
ndf = ndf.drop_duplicates('Name')
ndf['res_df1'] = df1.groupby(['Name', sort=False)[I'].transform('sum').round(2)
ndf['res_df2'] = df2.groupby(['Name', sort=False)[I'].transform('sum').round(2)
---
ndf['res_df9'] = df9.groupby(['Name', sort=False)[I'].transform('sum').round(2)
But the problem is I can't get right sum.
CodePudding user response:
Try:
frames = [df_1, df_2]
final_df = pd.DataFrame()
for index, df in enumerate(frames):
df_count = df.groupby('Name')['l'].sum().reset_index(name=f'res_df{index}')
if index==0:
final_df = df_count.copy(deep=True)
else:
final_df = final_df.merge(df_count, how='outer', on='Name')
print(final_df)