Pandas: merging rows and summing values?

I have some earthquake data with three columns I care about: Magnitude, Distance, and Percent. I want to group the rows by Magnitude and sum the Distance and Percent values for each magnitude. Here is part of my data:

import pandas as pd

data = {'Distance': [1, 5, 9, 3, 5, 4, 2, 3.1],
        'Magnitude': [7.3, 7.3, 7.3, 6.0, 8.2, 6.0, 8.2, 5.7],
        'Percent': [0.1, 0.05, 0.07, 0.11, 0.2, 0.07, 0.08,0.11]
       }

df = pd.DataFrame(data)

print(df)


   Distance  Magnitude  Percent
0       1.0        7.3     0.10
1       5.0        7.3     0.05
2       9.0        7.3     0.07
3       3.0        6.0     0.11
4       5.0        8.2     0.20
5       4.0        6.0     0.07
6       2.0        8.2     0.08
7       3.1        5.7     0.11

My idea was to use groupby and sum:

df2 = df.groupby(['Distance','Magnitude','Percent'],as_index=False).agg({'Percent': 'sum'},{'Distance': 'sum'})

Running this gives me back the same dataframe, except that it is sorted in ascending order by Distance, which is fine, but nothing is grouped together or summed.

I want it to look like this:

   Distance  Magnitude  Percent
0       3.1        5.7     0.11
1       7.0        6.0     0.18
2      15.0        7.3     0.22
3       7.0        8.2     0.28

There is only one row for each magnitude, and the distances and percents have been summed for each magnitude.

Answer:

This will do the task; you just need to group by Magnitude only:

df.groupby(by=["Magnitude"]).sum()

Output

           Distance  Percent
Magnitude                   
5.7             3.1     0.11
6.0             7.0     0.18
7.3            15.0     0.22
8.2             7.0     0.28
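
For context on why the attempt in the question appeared to do nothing, here is a small sketch (the variable names are only illustrative). It counts the groups that each grouping key produces on the sample data: when all three columns are used as keys, every row is its own group, so there is nothing left to sum.

```python
import pandas as pd

df = pd.DataFrame({
    'Distance': [1, 5, 9, 3, 5, 4, 2, 3.1],
    'Magnitude': [7.3, 7.3, 7.3, 6.0, 8.2, 6.0, 8.2, 5.7],
    'Percent': [0.1, 0.05, 0.07, 0.11, 0.2, 0.07, 0.08, 0.11],
})

# Grouping by all three columns: each row has a unique combination of
# values, so there are as many groups as rows and nothing is combined.
groups_all_columns = df.groupby(['Distance', 'Magnitude', 'Percent']).ngroups

# Grouping by Magnitude alone: one group per distinct magnitude.
groups_magnitude = df.groupby('Magnitude').ngroups

print(groups_all_columns)  # 8
print(groups_magnitude)    # 4
```

The columns you want summed should be left out of the grouping keys; only the column that identifies a group belongs there.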

To keep Magnitude as a regular column instead of turning it into the index, as @lsr729 suggested, pass as_index=False:

df.groupby(by=["Magnitude"], as_index=False).sum()

Output

   Magnitude  Distance  Percent
0        5.7       3.1     0.11
1        6.0       7.0     0.18
2        7.3      15.0     0.22
3        8.2       7.0     0.28
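
If you would rather spell out the aggregation per column, as the question tried to do, one possible variant is to pass agg a single dict that maps each column to its function. This is only a sketch on the sample data; the name totals is illustrative, and it should give the same sums as the calls above.

```python
import pandas as pd

df = pd.DataFrame({
    'Distance': [1, 5, 9, 3, 5, 4, 2, 3.1],
    'Magnitude': [7.3, 7.3, 7.3, 6.0, 8.2, 6.0, 8.2, 5.7],
    'Percent': [0.1, 0.05, 0.07, 0.11, 0.2, 0.07, 0.08, 0.11],
})

# Group by Magnitude only, and give agg one dict covering both columns.
totals = (
    df.groupby('Magnitude', as_index=False)
      .agg({'Distance': 'sum', 'Percent': 'sum'})
)

print(totals)
```

The dict form is handy when the columns need different functions, for example 'sum' for Distance and 'mean' for Percent.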