I have some earthquake data with Magnitude, Distance, and Percent columns. I want to group the rows by magnitude and sum the distances and percents for each magnitude. Here is part of my data:
import pandas as pd
data = {'Distance': [1, 5, 9, 3, 5, 4, 2, 3.1],
'Magnitude': [7.3, 7.3, 7.3, 6.0, 8.2, 6.0, 8.2, 5.7],
'Percent': [0.1, 0.05, 0.07, 0.11, 0.2, 0.07, 0.08,0.11]
}
df = pd.DataFrame(data)
print(df)
   Distance  Magnitude  Percent
0       1.0        7.3     0.10
1       5.0        7.3     0.05
2       9.0        7.3     0.07
3       3.0        6.0     0.11
4       5.0        8.2     0.20
5       4.0        6.0     0.07
6       2.0        8.2     0.08
7       3.1        5.7     0.11
My idea was to use groupby and sum:
df2 = df.groupby(['Distance','Magnitude','Percent'],as_index=False).agg({'Percent': 'sum'},{'Distance': 'sum'})
When I run this, I get the same dataframe back, sorted in ascending order by distance (which is fine), but nothing is grouped together or summed.
I want it to look like this:
   Distance  Magnitude  Percent
0       3.1        5.7     0.11
1       7.0        6.0     0.18
2      15.0        7.3     0.22
3       7.0        8.2     0.28
There is only one row for each magnitude, and the distances and percents have been summed for that magnitude.
Answer:
Your version groups by all three columns, so every row forms its own group and nothing gets summed. Group by Magnitude only and this will do the task:
df.groupby(by=["Magnitude"]).sum()
Output
           Distance  Percent
Magnitude
5.7             3.1     0.11
6.0             7.0     0.18
7.3            15.0     0.22
8.2             7.0     0.28
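To see why the choice of grouping key matters, here is a small check on the same data. It is only a sketch: the variable names `groups_all` and `groups_mag` are made up for illustration. It counts how many groups each key produces.

```python
import pandas as pd

df = pd.DataFrame({
    'Distance': [1, 5, 9, 3, 5, 4, 2, 3.1],
    'Magnitude': [7.3, 7.3, 7.3, 6.0, 8.2, 6.0, 8.2, 5.7],
    'Percent': [0.1, 0.05, 0.07, 0.11, 0.2, 0.07, 0.08, 0.11],
})

# Grouping by all three columns: every row is a unique combination,
# so there are as many groups as rows and nothing gets combined.
groups_all = df.groupby(['Distance', 'Magnitude', 'Percent']).ngroups

# Grouping by Magnitude alone: one group per distinct magnitude.
groups_mag = df.groupby('Magnitude').ngroups

print(groups_all, groups_mag)  # 8 4
```

Eight groups for eight rows means each "sum" covers a single row, which is why the result looked like the input.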
To keep Magnitude as a regular column instead of the index, as suggested by @lsr729, pass as_index=False:
df.groupby(by=["Magnitude"], as_index=False).sum()
Output
   Magnitude  Distance  Percent
0        5.7       3.1     0.11
1        6.0       7.0     0.18
2        7.3      15.0     0.22
3        8.2       7.0     0.28
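If you would rather spell out which columns get summed and get the Distance, Magnitude, Percent column order from the question, one possible variant uses pandas named aggregation. This is a sketch, not the only way to do it, and the name `result` is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Distance': [1, 5, 9, 3, 5, 4, 2, 3.1],
    'Magnitude': [7.3, 7.3, 7.3, 6.0, 8.2, 6.0, 8.2, 5.7],
    'Percent': [0.1, 0.05, 0.07, 0.11, 0.2, 0.07, 0.08, 0.11],
})

result = (
    df.groupby('Magnitude', as_index=False)
      # Named aggregation: output_column=(input_column, function)
      .agg(Distance=('Distance', 'sum'), Percent=('Percent', 'sum'))
      # Reorder the columns to match the layout asked for in the question
      [['Distance', 'Magnitude', 'Percent']]
)
print(result)
```

Named aggregation is handy if you later need a different function per column, for example a mean for Percent and a sum for Distance.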