Let's say I have a dataframe called df:
A  10
A  20
   15
   20
B  10
B  10
The result I want is:
A  30
   35
B  20
CodePudding user response:
I imagine your blanks are actually NaNs; in that case, use dropna=False:
df.groupby('col1', dropna=False).sum()
If they really are empty strings, then it should work with the default.
Example:
import pandas as pd

df = pd.DataFrame({'col1': ['A', 'A', float('nan'), float('nan'), 'B', 'B'],
                   'col2': [10, 20, 15, 20, 10, 10]})
# dropna=False keeps the NaN key as its own group instead of discarding it
df.groupby('col1', dropna=False).sum()
Output:
      col2
col1
A       30
B       20
NaN     35
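For the empty-string case mentioned above, here is a minimal sketch. It assumes the blanks in col1 are '' rather than NaN; the sort=False argument is my addition so that the groups come out in order of first appearance, as in the question's desired result.

```python
import pandas as pd

# Assumption: the blanks are empty strings, not NaN
df = pd.DataFrame({'col1': ['A', 'A', '', '', 'B', 'B'],
                   'col2': [10, 20, 15, 20, 10, 10]})

# '' is an ordinary group key, so the default dropna=True does not discard it.
# sort=False keeps the groups in order of first appearance (A, '', B).
out = df.groupby('col1', sort=False).sum()
print(out)
```

Without sort=False the keys are sorted, so the empty-string group would be listed first.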
CodePudding user response:
Group consecutive rows that share the same 'col1' value into a custom group, then aggregate each column.
Suppose your dataframe has two columns, 'col1' and 'col2':
>>> df
  col1  col2
0    A    10  # <- group 1
1    A    20  # <- group 1
2         15  # <- group 2
3         20  # <- group 2
4    B    10  # <- group 3
5    B    10  # <- group 3
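# Start a new group whenever col1 differs from the previous row, then number the runs
grp = df.iloc[:, 0].ne(df.iloc[:, 0].shift()).cumsum()
# Within each run, keep the first label and sum the values
out = df.groupby(grp, as_index=False).agg({'col1': 'first', 'col2': 'sum'})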
Output:
>>> out
  col1  col2
0    A    30
1         35
2    B    20
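One caveat with the consecutive-run approach: it relies on the blanks comparing equal to each other. That holds for empty strings, but NaN is never equal to NaN, so each NaN row would start its own group. A possible sketch for that case, assuming the blanks are NaN and that '' is an acceptable placeholder (both the placeholder and the names key and grp are my own choices for illustration):

```python
import pandas as pd

# Assumption: the blanks are NaN
df = pd.DataFrame({'col1': ['A', 'A', float('nan'), float('nan'), 'B', 'B'],
                   'col2': [10, 20, 15, 20, 10, 10]})

# Replace NaN with a placeholder so consecutive blanks compare equal
key = df['col1'].fillna('')

# True where a row differs from the previous one; cumsum numbers the runs 1, 1, 2, 2, 3, 3
grp = key.ne(key.shift()).cumsum().rename('grp')

# Aggregate per run: first label, summed values
out = (df.assign(col1=key)
         .groupby(grp)
         .agg({'col1': 'first', 'col2': 'sum'})
         .reset_index(drop=True))
print(out)
```

Unlike a plain groupby on col1, this keeps separate runs apart, so a second block of 'A' rows further down would form its own group rather than being merged with the first.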