Groupby agg keep blank value

Let's say I have a dataframe called df.

The result I want is:

     A    30
          35
     B    20

CodePudding user response:

I imagine your blanks are actually NaNs. If so, use dropna=False (requires pandas 1.1 or later):

df.groupby('col1', dropna=False).sum()

If they really are empty strings, then it should work with the default, since an empty string is an ordinary group key and is not dropped.

Example:

import pandas as pd

df = pd.DataFrame({'col1': ['A', 'A', float('nan'), float('nan'), 'B', 'B'],
                   'col2': [10, 20, 15, 20, 10, 10]})
# dropna=False keeps the NaN key as its own group instead of discarding it
df.groupby('col1', dropna=False).sum()

output:

      col2
col1      
A       30
B       20
NaN     35
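
For the empty-string case, here is a minimal sketch, assuming the blanks are '' rather than NaN. The names df_blank and out_blank are made up for illustration. Passing sort=False keeps the groups in order of first appearance, so the blank group stays between A and B as in the desired result:

```python
import pandas as pd

# Hypothetical data: same values as above, but blanks are empty strings
df_blank = pd.DataFrame({'col1': ['A', 'A', '', '', 'B', 'B'],
                         'col2': [10, 20, 15, 20, 10, 10]})

# '' is a regular key, so no dropna argument is needed;
# sort=False preserves first-appearance order (A, '', B)
out_blank = df_blank.groupby('col1', sort=False, as_index=False)['col2'].sum()
print(out_blank)
```

With the default sort=True the empty string would sort before 'A', so the blank group would come first.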

CodePudding user response:

Group by a custom key (runs of consecutive equal values) and aggregate each column.

Suppose your dataframe has 2 columns, 'col1' and 'col2', and the blanks in 'col1' are empty strings:

>>> df
  col1  col2
0    A    10  # <- group 1
1    A    20  # <- group 1
2         15  # <- group 2
3         20  # <- group 2
4    B    10  # <- group 3
5    B    10  # <- group 3

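# Number each run of consecutive equal values in the first column: 1, 1, 2, 2, 3, 3
grp = df.iloc[:, 0].ne(df.iloc[:, 0].shift()).cumsum()
# Within each run, keep the first col1 value and sum col2
out = df.groupby(grp, as_index=False).agg({'col1': 'first', 'col2': 'sum'})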

Output result:

>>> out
  col1  col2
0    A    30
1         35
2    B    20
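
One caveat: the run-numbering trick relies on blanks comparing equal to each other, which holds for empty strings but not for NaN (NaN != NaN, so every NaN row would start its own run). A hedged sketch of one way around that, assuming the blanks are missing values; df_runs, key, run_id and out_runs are illustrative names, not part of the question:

```python
import pandas as pd

# Hypothetical data: blanks stored as missing values instead of ''
df_runs = pd.DataFrame({'col1': ['A', 'A', None, None, 'B', 'B'],
                        'col2': [10, 20, 15, 20, 10, 10]})

# Replace missing values with '' so consecutive blanks compare equal
key = df_runs['col1'].fillna('')

# A new run starts wherever the key differs from the previous row
run_id = key.ne(key.shift()).cumsum().rename('run')

out_runs = (df_runs.assign(col1=key)
            .groupby(run_id)
            .agg(col1=('col1', 'first'), col2=('col2', 'sum'))
            .reset_index(drop=True))
print(out_runs)
```

Unlike a plain groupby on 'col1', this keeps separate runs apart: a second block of 'A' rows further down would get its own output row.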