Home > Software engineering >  How to count value type within a group with 0 if it is absent in the group?
How to count value type within a group with 0 if it is absent in the group?

Time:07-11

I have a dataframe:

id   value_type
1       b
1       a
1       a
2       a
3       a
3       b

I want to count value type within a group with 0 if it is absent in group.so desired result is:

id   value_type       cnt
1       b             1
1       a             2
2       a             1
2       b             0
3       a             1
3       b             1

How could I do that? I tried groupby().size() but it counts without 0 if it is absent in group

CodePudding user response:

Try .pivot_table and then .stack:

x = (
    df.pivot_table(
        index="id", columns="value_type", aggfunc="size", fill_value=0
    )
    .stack()
    .reset_index(name="cnt")
)
print(x)

Prints:

   id value_type  cnt
0   1          a    2
1   1          b    1
2   2          a    1
3   2          b    0
4   3          a    1
5   3          b    1

CodePudding user response:

Check below code:

import pandas as pd

df = pd.DataFrame({'col1':[1,1,1,2,3,3],'col2':['b','a','a','a','a','b']})

df_col1 = df[['col1']].drop_duplicates().assign(join_col = 1)

df_col2 = df[['col2']].drop_duplicates().assign(join_col = 1)

full_df = pd.merge(df_col1, df_col2,left_on='join_col', right_on='join_col').drop('join_col', axis=1)

final_df = pd.merge(full_df, df.assign(col3=df.col2), left_on=['col1','col2'], right_on=['col1','col2'], how='left').\
fillna(0)

final_df['count'] = np.where(final_df['col3']!=0,1, 0)

final_df.groupby(['col1','col2']).agg({'count':'sum'}).reset_index().sort_values(['col1','count'])

Output:

enter image description here

  • Related