Home > Mobile >  (Pandas) creating a column that counts the number of times a value in DFa occurs in DFb
(Pandas) creating a column that counts the number of times a value in DFa occurs in DFb

Time:04-08

I have two dataframes, one consists of people and scores, the other consists of each time one of the people did a thing.

df_people = pd.DataFrame(
    {
    'Name' : ['Angie', 'John', 'Joanne', 'Shivangi'],
    'ID' : ['0021', '0022', '0023', '0024'],
    'Code' : ['BHU', 'JNU', 'DU', 'BHU'],
    }
)

df_actions = pd.DataFrame(
    {
    'ID' : ['0023', '0021', '0022', '0021'],
    'Act' : ['Enter', 'Enter', 'Enter', 'Leave'],
    }
)

I would like to create a column in df_people that represents the count of each time they appear in df_actions based on the shared 'ID' column.

it would look like

Name ID Code Count
0 Angie 0023 BHU 1
1 John 0021 JNU 2
2 Joanne 0022 DU 1
3 Shivan 0024 BHU 0

I have tried just taking a value count and insterting that as a new column into df_people but it seems very clunky.

Any advice would be much appreciated.

CodePudding user response:

Another option is Series.map with Series.value_counts

new_df = df_people.assign(Count=df_people['ID'].map(df_actions['ID'].value_counts())
                                               .fillna(0, downcast='infer'))
print(new_df)
       Name    ID Code  Count
0     Angie  0021  BHU      2
1      John  0022  JNU      1
2    Joanne  0023   DU      1
3  Shivangi  0024  BHU      0

CodePudding user response:

Use first GroupBy.agg to compute the counts, then merge:

(df_people
 .merge(df_actions.groupby('ID')['Act'].agg(count='count'),
        left_on='ID', right_index=True, how='left')
 .fillna({'count': 0}, downcast='infer')
)

output:

       Name    ID Code  count
0     Angie  0021  BHU      2
1      John  0022  JNU      1
2    Joanne  0023   DU      1
3  Shivangi  0024  BHU      0
  • Related