summing by keywords and by groups in pandas

I have the following problem:

A dataframe with keywords and groups:

My task is to look for these keywords in the description column of another dataframe and count their occurrences in each description, adding one column per keyword. I've done that part using this code:

keywords = df["keyword"].to_list()
for key in keywords:
    new_df[key] = new_df["description"].str.lower().str.count(key)

The resulting dataframe is:

I would like to sum these counts by the group given in the keywords dataframe, so that all the columns within one group (for example gas/gases/gasoline) are merged into a single column gas that holds the sum of their values.
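
For anyone reproducing the setup, the counting step can be sketched as a small self-contained example. The sample keywords and descriptions are made up for illustration and are not the original data. The `re.escape` call is an addition: `Series.str.count` treats its argument as a regular expression, so escaping guards against keywords that contain special characters.

```python
import re
import pandas as pd

# Hypothetical sample data, not taken from the original post
df = pd.DataFrame({"keyword": ["gas", "gases", "oil"],
                   "group": ["gas", "gas", "oil"]})
new_df = pd.DataFrame(
    {"description": ["Gas and oil prices", "Noble gases", "Olive OIL"]}
)

# Lowercase once, then add one count column per keyword
lowered = new_df["description"].str.lower()
for key in df["keyword"]:
    new_df[key] = lowered.str.count(re.escape(key))
```

Note that the match is on substrings, so `gas` is also counted inside `gases`.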

CodePudding user response：

If you need to aggregate the values by the group column, one option is to create a dictionary of Series with (keyword, group) tuples as keys, join them with concat, aggregate by the group level, and add the result to the existing DataFrame:

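import pandas as pd

# df holds the keywords and groups; new_df holds the descriptions
d = {(key, g): new_df["description"].str.lower().str.count(key)
     for key, g in zip(df["keyword"], df["group"])}

# transpose before grouping: groupby(..., axis=1) is deprecated in recent pandas
new_df = new_df.join(pd.concat(d, axis=1).T.groupby(level=1).sum().T)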
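
The approach can be tried end to end with a minimal sketch like the one below. The sample data is hypothetical, and the names `df` (keywords and groups) and `new_df` (descriptions) are assumed to match the question. The sketch groups on the transposed frame rather than passing `axis=1` to `groupby`, since that argument is deprecated in recent pandas versions.

```python
import pandas as pd

# Hypothetical sample data, not taken from the original post
df = pd.DataFrame({"keyword": ["gas", "gases", "gasoline", "oil"],
                   "group": ["gas", "gas", "gas", "oil"]})
new_df = pd.DataFrame(
    {"description": ["Gas and gasoline", "Noble gases", "Olive oil"]}
)

lowered = new_df["description"].str.lower()

# One count Series per (keyword, group) pair; tuple keys become
# two-level column labels after concat
counts = pd.concat(
    {(key, g): lowered.str.count(key)
     for key, g in zip(df["keyword"], df["group"])},
    axis=1,
)

# Transpose so the group level sits on the index, sum per group,
# then transpose back to one column per group
grouped = counts.T.groupby(level=1).sum().T

result = new_df.join(grouped)
```

Because `str.count` matches substrings, the first description contributes 3 to the gas group here: `gas` matches twice (once inside `gasoline`) and `gasoline` matches once.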