I have the following problem.
I have a dataframe with keywords and groups:
My task is to look for these keywords in the description column of another dataframe and count their occurrences in each description, adding one column per keyword. I've done that part with this code:
keywords = df["keyword"].to_list()
for key in keywords:
    new_df[key] = new_df["description"].str.lower().str.count(key)
and the result df is:
I would like to sum them by the group given in the keywords dataframe, so that all the columns within one group (for example gas/gases/gasoline) are combined into a single column gas holding the summed counts.
CodePudding user response:
If you need to aggregate the values by the group column, you can create a dictionary of Series keyed by (keyword, group) tuples, join them with concat, aggregate by the group level, and add the result to the existing DataFrame:
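import pandas as pd
# df holds keyword/group, new_df holds description (names taken from the question)
d = {(key, g): new_df["description"].str.lower().str.count(key)
     for key, g in zip(df["keyword"], df["group"])}
# Sum keyword columns per group; .T.groupby(...) avoids the deprecated axis=1 groupby
new_df = new_df.join(pd.concat(d, axis=1).T.groupby(level=1).sum().T)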
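To see the approach end to end, here is a minimal sketch on made-up data. The sample keywords, groups, and descriptions are assumptions, since the original tables are not shown. The frame names df and new_df follow the question, and the intermediate names counts, wide, and by_group are only illustrative.

```python
import pandas as pd

# Hypothetical sample data; the real frames are not shown in the post.
df = pd.DataFrame({
    "keyword": ["gas", "gases", "gasoline", "oil"],
    "group": ["gas", "gas", "gas", "oil"],
})
new_df = pd.DataFrame({
    "description": ["Gas and gasoline prices", "Oil and gases", "nothing here"],
})

# One Series of counts per keyword, keyed by a (keyword, group) tuple.
counts = {
    (key, group): new_df["description"].str.lower().str.count(key)
    for key, group in zip(df["keyword"], df["group"])
}

# concat turns the tuple keys into two-level columns (keyword, group).
wide = pd.concat(counts, axis=1)

# Transpose so the group level sits on the index, sum per group,
# then transpose back to one column per group.
by_group = wide.T.groupby(level=1).sum().T

result = new_df.join(by_group)
print(result)
```

Note that str.count matches substrings, so "gas" is also counted inside "gasoline" and "gases". The per-group totals therefore include these overlapping matches, exactly as the per-keyword columns in the question do.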