having a simple DataFrame like this one:
items
0 apple
1 car
2 tree
3 house
4 light
5 camera
6 laptop
7 watch
8 other
what would be a good way to count the times the individual words occur in a string in DF2:
id items
0 555 wall, grass, apple
1 124 bag, coffee, light
2 23123 bag, none, game
3 666 none, none, none
resulting in something like this
items count
0 apple 1
1 car 0
2 tree 0
3 house 0
4 light 1
5 camera 0
6 laptop 0
7 watch 0
8 other 0
CodePudding user response:
Let's try value_counts
the exploded items
columns of df2
then map the result to items
column of df1
df1['count'] = (df1['items'].map(df2['items'].str.split(', ')
.explode()
.value_counts())
.fillna(0).astype(int))
print(df1)
items count
0 apple 1
1 car 0
2 tree 0
3 house 0
4 light 1
5 camera 0
6 laptop 0
7 watch 0
8 other 0
CodePudding user response:
I would split the items, explode, then reindex:
(df2['items'].str.split(', ')
.explode().value_counts()
.reindex(df1['items'], fill_value=0)
.reset_index(name='count')
)
Output:
items count
0 apple 1
1 car 0
2 tree 0
3 house 0
4 light 1
5 camera 0
6 laptop 0
7 watch 0
8 other 0