I have a dataframe (df):
| A | B | C |
| --- | ----- | ----------------------- |
| CA | Jon | [sales, engineering] |
| NY | Sarah | [engineering, IT] |
| VA | Vox | [services, engineering] |
I am trying to group by each item in the lists in column C (sales, engineering, IT, etc.).
Tried:
df.groupby('C')
but got a "list not hashable" error, which is expected. I came across another post that recommended converting the C column to tuples, which are hashable, but I need to group by each item, not by the combination.
My goal is to get the number of rows in df that contain each item in the C column lists. So:
sales: 1
engineering: 3
IT: 1
services: 1
While there is probably a simpler way to obtain this than using groupby, I am still curious whether groupby can be used in this case.
CodePudding user response:
You can use explode & value_counts:
out = df.explode("C").value_counts("C")
Output:
print(out)
C
engineering 3
IT 1
sales 1
services 1
dtype: int64
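Since the question also asks whether groupby can be used: it should work once the lists are exploded, because each row then holds a single hashable string in C. Below is a minimal sketch; the DataFrame is rebuilt by hand from the table in the question, and the variable name counts is only illustrative.

```python
import pandas as pd

# Rebuild the example frame from the question (values copied from the table)
df = pd.DataFrame({
    "A": ["CA", "NY", "VA"],
    "B": ["Jon", "Sarah", "Vox"],
    "C": [
        ["sales", "engineering"],
        ["engineering", "IT"],
        ["services", "engineering"],
    ],
})

# explode gives one row per list item, so C holds plain strings
# that groupby can hash; size() then counts the rows in each group
counts = df.explode("C").groupby("C").size()
print(counts)
```

This gives the same counts as value_counts (engineering 3, the others 1 each), but groupby sorts by the group key rather than by count. The grouped object is also useful if you want something other than a count, for example collecting the names in B per item with .groupby("C")["B"].agg(list).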