How could I calculate the value counts of the keys within a string column?
df
   col
0  fruit["apple"], colour["green", "yellow" ]
1  colour["yellow"]
2  colour["brown"]
Expected Output
fruit 1
colour 3
CodePudding user response:
Use Series.str.extractall with the substrings joined by |, the regex "or" operator:
s = df['col'].str.extractall('(fruit|colour)')[0].value_counts()
print(s)
colour 3
fruit 1
Name: 0, dtype: int64
Or capture the word before each [ for a more dynamic solution:
s = df['col'].str.extractall(r'(\w+)\[')[0].value_counts()
print(s)
colour 3
fruit 1
dtype: int64
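For anyone who wants to run both approaches end to end, here is a minimal sketch. The DataFrame is rebuilt from the sample rows in the question, and the variable names `fixed` and `dynamic` are only illustrative. It assumes every key is a run of word characters directly followed by `[`, as in the sample data.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to be plain strings).
df = pd.DataFrame({
    "col": [
        'fruit["apple"], colour["green", "yellow" ]',
        'colour["yellow"]',
        'colour["brown"]',
    ]
})

# Approach 1: alternation of known keys. Every occurrence of either
# word is returned as its own row by extractall, then counted.
fixed = df["col"].str.extractall(r"(fruit|colour)")[0].value_counts()

# Approach 2: capture any run of word characters directly before "[",
# so the keys do not need to be listed in advance.
dynamic = df["col"].str.extractall(r"(\w+)\[")[0].value_counts()

print(fixed.to_dict())
print(dynamic.to_dict())
```

One difference to keep in mind: the alternation pattern matches the listed words anywhere in the string, including inside a quoted value, while the `\w+` pattern only counts a word that sits immediately before an opening bracket.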