I have some data like this:
0 Very user friendly interface and has 2FA support
1 The trading page is great though with allot o...
2 Widget support
3 But it’s really only for serious traders with...
4 The KYC and AML process is painful - it took ...
...
937 Legit platform!
938 Horrible customer service won’t get back to m...
939 App is fast and reliable
940 I wish it had a portfolio chart though
941 The app isn’t as user friendly as it need to b...
Name: reviews, Length: 942, dtype: object
and a list of features:
['support',
'time',
'follow',
'submit',
'ticket',
'team',
'swap',
'account',
'experi',
'contact',
'user',
'platform',
'screen',
'servic',
'custom',
'restrict',
'fast',
'portfolio',
'specialist']
For each review, I want to check whether it contains one or more of the features and, if so, put the matching words in a new column.
My code is this:
data["words"] = data[data["reviews"].str.contains('|'.join(features))]
This is supposed to create a new column named "words", but because the right-hand side sometimes returns multiple values, I get this error:
ValueError: Columns must be same length as key
How can I fix it?
CodePudding user response:
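The issue is that you are not actually extracting any of the words. Indexing with the boolean mask from str.contains only filters the rows and returns a DataFrame, not the matched words, so assigning it to a single column fails. You need to pull the words you want out of each review and then join them into a new column.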
import pandas as pd
from io import StringIO
import re
TESTDATA = StringIO("""Index,reviews,
0, Very user friendly interface and has 2FA support,
1, The trading page is great though with allot o...,
2, Widget support,
3, But it’s really only for serious traders with...,
4, The KYC and AML process is painful - it took ...,
937, Legit platform!,
938, Horrible customer service won’t get back to m...,
939, App is fast and reliable,
940, I wish it had a portfolio chart though,
941, The app isn’t as user friendly as it need to b...
""")
data = pd.read_csv(TESTDATA, sep=",").drop('Unnamed: 2', axis = 1)
data
#> Index reviews
0 0 Very user friendly interface and has 2F...
1 1 The trading page is great though with a...
2 2 Widge...
3 3 But it’s really only for serious trader...
4 4 The KYC and AML process is painful - it...
5 937 Legit pl...
6 938 Horrible customer service won’t get back ...
7 939 App is fast and r...
8 940 I wish it had a portfolio chart...
9 941 The app isn’t as user friendly as it need ...
# features is the list from the question; find every feature in each review and join the matches
data['words'] = [", ".join(re.findall('|'.join(features), x)) for x in data.reviews]
data
#> Index reviews words
0 0 Very user friendly interface and has 2F... user, support
1 1 The trading page is great though with a...
2 2 Widge... support
3 3 But it’s really only for serious trader...
4 4 The KYC and AML process is painful - it...
5 937 Legit pl... platform
6 938 Horrible customer service won’t get back ... custom, servic
7 939 App is fast and r... fast
8 940 I wish it had a portfolio chart... portfolio
9 941 The app isn’t as user friendly as it need ... user
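The same extraction can probably also be done with pandas string methods instead of a Python loop. The following is only a sketch: the shortened features list and the sample reviews are stand-ins for the data in the question, and re.escape is an extra precaution I added in case a feature ever contains a regex metacharacter.

```python
import re

import pandas as pd

# Assumption: a shortened version of the features list from the question.
features = ['support', 'user', 'platform', 'servic', 'custom', 'fast', 'portfolio']

# Assumption: a few sample reviews standing in for the real data.
data = pd.DataFrame({"reviews": [
    "Very user friendly interface and has 2FA support",
    "Widget support",
    "Legit platform!",
    "Horrible customer service",
    "App is fast and reliable",
    "No match here",
]})

# Build one alternation pattern; re.escape guards against special characters.
pattern = "|".join(re.escape(f) for f in features)

# str.findall returns a list of matches per row; str.join turns each list into text.
data["words"] = data["reviews"].str.findall(pattern).str.join(", ")

print(data)
```

Rows without any match end up with an empty string. If matching should ignore case, str.findall also accepts flags=re.IGNORECASE.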