Below is my data frame:
import pandas as pd
import numpy as np
data = {'product_name': ['laptop,computers,cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcvc,hgethjhg,hgetuklhg,hghtr7kg,hgyilhg,huolghg,ejtrej6,it7u,wrgtetr,gre,rthrt,trhht,efwfe,rfyhjj,tuk,ryju,fyj,hik,hgryujahg,ryj,etuk', 'cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcvc,hgethjhg,hgetuklhg,hghtr7kg,hgyilhg,huolghg,ejtrej6,it7u,wrgtetr,gre,rthrt,trhht,efwfe,rfyhjj,tuk,ryju,fyj,hik,hgryujahg,ryj,etuk', 'cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcvc,hgethjhg,hgetuklhg,hghtr7kg,hgyilhg,huolghg,ejtrej6,it7u,wrgtetr,gre,rthrt,trhht,efwfe,rfyhjj,tuk,ryju,fyj,hik,hgryujahg,ryj,etuk', 'desk', 'cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcvc,hgethjhg,hgetuklhg,hghtr7kg,hgyilhg,huolghg,ejtrej6,it7u,wrgtetr,gre,rthrt,trhht,efwfe,rfyhjj,tuk,ryju,fyj,hik,hello,ryj,etuk'],
'price': ['lhghtr7kg', 'printer', 'mobile', 'desk', 'HELLO']
}
df = pd.DataFrame(data)
I want to check whether any of the values in the 'product_name' column matches the 'price' column and, if so, return the matched value in a new column. I tried the contains method, but it does not give me what I want.
CodePudding user response:
Use a loop here. Assuming full words are wanted, you can split product_name on the commas.
The fastest will be a list comprehension:
df['match'] = [b.casefold() in a.casefold().split(',')
               for a, b in zip(df['product_name'], df['price'])]
output:
product_name price match
0 laptop,computers,cde,ert,yhd,hngnvb,gfg,hghg,h... lhghtr7kg False
1 cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv... printer False
2 cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv... mobile False
3 desk desk True
4 cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv... HELLO True
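If the real data has stray spaces around the commas or missing values, the same idea can be wrapped in a small helper. This is only a sketch: the name has_token and the sample frame are made up for illustration, and it assumes a missing value should count as no match.

```python
import pandas as pd

def has_token(products, price):
    """Return True if `price` equals one comma-separated token of `products`."""
    # Assumption: missing values (NaN/None) count as "no match".
    if pd.isna(products) or pd.isna(price):
        return False
    # Strip whitespace and casefold each token for a tolerant comparison.
    tokens = {t.strip().casefold() for t in str(products).split(',')}
    return str(price).strip().casefold() in tokens

# Small made-up frame, not the data from the question.
demo = pd.DataFrame({
    'product_name': ['laptop, desk ,chair', 'cde,hello', None, 'hghtr7kg'],
    'price': ['Desk', 'HELLO', 'desk', 'lhghtr7kg'],
})
demo['match'] = [has_token(a, b)
                 for a, b in zip(demo['product_name'], demo['price'])]
print(demo)
```

Building a set per row keeps the membership test simple; for very long token lists that are reused across rows, caching the sets might be worth considering.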
If you would rather get the matched word itself, go with a regex:
import re
df['match'] = [m.group() if (m := re.search(fr'\b{re.escape(b)}\b', a, flags=re.I)) else None
               for a, b in zip(df['product_name'], df['price'])]
output:
product_name price match
0 laptop,computers,cde,ert,yhd,hngnvb,gfg,hghg,h... lhghtr7kg None
1 cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv... printer None
2 cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv... mobile None
3 desk desk desk
4 cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv... HELLO hello
CodePudding user response:
You can use apply with the in operator (note that this is a case-sensitive substring test):
df['C'] = df.apply(lambda x: x.price in x.product_name, axis=1)
output
product_name price C
0 laptop,computers,cde,ert,yhd,hngnvb,gfg,hghg,h... lhghtr7kg False
1 cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv... printer False
2 cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv... mobile False
3 desk desk True
4 cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv... HELLO False
Solution 2, a list comprehension:
df['result'] = [b in a for a, b in zip(df['product_name'], df['price'])]
output
product_name price result
0 laptop,computers,cde,ert,yhd,hngnvb,gfg,hghg,h... lhghtr7kg False
1 cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv... printer False
2 cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv... mobile False
3 desk desk True
4 cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv... HELLO False
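Keep in mind that in between two strings is a substring test, which is not the same as matching a whole comma-separated word. A short sketch with made-up values (not the question's data) showing the difference:

```python
# Made-up comma-separated string for illustration.
products = 'cde,hghtr7kg,desktop'

# Substring test: 'desk' is found inside 'desktop'.
substring_hit = 'desk' in products

# Whole-token test: 'desk' is not one of the comma-separated words.
token_hit = 'desk' in products.split(',')

print(substring_hit, token_hit)
```

Which of the two is right depends on whether partial words should count as a match.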
CodePudding user response:
Code:
df['match'] = df.apply(lambda x: x.product_name if x.price in x.product_name.split(',') else None, axis=1)
df['match']
Output:
0 None
1 None
2 None
3 desk
4 None
Name: match, dtype: object
CodePudding user response:
list_values = []
for item in df['product_name']:
    # Compares the whole product_name string against every value in the price column
    if item in df.price.values:
        list_values.append('equal')
    else:
        list_values.append('different')
df['match'] = list_values
print(df)
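The loop above checks each product_name against the whole price column. If a row-by-row comparison of individual words is wanted instead, one possible sketch uses np.where on a row-wise mask. The sample frame is made up, and the 'equal'/'different' labels are simply carried over from the loop above.

```python
import numpy as np
import pandas as pd

# Small made-up frame, not the data from the question.
demo = pd.DataFrame({
    'product_name': ['laptop,desk', 'cde,ert', 'desk'],
    'price': ['desk', 'printer', 'DESK'],
})

# Row-wise mask: is this row's price one of this row's comma-separated words?
# casefold() makes the comparison case-insensitive.
mask = [b.casefold() in a.casefold().split(',')
        for a, b in zip(demo['product_name'], demo['price'])]

# Turn the boolean mask into text labels.
demo['match'] = np.where(mask, 'equal', 'different')
print(demo)
```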