How to check whether one column's values are present in another column

Time:09-06

Below is my data frame:

import pandas as pd

import numpy as np


data = {'product_name': ['laptop,computers,cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcvc,hgethjhg,hgetuklhg,hghtr7kg,hgyilhg,huolghg,ejtrej6,it7u,wrgtetr,gre,rthrt,trhht,efwfe,rfyhjj,tuk,ryju,fyj,hik,hgryujahg,ryj,etuk', 'cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcvc,hgethjhg,hgetuklhg,hghtr7kg,hgyilhg,huolghg,ejtrej6,it7u,wrgtetr,gre,rthrt,trhht,efwfe,rfyhjj,tuk,ryju,fyj,hik,hgryujahg,ryj,etuk', 'cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcvc,hgethjhg,hgetuklhg,hghtr7kg,hgyilhg,huolghg,ejtrej6,it7u,wrgtetr,gre,rthrt,trhht,efwfe,rfyhjj,tuk,ryju,fyj,hik,hgryujahg,ryj,etuk', 'desk', 'cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcvc,hgethjhg,hgetuklhg,hghtr7kg,hgyilhg,huolghg,ejtrej6,it7u,wrgtetr,gre,rthrt,trhht,efwfe,rfyhjj,tuk,ryju,fyj,hik,hello,ryj,etuk'],
        'price': ['lhghtr7kg', 'printer', 'mobile', 'desk', 'HELLO']
        }

df = pd.DataFrame(data)

I want to look up the values of column 'product_name' in the 'price' column and, where there is a match, return the matched value in a new column. I tried the contains method, but it does not give me what I want.

CodePudding user response:

Use a loop here. Assuming you want to match full words, you can split 'product_name' on the commas and compare case-insensitively.

The fastest option is likely a list comprehension:

df['match'] = [b.casefold() in a.casefold().split(',')
               for a,b in zip(df['product_name'], df['price'])]

output:

                                        product_name      price  match
0  laptop,computers,cde,ert,yhd,hngnvb,gfg,hghg,h...  lhghtr7kg  False
1  cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv...    printer  False
2  cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv...     mobile  False
3                                               desk       desk   True
4  cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv...      HELLO   True
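
If the comma-separated lists might contain spaces around the commas (an assumption; the sample data above has none), the tokens can be stripped before comparing. A minimal sketch, where the helper name `has_token` and the demo rows are made up for illustration:

```python
import pandas as pd

def has_token(tokens: str, word: str) -> bool:
    """True if `word` equals one comma-separated token, ignoring case and surrounding spaces."""
    wanted = word.strip().casefold()
    return wanted in {t.strip().casefold() for t in tokens.split(',')}

# Hypothetical demo data with stray spaces around some commas
demo = pd.DataFrame({
    'product_name': ['laptop, computers', 'desk', 'ryj, hello ,etuk'],
    'price': ['printer', 'desk', 'HELLO'],
})

demo['match'] = [has_token(a, b)
                 for a, b in zip(demo['product_name'], demo['price'])]
print(demo['match'].tolist())  # [False, True, True]
```

Without the strip, ' hello ' would not compare equal to 'hello' and the last row would come out False.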

If you would rather get the matched word itself, use a regex (the walrus operator := requires Python 3.8 or later):

import re
df['match'] = [m.group() if (m:=re.search(fr'\b{re.escape(b)}\b', a, flags=re.I)) else None
               for a,b in zip(df['product_name'], df['price'])]

output:

                                        product_name      price  match
0  laptop,computers,cde,ert,yhd,hngnvb,gfg,hghg,h...  lhghtr7kg   None
1  cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv...    printer   None
2  cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv...     mobile   None
3                                               desk       desk   desk
4  cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv...      HELLO  hello
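
One thing to keep in mind with the regex: \b is a word boundary, not a comma boundary, so a token such as 'desk-lamp' would still match 'desk'. If only exact comma-separated tokens should count, a split-based lookup can return the matched token instead. A rough sketch, with hypothetical helper names and made-up inputs:

```python
import re

def regex_match(tokens, word):
    """Same idea as the regex above: case-insensitive whole-word search."""
    m = re.search(fr'\b{re.escape(word)}\b', tokens, flags=re.I)
    return m.group() if m else None

def token_match(tokens, word):
    """Return the comma-separated token equal to `word` (ignoring case), else None."""
    wanted = word.casefold()
    for t in tokens.split(','):
        if t.casefold() == wanted:
            return t
    return None

row = 'desk-lamp,chair'
print(regex_match(row, 'DESK'))                # desk (the hyphen counts as a boundary)
print(token_match(row, 'DESK'))                # None (no token is exactly 'desk')
print(token_match('ryj,hello,etuk', 'HELLO'))  # hello
```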

CodePudding user response:

You can use apply with the in operator:

df['C'] = df.apply(lambda x: x.product_name in x.price, axis=1)

output:

                                        product_name      price      C
0  laptop,computers,cde,ert,yhd,hngnvb,gfg,hghg,h...  lhghtr7kg  False
1  cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv...    printer  False
2  cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv...     mobile  False
3                                               desk       desk   True
4  cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv...      HELLO  False

Solution 2, using a list comprehension:

df['result'] = [x[0] in x[1] for x in zip(df['product_name'], df['price'])]

output:

                                        product_name      price  result
0  laptop,computers,cde,ert,yhd,hngnvb,gfg,hghg,h...  lhghtr7kg   False
1  cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv...    printer   False
2  cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv...     mobile   False
3                                               desk       desk    True
4  cde,ert,yhd,hngnvb,gfg,hghg,hfhfg,kjj,knkn,vcv...      HELLO   False
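
Note that in on two strings is a case-sensitive substring test, and the direction matters: the code in this answer asks whether the whole 'product_name' string occurs inside 'price', which is why row 4 is False. A small sketch of the difference, using made-up rows rather than the question's data:

```python
import pandas as pd

# Hypothetical demo rows
demo = pd.DataFrame({
    'product_name': ['desk', 'ryj,hello,etuk', 'desktop'],
    'price': ['desk', 'HELLO', 'desk'],
})

pairs = list(zip(demo['product_name'], demo['price']))

# Direction used in this answer: is product_name a substring of price?
name_in_price = [a in b for a, b in pairs]

# Reversed direction, ignoring case: is price a substring of product_name?
price_in_name = [b.casefold() in a.casefold() for a, b in pairs]

print(name_in_price)  # [True, False, False]
print(price_in_name)  # [True, True, True]
```

The last row shows the catch with substring tests: 'desk' is found inside 'desktop' even though it is not a full token.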

CodePudding user response:

Code:

df['match'] = df.apply(lambda x: x.product_name if x.price in list(x.product_name.split(',')) else None, axis=1)
df['match']

Output:

0    None
1    None
2    None
3    desk
4    None
Name: match, dtype: object

CodePudding user response:

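# Label each row by whether its whole 'product_name' string
# equals any value anywhere in the 'price' column
list_values = []
for item in df['product_name']:
    if item in df.price.values:
        list_values.append('equal')
    else:
        list_values.append('different')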

df['match'] = list_values
print(df)
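
The loop above compares each whole 'product_name' string against every value in 'price', not just the one in the same row. If that is the intended behaviour, the same labels can probably be produced without an explicit loop using Series.isin and numpy.where. A sketch on made-up rows:

```python
import numpy as np
import pandas as pd

# Hypothetical demo rows
demo = pd.DataFrame({
    'product_name': ['laptop,computers', 'desk', 'mobile'],
    'price': ['mobile', 'desk', 'HELLO'],
})

# isin tests exact, case-sensitive equality against all values of 'price'
demo['match'] = np.where(demo['product_name'].isin(demo['price']),
                         'equal', 'different')
print(demo['match'].tolist())  # ['different', 'equal', 'equal']
```

Note that 'mobile' is labelled 'equal' although its own row holds 'HELLO', because the value appears elsewhere in the 'price' column.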