Home > Software engineering >  Match the sets between two column in a row
Match the sets between two column in a row

Time:10-21

I have table with two columns:

Col_1 Col_2
3,4,5 3,5
8,9,12 8,9,12
8,9,12

I need to check Col_1 with Col_2 and if an element in Col_1 is present in Col_2 then save True, else save False.

The output table should be:

Col_1 Col_2 Check
3,4,5 3,5 True,False,True
8,9,12 8,9,12 True,True,True
8,9,12 False,False,False

CodePudding user response:

Use lsit comprehension with test splitted values:

df['check'] = [[z in y for z in x] for x, y in zip(df['Col_1'].fillna('').str.split(','), 
                                                   df['Col_2'].fillna('').str.split(','))]
print (df)
    Col_1   Col_2                  check
0   3,4,5     3,5    [True, False, True]
1  8,9,12  8,9,12     [True, True, True]
2  8,9,12    None  [False, False, False]

Another solution with DataFrame.explode and test valeus per groups:

df['check'] = (df.assign(Col_1 = df['Col_1'].fillna('').str.split(','), 
                         Col_2 = df['Col_2'].fillna('').str.split(','))
                 .explode('Col_1')
                 .groupby(level=0)
                 .apply(lambda x: list(x.Col_1.isin(x.Col_2.iat[0]))))
print (df)
    Col_1   Col_2                  check
0   3,4,5     3,5    [True, False, True]
1  8,9,12  8,9,12     [True, True, True]
2  8,9,12    None  [False, False, False]

CodePudding user response:

import pandas as pd

df.fillna("", inplace=True)

df["Check"] = df.apply(
 lambda x: pd.Series(x["Col_1"].split(",")).isin(x["Col_2"].split(",")).values, axis=1
)
  • Related