How to compare each item of a list in a pandas dataframe with another column of the same DF?

Time:10-29

I have this dataframe in pandas and I want to do the following:

Original Dataframe:

|         column_1          | column_2 |
| ------------------------- | -------- |
| [value_1,value_2,value_3] | value_2  |
| [value_1,value_2,value_3] | value_3  | 
| [value_1,value_2,value_3] | value_7  |

Desired dataframe:

|         column_1          | column_2 | flag  |
| ------------------------- | -------- | ----- |
| [value_1,value_2,value_3] | value_2  | true  |
| [value_1,value_2,value_3] | value_3  | true  |
| [value_1,value_2,value_3] | value_7  | false |

Note that all the columns are of dtype object. For each row, I need to go through the list in column_1 and check whether the value in column_2 is one of its items.

CodePudding user response:

You can use DataFrame.apply with axis=1 to check, row by row, whether the value in column_2 is in the list in column_1.

df['flag'] = df.apply(lambda row: row['column_2'] in row['column_1'], axis=1)
print(df)

Or you can use zip, which avoids building a row object for every row:

df['flag'] = [x in y for x, y in zip(df['column_2'], df['column_1'])]
print(df)

Output:

                      column_1 column_2   flag
0  [value_1, value_2, value_3]  value_2   True
1  [value_1, value_2, value_3]  value_3   True
2  [value_1, value_2, value_3]  value_7  False
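
A third option, not part of the answer above, is to let pandas do the comparison with explode. The following is only a minimal sketch: it assumes every cell in column_1 is a real Python list and that the DataFrame index is unique, and the names exploded and matches are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'column_1': [["value_1", "value_2", "value_3"]] * 3,
    'column_2': ['value_2', 'value_3', 'value_7'],
})

# One row per list element; the original index label is repeated for each element.
exploded = df.explode('column_1')

# Compare each element with column_2, then collapse back to one result per original row.
matches = exploded['column_1'] == exploded['column_2']
df['flag'] = matches.groupby(level=0).any()

print(df['flag'].tolist())
# [True, True, False]
```

Whether this is faster than the zip version depends on the data, so it may be worth timing both on your own DataFrame.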

Input DataFrame:

import pandas as pd

df = pd.DataFrame({
    'column_1' : [["value_1","value_2","value_3"], 
                  ["value_1","value_2","value_3"],
                  ["value_1","value_2","value_3"]
                 ],
    'column_2' : ['value_2', 'value_3', 'value_7']
})

print(df['column_1'])

# 0    [value_1, value_2, value_3]
# 1    [value_1, value_2, value_3]
# 2    [value_1, value_2, value_3]
# Name: column_1, dtype: object
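
The input above has a list in every row. If some cells in column_1 could be missing (None or NaN), which the question does not say, then `x in y` would raise a TypeError for those rows. A minimal sketch of one way to guard against that is shown here; value_in_list is a hypothetical helper, not a pandas function, and the sample data is made up for illustration.

```python
import pandas as pd

def value_in_list(value, candidates):
    # Hypothetical helper: anything that is not a list, tuple or set counts as "no match".
    if not isinstance(candidates, (list, tuple, set)):
        return False
    return value in candidates

# Illustrative data with a missing cell in column_1 (an assumption, not from the question).
df = pd.DataFrame({
    'column_1': [["value_1", "value_2"], None, ["value_7"]],
    'column_2': ['value_2', 'value_3', 'value_7'],
})

df['flag'] = [value_in_list(x, y) for x, y in zip(df['column_2'], df['column_1'])]

print(df['flag'].tolist())
# [True, False, True]
```

Treating a missing list as False is a design choice; you may prefer to leave those rows as missing instead.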