I have this dataframe in Pandas and I want to do the following:
Original Dataframe:
| column_1 | column_2 |
| ------------------------- | -------- |
| [value_1,value_2,value_3] | value_2 |
| [value_1,value_2,value_3] | value_3 |
| [value_1,value_2,value_3] | value_7 |
Desired dataframe:
| column_1 | column_2 | flag |
| ------------------------- | -------- | ---- |
| [value_1,value_2,value_3] | value_2 | true |
| [value_1,value_2,value_3] | value_3 | true |
| [value_1,value_2,value_3] | value_7 | false |
Note that all the columns have dtype object. I need to go through the list in column_1 and check whether the value in column_2 appears in it.
CodePudding user response:
You can use DataFrame.apply with axis=1 to check, for each row, whether the value in column_2 is among the values in the list in column_1.
df['flag'] = df.apply(lambda row: row['column_2'] in row['column_1'], axis=1)
print(df)
Or you can use zip:
df['flag'] = [x in y for x, y in zip(df['column_2'], df['column_1'])]
print(df)
Output:
column_1 column_2 flag
0 [value_1, value_2, value_3] value_2 True
1 [value_1, value_2, value_3] value_3 True
2 [value_1, value_2, value_3] value_7 False
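One caveat with both versions: if a cell in column_1 is missing (None or NaN) rather than a list, the membership test raises a TypeError. Below is a minimal sketch of a guarded variant. The helper name value_in_list and the sample data with a missing cell are hypothetical, made up for illustration.

```python
import pandas as pd

# Hypothetical sample: the second row has no list in column_1.
df = pd.DataFrame({
    'column_1': [["value_1", "value_2", "value_3"], None, ["value_1"]],
    'column_2': ['value_2', 'value_3', 'value_7'],
})

def value_in_list(value, candidates):
    # Treat anything that is not a list, tuple or set (e.g. None or NaN)
    # as "no match" instead of letting the membership test raise.
    if not isinstance(candidates, (list, tuple, set)):
        return False
    return value in candidates

df['flag'] = [value_in_list(x, y) for x, y in zip(df['column_2'], df['column_1'])]
print(df)
```

Whether a missing list should map to False or stay missing depends on your data, so treat the False default here as an assumption.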
Input DataFrame:
import pandas as pd
df = pd.DataFrame({
    'column_1': [["value_1", "value_2", "value_3"],
                 ["value_1", "value_2", "value_3"],
                 ["value_1", "value_2", "value_3"]],
    'column_2': ['value_2', 'value_3', 'value_7']
})
print(df['column_1'])
# 0    [value_1, value_2, value_3]
# 1    [value_1, value_2, value_3]
# 2    [value_1, value_2, value_3]
# Name: column_1, dtype: object
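Since the question says every column has dtype object, it is worth checking whether column_1 really holds Python lists or strings that only look like lists (this often happens after reading a CSV). On a string, the in operator does substring matching, which can give false positives. The sketch below assumes the strings look like "[value_1,value_2,value_3]", as in the question's table. parse_list is a hypothetical helper, not a pandas function.

```python
import pandas as pd

# Assumed input: column_1 holds strings, not lists.
df = pd.DataFrame({
    'column_1': ["[value_1,value_2,value_3]"] * 3,
    'column_2': ['value_2', 'value_3', 'value_7'],
})

def parse_list(text):
    # Assumed format: "[a,b,c]" with unquoted items separated by commas.
    inner = text.strip().strip('[]')
    return [item.strip() for item in inner.split(',') if item.strip()]

# Convert the strings to real lists, then test membership row by row.
df['column_1'] = df['column_1'].apply(parse_list)
df['flag'] = [x in y for x, y in zip(df['column_2'], df['column_1'])]
print(df['flag'].tolist())
```

If the strings are valid Python literals with quoted items, such as "['value_1', 'value_2']", ast.literal_eval from the standard library is the usual alternative to a hand-written parser.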