My DataFrame looks like this:
RS AS IS
F1 [F1, F2, F3, F4, F5] [F1] [F1]
F2 [F2, F3, F5] [F1, F2, F3, F5] [F5, F3, F2]
F3 [F2, F3, F4, F5] [F1, F2, F3, F5] [F5, F3, F2]
F4 [F4] [F1, F3, F4, F5] [F4]
F5 [F2, F3, F4, F5] [F1, F2, F3, F5] [F5, F3, F2]
Output I need:
RS AS IS Level
F1 [F1, F2, F3, F4, F5] [F1] [F1]
F2 [F2, F3, F5] [F1, F2, F3, F5] [F5, F3, F2] I
F3 [F2, F3, F4, F5] [F1, F2, F3, F5] [F5, F3, F2]
F4 [F4] [F1, F3, F4, F5] [F4] I
F5 [F2, F3, F4, F5] [F1, F2, F3, F5] [F5, F3, F2]
The logic is very simple: if RS and IS contain the same values (in any order), write I in the Level column.
I am using the following code, but it doesn't seem to work.
if df['RS'].any() == df['IS'].any():
    df['Level'] = 'I'
I also need to drop every row with Level 'I' from the original DataFrame once the above is done, like this:
RS AS IS
F1 [F1, F2, F3, F4, F5] [F1] [F1]
F3 [F2, F3, F4, F5] [F1, F2, F3, F5] [F5, F3, F2]
F5 [F2, F3, F4, F5] [F1, F2, F3, F5] [F5, F3, F2]
CodePudding user response:
Convert your lists to sets and compare them for equality to find which rows have the same elements, then assign the value to those rows. The example below ignores your middle column.
import pandas as pd
df = pd.DataFrame({'RS':
[[1,2,3,4,5],
[2,3,5],
[2,3,4,5],
[4],
[2,3,4,5]],
'IS':
[[1],
[5,3,2],
[5,3,2],
[4],
[5,3,2]]})
ix = df.RS.apply(set) == df.IS.apply(set)
df['Level'] = ''
df.loc[ix, 'Level'] = 'I'
df
# returns:
RS IS Level
[1, 2, 3, 4, 5] [1]
[2, 3, 5] [5, 3, 2] I
[2, 3, 4, 5] [5, 3, 2]
[4] [4] I
[2, 3, 4, 5] [5, 3, 2]
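To see why the lists are converted to sets first, here is a small plain-Python sketch (the variable names are only for illustration). Lists compare element by element in order, while sets ignore both order and duplicates:

```python
rs = [2, 3, 5]
is_ = [5, 3, 2]

# Lists are equal only if they hold the same elements in the same order.
lists_equal = rs == is_

# Sets are equal if they hold the same distinct elements, in any order.
sets_equal = set(rs) == set(is_)

# Duplicates collapse inside a set, so [4, 4] and [4] compare equal as sets.
dupes_equal = set([4, 4]) == set([4])

print(lists_equal, sets_equal, dupes_equal)  # False True True
```

If duplicates should count as a difference in your data, comparing sorted lists instead of sets may be a better fit.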
If you need to drop the rows where I would be assigned, you don't actually need to assign I at all; just use:
ix = df.RS.apply(set) == df.IS.apply(set)
df.loc[~ix]
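Putting both steps together on data shaped like the question's, a minimal sketch might look like the following. It assumes the F1..F5 labels are the DataFrame index and that the list columns hold plain strings; the names `same` and `remaining` are made up for this example. Note that `df.loc[~same]` returns a new DataFrame rather than changing `df` in place, so the result is assigned to a variable.

```python
import pandas as pd

# Assumed reconstruction of the question's data, with F1..F5 as the index.
df = pd.DataFrame(
    {
        "RS": [["F1", "F2", "F3", "F4", "F5"], ["F2", "F3", "F5"],
               ["F2", "F3", "F4", "F5"], ["F4"], ["F2", "F3", "F4", "F5"]],
        "AS": [["F1"], ["F1", "F2", "F3", "F5"], ["F1", "F2", "F3", "F5"],
               ["F1", "F3", "F4", "F5"], ["F1", "F2", "F3", "F5"]],
        "IS": [["F1"], ["F5", "F3", "F2"], ["F5", "F3", "F2"],
               ["F4"], ["F5", "F3", "F2"]],
    },
    index=["F1", "F2", "F3", "F4", "F5"],
)

# Row-wise mask: True where RS and IS hold the same elements in any order.
same = df["RS"].apply(set) == df["IS"].apply(set)

# Step 1: mark the matching rows with 'I'.
df["Level"] = ""
df.loc[same, "Level"] = "I"

# Step 2: keep only the non-matching rows and drop the helper column.
remaining = df.loc[~same].drop(columns="Level")

print(list(remaining.index))  # ['F1', 'F3', 'F5']
```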