I have a DataFrame:
    RR                    AA                SS            LL
C1  [C1, C2, C3, C4, C5]  [C1]              [C1]
C2  [C2, C3, C5]          [C1, C2, C3, C5]  [C5, C3, C2]  I
C3  [C2, C3, C4, C5]      [C1, C2, C3, C5]  [C5, C3, C2]
C4  [C4]                  [C1, C3, C4, C5]  [C4]          I
C5  [C2, C3, C4, C5]      [C1, C2, C3, C5]  [C5, C3, C2]
I want to delete every row whose LL value is I, i.e. rows C2 and C4. I also need to remove the elements C2 and C4 from the lists in RR, AA and SS in the remaining rows, so that the output looks like this:
    RR            AA            SS        LL
C1  [C1, C3, C5]  [C1]          [C1]
C3  [C3, C5]      [C1, C3, C5]  [C5, C3]
C5  [C3, C5]      [C1, C3, C5]  [C5, C3]
I tried this code, but it only deletes the rows; it does not remove C2 and C4 from the lists in RR, AA and SS.
ix = df.RS.apply(set) == df.IS.apply(set)
df.loc[~ix]
The output I get still has C2 and C4 in the lists in RR, AA and SS, which I don't want:
    RR                    AA                SS            LL
C1  [C1, C2, C3, C4, C5]  [C1]              [C1]
C3  [C2, C3, C4, C5]      [C1, C2, C3, C5]  [C5, C3, C2]
C5  [C2, C3, C4, C5]      [C1, C2, C3, C5]  [C5, C3, C2]
CodePudding user response:
This should do it:
new_df = df.loc[df['LL'] != 'I', ['RR', 'AA', 'SS']].applymap(set).apply(lambda col: col - {'C2', 'C4'}).applymap(list)
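Output (displayed as sets; after the final applymap(list) each cell is a list, and the element order is not guaranteed because sets are unordered):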
>>> new_df
RR AA SS
C1 {C5, C3, C1} {C1} {C1}
C3 {C5, C3} {C1, C5, C3} {C5, C3}
C5 {C5, C3} {C1, C5, C3} {C5, C3}
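If the original element order matters, or the labels to remove should come from the LL column rather than being hard-coded, a list comprehension avoids the round trip through sets. The following is only a sketch: it assumes the C1..C5 labels are the DataFrame index (as the output above suggests), and the names flagged, to_drop and list_cols are made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question; the C1..C5 labels are assumed to be the index.
df = pd.DataFrame(
    {
        "RR": [["C1", "C2", "C3", "C4", "C5"], ["C2", "C3", "C5"],
               ["C2", "C3", "C4", "C5"], ["C4"], ["C2", "C3", "C4", "C5"]],
        "AA": [["C1"], ["C1", "C2", "C3", "C5"], ["C1", "C2", "C3", "C5"],
               ["C1", "C3", "C4", "C5"], ["C1", "C2", "C3", "C5"]],
        "SS": [["C1"], ["C5", "C3", "C2"], ["C5", "C3", "C2"], ["C4"],
               ["C5", "C3", "C2"]],
        "LL": ["", "I", "", "I", ""],
    },
    index=["C1", "C2", "C3", "C4", "C5"],
)

# Rows flagged with 'I' are dropped; their labels are also removed from the lists.
flagged = df["LL"] == "I"
to_drop = set(df.index[flagged])

list_cols = ["RR", "AA", "SS"]  # hypothetical name for the list-valued columns
new_df = df.loc[~flagged].copy()
for col in list_cols:
    # Build a new list per cell, keeping the original order of the survivors.
    new_df[col] = new_df[col].map(lambda items: [x for x in items if x not in to_drop])

print(new_df)
```

This keeps the LL column and leaves the original df untouched, since each cell gets a new list instead of being modified in place.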
CodePudding user response:
import pandas as pd

col1 = ['C1', 'C2', 'C3', 'C4', 'C5']
RR = [['C1', 'C2', 'C3', 'C4', 'C5'], ['C2', 'C3', 'C5'], ['C2', 'C3', 'C4', 'C5'],
      ['C4'], ['C2', 'C3', 'C4', 'C5']]
AA = [['C1'], ['C1', 'C2', 'C3', 'C5'], ['C1', 'C2', 'C3', 'C5'], ['C1', 'C3', 'C4', 'C5'],
      ['C1', 'C2', 'C3', 'C5']]
SS = [['C1'], ['C5', 'C3', 'C2'], ['C5', 'C3', 'C2'], ['C4'], ['C5', 'C3', 'C2']]
LL = ['', 'I', '', 'I', '']
df1 = pd.DataFrame({'col1': col1, 'RR': RR, 'AA': AA, 'SS': SS, 'LL': LL})

# Rows flagged with 'I': they are dropped, and their col1 values
# are removed from the lists in the remaining rows.
removing_row = df1.loc[df1['LL'] == 'I', 'col1']
removing_values = removing_row.values
df1.drop(removing_row.index, inplace=True)

for col in ['RR', 'AA', 'SS']:
    # Series.items() replaces iteritems(), which was removed in pandas 2.0
    for i, j in df1[col].items():
        for k in removing_values:
            if k in j:
                # list.remove() changes the list stored in the cell in place,
                # so no assignment back into the DataFrame is needed
                j.remove(k)
print(df1)