Home > other >  How do I drop values of a specific condition and replace those?
How do I drop values of a specific condition and replace those?

Time:08-25

My dataframe looks like this:

ID first second fourth fifth
1 one one Two Three
2 one Two Two Three
3 one Three Three Three
4 one one one one
5 one one two one

Code:

df = {'ID': [1, 2, 3, 4, 5],
        'first': ['one', 'one', 'one', 'one', 'one']
'second': ['one', 'two', 'three','one','one']
'fourth': ['two', 'two', 'three','one','two']
'fifth': ['three','three','three','one', 'one']
        }

I want to drop/delete those values in one row that appear in the next column (right) as well. So there are a lot of duplicates but if there is another value between one same value like in "ID" 5, then just the value of the second column should be deleted, so that the df looks like this in the end:

ID first second fourth fifth
1 one Two Three NaN
2 one Two Three NaN
3 one Three NaN NaN
4 one NaN NaN NaN
5 one two one NaN

CodePudding user response:

You can use drop_duplicates per row and reindex:

out = (df
 .set_index('ID')
 .apply(lambda s: (s2:=s.drop_duplicates())
                  .set_axis(s.index[:len(s2)]),
        axis=1)
 .reset_index().reindex(df.columns, axis=1)
)

output:

   ID first second fourth  fifth
0   1   one    Two  Three    NaN
1   2   one    Two  Three    NaN
2   3   one  Three    NaN    NaN
3   4   one    NaN    NaN    NaN
4   5   one    two    NaN    NaN

CodePudding user response:

You can just do shift then use the NaN replace the same

out = df.where(lambda x : df.ne(df.shift(1,axis=1))).transform(lambda x: sorted(x, key=pd.isnull),1)
Out[73]: 
  ID first second fourth fifth
0  1   one    Two  Three   NaN
1  2   one    Two  Three   NaN
2  3   one  Three    NaN   NaN
3  4   one    NaN    NaN   NaN
4  5   one    two    one   NaN
  • Related