Sorry, I can't figure out how to drop duplicate values horizontally, i.e. within each row rather than down a column. The function drop_duplicates
does not have an axis parameter.
I have this dataframe:
contact | phone1 | phone2 | phone3 | phone4 |
---|---|---|---|---|
1 | 1234 | 1234 | | |
2 | 12345 | 12345 |
And I want to have the following dataframe:
contact | phone1 | phone2 | phone3 | phone4 |
---|---|---|---|---|
1 | 1234 | | | |
2 | 12345 |
CodePudding user response:
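Option 1: Use stack, drop_duplicates and unstack, then reindex to restore the original columns. Note that drop_duplicates on the stacked values compares across the whole frame, not row by row, so this only works if no value repeats in a different row.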
df.stack().drop_duplicates().unstack().reindex(columns=df.columns).fillna('')
contact phone1 phone2 phone3 phone4
0 1.0 1234.0
1 2.0 12345.0
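One thing worth checking with the stack approach is what happens when the same number shows up for two different contacts. Here is a minimal sketch on made-up data (the shared number 1234 and the value 5678 are assumptions for illustration, not from the question). It compares the stacked version with a row-wise mask:

```python
import pandas as pd

# Hypothetical data: both contacts share the number 1234.
df = pd.DataFrame({
    "contact": [1, 2],
    "phone1": [1234, 1234],
    "phone2": [1234, 5678],
})

# Stack-based approach: duplicates are detected across the whole frame.
stacked = (
    df.stack()
      .drop_duplicates()
      .unstack()
      .reindex(columns=df.columns)
)

# Row-wise approach: duplicates are detected inside each row only.
is_repeat = df.apply(lambda row: row.duplicated(), axis=1)
rowwise = df.mask(is_repeat)

print(stacked)
print(rowwise)
```

In this sketch the stacked version loses contact 2's phone1, because 1234 was already seen in row 0, while the row-wise version keeps it.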
Option 2: To replace duplicates row-wise while keeping the original NaN, first fill NaN with 0 (a value that does not exist in the df). Then mask the duplicates and fill them with '', and finally replace 0 with NaN to restore the missing values.
df.fillna(0).mask(df.apply(lambda x: x.duplicated(), axis=1)).fillna('').replace(0,np.nan)
contact phone1 phone2 phone3 phone4
0 1 1234.0 NaN
1 2 NaN 12345.0
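The same chain, spelled out step by step as a runnable sketch. The frame below is an assumption: it guesses that the empty cells in the question are NaN and sit in phone3 and phone4, and it relies on 0 never being a real value in the data.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's data (empty cells taken as NaN).
df = pd.DataFrame({
    "contact": [1, 2],
    "phone1": [1234, 12345],
    "phone2": [1234, 12345],
    "phone3": [np.nan, np.nan],
    "phone4": [np.nan, np.nan],
})

# Flag repeats within each row, computed on the original frame.
is_repeat = df.apply(lambda row: row.duplicated(), axis=1)

result = (
    df.fillna(0)            # placeholder for missing values; assumes 0 is never real data
      .mask(is_repeat)      # repeated values become NaN
      .fillna("")           # ...and are then blanked
      .replace(0, np.nan)   # placeholder goes back to NaN
)
print(result)
```

Note that duplicated() treats a second NaN in the same row as a repeat of the first, so in this sketch phone3 comes back as NaN while phone4 ends up blank.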
Option 3:
To get exactly the output you asked for, just mask the row-wise duplicates and fill everything that is missing with '':
df.mask(df.apply(lambda x: x.duplicated(), axis=1)).fillna('')
contact phone1 phone2 phone3 phone4
0 1.0 1234.0
1 2.0 12345.0
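For reuse, the one-liner can be wrapped in a small function. This is only a sketch: blank_row_duplicates is a hypothetical helper name, and the frame is an assumed reconstruction of the question's data with the empty cells taken as NaN.

```python
import numpy as np
import pandas as pd

def blank_row_duplicates(frame: pd.DataFrame, fill="") -> pd.DataFrame:
    """Hypothetical helper: keep the first occurrence of each value in a row
    and replace later repeats (and missing values) with `fill`."""
    is_repeat = frame.apply(lambda row: row.duplicated(), axis=1)
    return frame.mask(is_repeat).fillna(fill)

# Assumed reconstruction of the question's data (empty cells taken as NaN).
df = pd.DataFrame({
    "contact": [1, 2],
    "phone1": [1234, 12345],
    "phone2": [1234, 12345],
    "phone3": [np.nan, np.nan],
    "phone4": [np.nan, np.nan],
})

result = blank_row_duplicates(df)
print(result)
```

Keep in mind that the contact column takes part in the comparison too, so a phone number equal to the contact id would be blanked. If that can happen, apply the helper to the phone columns only.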