I have a dataframe like this:
date        A           B           C
01.01.2003  01.01.2003
02.01.2003
03.01.2003              03.01.2003
05.01.2003  05.01.2003
06.01.2003              06.01.2003
08.01.2003  08.01.2003  08.01.2003  08.01.2003
When the values in columns A, B and C are all equal, I want to delete the values in columns A and B and keep only the one in C, so the output would be:
date        A           B           C
01.01.2003  01.01.2003
02.01.2003
03.01.2003              03.01.2003
05.01.2003  05.01.2003
06.01.2003              06.01.2003
08.01.2003                          08.01.2003
I tried np.where, but I get an error saying the condition cannot be applied to timestamps:
np.where((df['A'] & df['B'] == df['C']),
         df['A'] & df['B'], '')
Thanks for any leads.
CodePudding user response:
You can use pandas.DataFrame.loc with two conditions on the row selection, namely A == B and B == C, and assign None to both the A and B fields.
df.loc[(df['A']==df['B']) & (df['B']==df['C']), ['A', 'B']] = [[None, None]]
Output
date A B C
0 01.01.2003 01.01.2003 None None
1 02.01.2003 None None None
2 03.01.2003 None 03.01.2003 None
3 05.01.2003 05.01.2003 None None
4 06.01.2003 None 06.01.2003 None
5 08.01.2003 None None 08.01.2003
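For anyone who wants to try this locally, here is a minimal, self-contained sketch of the same idea. The sample frame is rebuilt by hand from the question, and it assumes the dates are plain strings with empty cells stored as None; the name mask is only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question (assumption: dates are plain strings,
# empty cells are None).
df = pd.DataFrame({
    "date": ["01.01.2003", "02.01.2003", "03.01.2003",
             "05.01.2003", "06.01.2003", "08.01.2003"],
    "A": ["01.01.2003", None, None, "05.01.2003", None, "08.01.2003"],
    "B": [None, None, "03.01.2003", None, "06.01.2003", "08.01.2003"],
    "C": [None, None, None, None, None, "08.01.2003"],
})

# True only where A, B and C hold the same non-missing value.
# Missing values never compare equal, so all-empty rows are not selected.
mask = (df["A"] == df["B"]) & (df["B"] == df["C"])

# Blank out A and B on the matching rows; C is left untouched.
df.loc[mask, ["A", "B"]] = None

print(df)
```

Note that the two comparisons are each wrapped in parentheses before being combined with &, so & only ever sees boolean Series and never the date values themselves.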
CodePudding user response:
Use boolean indexing with the help of all:
df.loc[df[['A', 'B']].eq(df['C'], axis=0).all(axis=1), ['A', 'B']] = np.nan
Output:
date A B C
0 01.01.2003 01.01.2003 None None
1 02.01.2003 None None None
2 03.01.2003 NaN 03.01.2003 None
3 05.01.2003 05.01.2003 None None
4 06.01.2003 NaN 06.01.2003 None
5 08.01.2003 NaN NaN 08.01.2003
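Since the question mentions timestamps, here is a small sketch of the same eq/all approach on real datetime columns. It assumes the columns are parsed from day.month.year strings; the names raw and same are made up for this example, and pd.NaT is used instead of np.nan so the columns keep their datetime dtype.

```python
import pandas as pd

# Sample columns rebuilt from the question (assumption: day.month.year strings).
raw = pd.DataFrame({
    "A": ["01.01.2003", None, None, "05.01.2003", None, "08.01.2003"],
    "B": [None, None, "03.01.2003", None, "06.01.2003", "08.01.2003"],
    "C": [None, None, None, None, None, "08.01.2003"],
})

# Parse every column into Timestamps; missing cells become NaT.
df = raw.apply(pd.to_datetime, format="%d.%m.%Y")

# Compare A and B against C row by row, then require both to match.
# NaT never compares equal, so rows with empty cells are not selected.
same = df[["A", "B"]].eq(df["C"], axis=0).all(axis=1)

# Clear A and B on the matching rows only; C keeps its value.
df.loc[same, ["A", "B"]] = pd.NaT

print(df)
```

The mask here is built purely from equality comparisons, so no logical operator is ever applied to the timestamp values, which is presumably what triggered the error in the original np.where attempt.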