Comparing value present in row and column name

I have a dataframe as shown below:

                     A          B      0      3     4     5     8
timestamp       
2022-05-09 09:28:00  0          45     NaN   20    30    NaN    NaN
2022-05-09 09:28:01  3          100    NaN   20    NaN   30     20   
2022-05-09 09:28:02  4          30     NaN   NaN   10    NaN    40
2022-05-09 09:28:03  5          20     NaN   NaN   20    90     NaN
2022-05-09 09:28:04  8          10     NaN   NaN   10    30     NaN

The values present in A that is 0,3,4,5 and 8 are present as columns in the dataframe. The idea is in each row, value present in column A is noted and if the corresponding column has a value in it, it must be changed to NaN. For eg: In the second row of the dataframe, the value in column A is 3, so for the same row, the column 3 is checked, if it has a value, in this case 20, it must be changed to NaN (as shown below)

                     A          B      0      3     4     5     8
timestamp       
2022-05-09 09:28:00  0          45     NaN   20    30    NaN    NaN
2022-05-09 09:28:01  3          100    NaN   NaN   NaN   30     20   
2022-05-09 09:28:02  4          30     NaN   NaN   NaN   NaN    40
2022-05-09 09:28:03  5          20     NaN   NaN   20    NaN    NaN
2022-05-09 09:28:04  8          10     NaN   NaN   10    30     NaN

Is there a function in Pandas to do this directly?

Thanks in advance!

CodePudding user response：

If you have set columns you can use this update method. I have a simplified sample dataframe below

for i in [1,2,3]:
  df.loc[df['colval']==i,i]=None

Create dataframe

df=pd.DataFrame({'colval':[1,2,3],1:[1,2,4],2:[1,1,1],3:[1,2,3]})

initial dataframe

   colval    1  2  3
0       1    1  1  1
1       2    2  1  2
2       3    4  1  3

output

   colval    1    2    3
0       1  NaN  1.0  1.0
1       2  2.0  NaN  2.0
2       3  4.0  1.0  NaN

Applied this solution to your situation by looping through your columns and applying Conditional Update in Pandas

CodePudding user response：

You can try DataFrame.apply on rows

import numpy as np

df = df.apply(lambda row: row.mask(row.index == str(int(row['A'])), np.nan), axis=1)
#                                               ^^^ Depending your on actual data type, `str` is optional

print(df)

                       A      B   0     3     4     5     8
timestamp
2022-05-09 09:28:00  0.0   45.0 NaN  20.0  30.0   NaN   NaN
2022-05-09 09:28:01  3.0  100.0 NaN   NaN   NaN  30.0  20.0
2022-05-09 09:28:02  4.0   30.0 NaN   NaN   NaN   NaN  40.0
2022-05-09 09:28:03  5.0   20.0 NaN   NaN  20.0   NaN   NaN
2022-05-09 09:28:04  8.0   10.0 NaN   NaN  10.0  30.0   NaN