Convert all values after certain value row by row in a Dataframe Pandas

Time:09-20

I have a dataframe like this:

import pandas as pd
import numpy as np

data = [['A', 0, 0, 0, 0], ['B', 0, 'X', 0, 0], ['C', 'X', 0, 0, 0], ['D', 0, 0, 0, 'X'], ['E', 0, 0, 'X', 0]]
df = pd.DataFrame(data, columns=['GROUP', 'V1', 'V2', 'V3', 'V4'])

  GROUP V1 V2 V3 V4
0     A  0  0  0  0
1     B  0  X  0  0
2     C  X  0  0  0
3     D  0  0  0  X
4     E  0  0  X  0

I would like to convert all values after the X to NaN row by row. Here is the expected output:

data = [['A', 0, 0, 0, 0], ['B', 0, 'X', np.nan, np.nan], ['C', 'X', np.nan, np.nan, np.nan], ['D', 0, 0, 0, 'X'], ['E', 0, 0, 'X', np.nan]]
df_desired = pd.DataFrame(data, columns=['GROUP', 'V1', 'V2', 'V3', 'V4'])

  GROUP V1   V2   V3   V4
0     A  0    0    0    0
1     B  0    X  NaN  NaN
2     C  X  NaN  NaN  NaN
3     D  0    0    0    X
4     E  0    0    X  NaN

Is it possible to replace the values after the X using pandas?

CodePudding user response:

In [90]: df.mask(df.eq("X").cummax(axis=1).shift(1, axis=1, fill_value=False))
Out[90]:
  GROUP V1   V2   V3   V4
0     A  0    0    0    0
1     B  0    X  NaN  NaN
2     C  X  NaN  NaN  NaN
3     D  0    0    0    X
4     E  0    0    X  NaN
  • compare every cell to "X"; this gives a True/False frame
  • take the cumulative maximum row-wise; once a row hits True, it stays True
  • shift one column to the right so the X's themselves are excluded (fill_value=False fills the gap left in the leftmost column)
  • mask where True, i.e., put NaN in every cell that comes after an X
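
To make the intermediate frames easier to inspect, the same one-liner can be split into named steps. This is only a sketch of the expression above; the variable names (is_x, seen_x, after_x, result) are made up for illustration.

```python
import pandas as pd

data = [['A', 0, 0, 0, 0], ['B', 0, 'X', 0, 0], ['C', 'X', 0, 0, 0],
        ['D', 0, 0, 0, 'X'], ['E', 0, 0, 'X', 0]]
df = pd.DataFrame(data, columns=['GROUP', 'V1', 'V2', 'V3', 'V4'])

# Step 1: True exactly where a cell holds "X"
is_x = df.eq("X")

# Step 2: row-wise cumulative maximum; True from the first X to the end of the row
seen_x = is_x.cummax(axis=1)

# Step 3: shift one column to the right so the X itself is not flagged;
# the vacated leftmost column is filled with False
after_x = seen_x.shift(1, axis=1, fill_value=False)

# Step 4: replace every flagged cell with NaN
result = df.mask(after_x)
print(result)
```

Note that GROUP is compared to "X" as well, so this assumes no group label is literally "X".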

CodePudding user response:

Alternatively, keep values with where and a row-wise cumulative product:

In [27]: df.where(df.ne('X').cumprod(1).shift(axis=1).ne(0))
Out[27]:
  GROUP V1   V2   V3   V4
0     A  0    0    0    0
1     B  0    X  NaN  NaN
2     C  X  NaN  NaN  NaN
3     D  0    0    0    X
4     E  0    0    X  NaN
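
This answer comes without an explanation, so here is a hedged step-by-step reading of it. The variable names are hypothetical, and the explicit astype(int) is an addition that only makes the integer product visible; the original relies on the booleans being multiplied directly.

```python
import pandas as pd

data = [['A', 0, 0, 0, 0], ['B', 0, 'X', 0, 0], ['C', 'X', 0, 0, 0],
        ['D', 0, 0, 0, 'X'], ['E', 0, 0, 'X', 0]]
df = pd.DataFrame(data, columns=['GROUP', 'V1', 'V2', 'V3', 'V4'])

# 1 where the cell is not "X", 0 where it is
not_x = df.ne('X').astype(int)

# Row-wise cumulative product: stays 1 until the first X, then 0 for the rest of the row
before_x = not_x.cumprod(axis=1)

# Shift one column to the right so the X itself is kept;
# the leftmost column becomes NaN, and NaN != 0 evaluates to True
keep = before_x.shift(axis=1).ne(0)

# where() keeps cells flagged True and puts NaN everywhere else
result = df.where(keep)
print(result)
```

Here where keeps the cells up to and including the first X, which is the complement of what mask removes in the first answer.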