Overwrite values to NaN based on conditions (different for each column)

Time:07-07

I have the following dataframe:

                  A        B        C
2019-10-10  110.375  111.475  113.600
2019-10-11  111.125  112.325  114.225
2019-10-14  111.875  113.150  115.450
2019-10-15  112.175  113.500  115.600

For each column, I have a cutoff date (a date after which I don't need data for that column):

  Col      Cutoff
0   A  2019-10-10
1   B  2019-10-14
2   C  2019-10-15

For each column of the df, I want to set cell values to NaN if the date (the index of the df) is after the cutoff date for that column. This is the expected output:

                  A        B        C
2019-10-10  110.375  111.475  113.600
2019-10-11      NaN  112.325  114.225
2019-10-14      NaN  113.150  115.450
2019-10-15      NaN      NaN  115.600

I guess I could do this with a loop column by column, but maybe there's a more efficient way?
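
For reference, the column-by-column loop mentioned above might look like the following sketch. The construction of `df` and `df1` is an assumption: it rebuilds the two tables shown in the question with a datetime index and datetime cutoffs, and `result` is just an illustrative name.

```python
import numpy as np
import pandas as pd

# Assumed setup: the question's data, with dates parsed as datetimes
df = pd.DataFrame(
    {"A": [110.375, 111.125, 111.875, 112.175],
     "B": [111.475, 112.325, 113.150, 113.500],
     "C": [113.600, 114.225, 115.450, 115.600]},
    index=pd.to_datetime(["2019-10-10", "2019-10-11", "2019-10-14", "2019-10-15"]),
)
df1 = pd.DataFrame({
    "Col": ["A", "B", "C"],
    "Cutoff": pd.to_datetime(["2019-10-10", "2019-10-14", "2019-10-15"]),
})

# Baseline: one pass per column, blanking the rows dated after that column's cutoff
result = df.copy()
for col, cutoff in df1.set_index("Col")["Cutoff"].items():
    result.loc[result.index > cutoff, col] = np.nan

print(result)
```

This does one boolean comparison and one assignment per column, so it is probably fine for a handful of columns. A vectorized version mainly pays off when there are many columns.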

CodePudding user response:

Use Index.repeat to build a helper DataFrame with the same shape and columns as the original, then compare it against the cutoffs with DataFrame.gt and set the matching cells to NaN with DataFrame.mask:

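import pandas as pd

# Assumes df.index and df1['Cutoff'] are both datetimes (or otherwise comparable)
# Map each column name to its cutoff date
s = df1.set_index('Col')['Cutoff']

# Helper DataFrame shaped like df, where every cell holds its row's index date;
# .gt(s) aligns s on the column labels, so m is True where date > cutoff
lens = len(df.columns)
m = pd.DataFrame(df.index.repeat(lens).to_numpy().reshape(-1, lens),
                 index=df.index,
                 columns=df.columns).gt(s)

# Replace the cells flagged in m with NaN
df = df.mask(m)
print(df)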
                  A        B        C
2019-10-10  110.375  111.475  113.600
2019-10-11      NaN  112.325  114.225
2019-10-14      NaN  113.150  115.450
2019-10-15      NaN      NaN  115.600
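
As a possible alternative, the same mask can be built with NumPy broadcasting instead of a repeated-index helper DataFrame. This is only a sketch, not the answer's method: the setup below is an assumption that rebuilds the question's data with datetime values, and `cutoffs`, `after_cutoff` and `result` are illustrative names.

```python
import pandas as pd

# Assumed setup: the question's data, with dates parsed as datetimes
df = pd.DataFrame(
    {"A": [110.375, 111.125, 111.875, 112.175],
     "B": [111.475, 112.325, 113.150, 113.500],
     "C": [113.600, 114.225, 115.450, 115.600]},
    index=pd.to_datetime(["2019-10-10", "2019-10-11", "2019-10-14", "2019-10-15"]),
)
df1 = pd.DataFrame({
    "Col": ["A", "B", "C"],
    "Cutoff": pd.to_datetime(["2019-10-10", "2019-10-14", "2019-10-15"]),
})

# Put the cutoffs in the same order as df's columns
cutoffs = df1.set_index("Col")["Cutoff"].reindex(df.columns)

# Compare a (rows, 1) column of dates with a (1, cols) row of cutoffs;
# broadcasting yields a (rows, cols) boolean array, True where date > cutoff
after_cutoff = df.index.to_numpy()[:, None] > cutoffs.to_numpy()[None, :]

# Blank out the flagged cells
result = df.mask(after_cutoff)
print(result)
```

The `reindex` step matters because the raw NumPy comparison does not align on labels the way `DataFrame.gt` does, so the cutoffs have to be in column order before comparing.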