How to drop last not NaN value in Dataframe?

Time:10-12

 a1   a2   a3  Last_Not_NaN_Value
  1  NaN  NaN                   1
  0    0  NaN                   0
NaN    5  NaN                   5

So far I've managed to get the last non-NaN value in each row this way:

data.ffill(axis=1).iloc[:, -1] 

But I also need to replace that value with NaN (that is, drop it from the DataFrame).
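For reference, the example table can be reproduced roughly as follows. This is a minimal sketch: the name data follows the question, the values are copied from the table, and Last_Not_NaN_Value is computed rather than typed in.

```python
import numpy as np
import pandas as pd

# Example frame from the question (only the three data columns).
data = pd.DataFrame({
    "a1": [1, 0, np.nan],
    "a2": [np.nan, 0, 5],
    "a3": [np.nan, np.nan, np.nan],
})

# Forward-fill along each row, then read the last column:
# it now holds the last non-NaN value of every row.
last_not_nan = data.ffill(axis=1).iloc[:, -1]
print(last_not_nan.tolist())  # [1.0, 0.0, 5.0]
```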

CodePudding user response:

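Create a boolean mask that identifies the non-NaN values, then take its cumulative sum along axis=1. In each row the cumulative sum reaches its maximum at the last non-NaN value and stays there for any trailing NaNs, so masking the original DataFrame wherever the cumulative sum equals the row maximum removes exactly that value (the trailing NaNs are masked too, which changes nothing):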

m = df.notna()
s = m.cumsum(1)
df.mask(s.eq(s.max(1), axis=0))

    a1  a2  a3
0  NaN NaN NaN
1  0.0 NaN NaN
2  NaN NaN NaN

PS: There is no need to create the intermediate column Last_Not_NaN_Value.
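The steps of this answer can be traced on the question's data. The sketch below assumes df holds only the columns a1, a2 and a3; the names to_blank and result are introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a1": [1, 0, np.nan],
    "a2": [np.nan, 0, 5],
    "a3": [np.nan, np.nan, np.nan],
})

m = df.notna()          # True where a value is present
s = m.cumsum(axis=1)    # running count of present values, left to right
print(s.to_numpy().tolist())  # [[1, 1, 1], [1, 2, 2], [0, 1, 1]]

# The count first hits its row maximum at the last present value and
# stays there for the trailing NaNs, so this marks that value and
# everything to its right.
to_blank = s.eq(s.max(axis=1), axis=0)
result = df.mask(to_blank)
print(result)
```

Only the 0 in row 1, column a1 survives, which matches the output shown in the answer.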

CodePudding user response:

One way is to use last_valid_index on each row:

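import numpy as np

df = df[['a1', 'a2', 'a3']].copy()  # just in case: keep only the data columns
for i, r in df.iterrows():
    col = r.last_valid_index()  # None when the whole row is NaN
    if col is not None:
        df.loc[i, col] = np.nan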
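To see what last_valid_index returns for each row, here is a small sketch on the question's data. The fourth, all-NaN row is a hypothetical addition, included only to show the None case; last_cols is likewise a name made up for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a1": [1, 0, np.nan, np.nan],
    "a2": [np.nan, 0, 5, np.nan],
    "a3": [np.nan, np.nan, np.nan, np.nan],
})

# Column label of the last non-NaN entry in each row (None if there is none).
last_cols = [row.last_valid_index() for _, row in df.iterrows()]
print(last_cols)  # ['a1', 'a2', 'a2', None]
```

Because an empty row yields None, it is worth checking for it before using the result as a column label.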

CodePudding user response:

import numpy as np
import pandas as pd

seq = (
    df    # relabel rows and columns by their ordinal positions
    .set_axis(range(df.shape[0]), axis=0)
    .set_axis(range(df.shape[1]), axis=1)
    .agg(pd.DataFrame.last_valid_index, 1)
)
# np.nan instead of pd.NA: pd.NA cannot be written into a float NumPy array
df.values[seq.index, seq] = np.nan

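Here:

  • df is the given DataFrame;
  • seq maps each row position to the column position of its last valid (non-NaN) value;
  • df.values is a NumPy array. For a frame with a single dtype it is often a view of the underlying data, but this is not guaranteed (and with Copy-on-Write enabled the array is read-only), so check that the assignment actually changed df;
  • values[seq.index, seq] uses integer array indexing, which selects one arbitrary cell per (row, column) pair. Where the array is a view of the original data, assigning to this selection changes df itself.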
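If relying on df.values being a writable view is a concern, the same integer array indexing can be applied to an explicit copy. This is a sketch of a variation, not the answerer's code: it assumes all columns are numeric, and the names arr, valid, last_pos, rows and result are made up for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a1": [1, 0, np.nan],
    "a2": [np.nan, 0, 5],
    "a3": [np.nan, np.nan, np.nan],
})

arr = df.to_numpy(dtype=float, copy=True)   # explicit copy, safe to modify
valid = ~np.isnan(arr)

# Position of the last non-NaN entry per row: reverse the columns,
# find the first True, and convert back to a left-to-right position.
last_pos = arr.shape[1] - 1 - valid[:, ::-1].argmax(axis=1)
rows = np.flatnonzero(valid.any(axis=1))    # skip rows that are all NaN

# Integer array indexing: each pair (rows[k], last_pos[rows[k]]) picks one cell.
arr[rows, last_pos[rows]] = np.nan
result = pd.DataFrame(arr, index=df.index, columns=df.columns)
print(result)
```

The original df is left untouched here; result is a new frame with the last non-NaN value of each row replaced by NaN.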