Home > front end >  Get name of the column with last value in the row
Get name of the column with last value in the row

Time:12-04

I am doing cohort analysis and the dataset I'm using has 15 months as the name as columns with revenue and around 7k user_id rows. I need to get a new column with the month when the user was last time active.

2021-01-01 2021-02-01
3456. Nan
Nan. 8679

Result should be like this

2021-01-01 2021-02-01 Last_month
3456. Nan 2021-01-01
Nan. 8679 2021-02-01

I have tried few options but it didnt work

users.apply(pd.Series.last_valid_index)

CodePudding user response:

using a boolean and idxmax() might be the solution here

df['last_month'] = (~df.isna()).idxmax(axis=1)


print(df)


  2021-01-01 2021-02-01  last_month
0       3456        NaN  2021-01-01
1        NaN       8679  2021-02-01

CodePudding user response:

Example

data = {'2021-01-01': {0: 3456, 1: None}, '2021-02-01': {0: None, 1: 8679}}
df = pd.DataFrame(data)

df

    2021-01-01  2021-02-01
0   3456.0      NaN
1   NaN         8679.0

Code

df.apply(lambda x: x.last_valid_index(), axis=1)

output:

0    2021-01-01
1    2021-02-01
dtype: object

full Code

make output to Last_month column

df.assign(Last_month=df.apply(lambda x: x.last_valid_index(), axis=1))

result:

    2021-01-01  2021-02-01  Last_month
0   3456.0      NaN         2021-01-01
1   NaN         8679.0      2021-02-01
  • Related