Groupby and convert the first value to NaN

Time:05-03

My df

   id  date    dummy
0  A   2019Q1    1
1  A   2019Q2    0
2  A   2019Q3    0
3  B   2019Q1    1
4  B   2019Q2    1
5  B   2019Q3    0

How can I group by id and then set the earliest dummy value in each group to NaN?

Expected output

   id  date    dummy
0  A   2019Q1    NaN
1  A   2019Q2    0
2  A   2019Q3    0
3  B   2019Q1    NaN
4  B   2019Q2    1
5  B   2019Q3    0

CodePudding user response:

Use a boolean mask (assuming the rows are already sorted by date within each group):

import numpy as np
# ~duplicated() is True only on the first row of each id
df.loc[~df['id'].duplicated(), 'dummy'] = np.nan
print(df)

# Output
  id    date  dummy
0  A  2019Q1    NaN
1  A  2019Q2    0.0
2  A  2019Q3    0.0
3  B  2019Q1    NaN
4  B  2019Q2    1.0
5  B  2019Q3    0.0

Or:

# cumcount() numbers the rows within each id from 0, so eq(0) marks the first row
df.loc[df.groupby('id').cumcount().eq(0), 'dummy'] = np.nan
print(df)

# Output
  id    date  dummy
0  A  2019Q1    NaN
1  A  2019Q2    0.0
2  A  2019Q3    0.0
3  B  2019Q1    NaN
4  B  2019Q2    1.0
5  B  2019Q3    0.0
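
Both masks above pick the first row of each id as it appears in the frame. If the rows might not be sorted, one possible approach is to sort first and take the index label of each group's first row. The sketch below is only an illustration under a few assumptions: the helper name blank_earliest is made up, the index labels are unique, and the date strings all share the "YYYYQn" format so that plain string sorting matches chronological order. The column is cast to float up front so it can hold NaN without relying on implicit upcasting.

```python
import numpy as np
import pandas as pd


def blank_earliest(df, group_col="id", order_col="date", value_col="dummy"):
    """Return a copy of df with value_col set to NaN on each group's earliest row.

    Hypothetical helper; assumes unique index labels and sortable order_col values.
    """
    out = df.copy()
    # Cast to float so the column can hold NaN.
    out[value_col] = out[value_col].astype("float64")
    # Sort by group and date, keep the first row per group, and take its index label.
    first_idx = (
        out.sort_values([group_col, order_col])
        .drop_duplicates(group_col)
        .index
    )
    out.loc[first_idx, value_col] = np.nan
    return out


# Sample data from the question, deliberately shuffled.
df = pd.DataFrame({
    "id": ["A", "B", "A", "B", "A", "B"],
    "date": ["2019Q2", "2019Q3", "2019Q1", "2019Q1", "2019Q3", "2019Q2"],
    "dummy": [0, 0, 1, 1, 0, 1],
})
result = blank_earliest(df)
print(result)
```

Here the rows with index labels 2 and 3 (the 2019Q1 rows for A and B) end up as NaN, and the original df is left unchanged because the helper works on a copy.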

CodePudding user response:

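import numpy as np
import pandas as pd

# Index label of the first row in each id group
indices = df.reset_index().groupby("id")["index"].first().to_list()

# Set dummy to NaN on those rows
df.loc[indices, 'dummy'] = np.nan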
