Groupby id and change values for all rows for the earliest date to NaN-CodePudding

I have the following id, i would like to groupby id and then replace value X with NaN. My current df.


 ID    Date        X      other variables..
 1   1/1/18   0.118758835
 1   1/1/18   0.148103273 
 1   1/1/18   0.365541214 
 1   1/2/18   0.405002687 
 1   1/2/18   0.130580643
 1   1/2/18   0.395113106 
 2   1/1/18   0.425580038 
 2   1/1/18   0.889677796 
 2   1/1/18   0.835311629
 2   1/2/18   0.8375818   
 2   1/2/18   0.648162611 
 2   1/2/18   0.639060695

desired output


 ID    Date        X      other variables..
 1   1/1/18       NaN
 1   1/1/18       NaN
 1   1/1/18       NaN
 1   1/2/18   0.405002687 
 1   1/2/18   0.130580643
 1   1/2/18   0.395113106 
 2   1/1/18       NaN
 2   1/1/18       NaN
 2   1/1/18       NaN
 2   1/2/18   0.8375818   
 2   1/2/18   0.648162611 
 2   1/2/18   0.639060695

CodePudding user response：

You can call min in groupby.transform to get the earliest dates for each ID; then compare it with "Date" to get a boolean mask; finally use the mask to mask earliest "X"s:

df['X'] = df['X'].mask(df.groupby('ID')['Date'].transform('min').eq(df['Date']))

Output:

    ID    Date         X
0    1  1/1/18       NaN
1    1  1/1/18       NaN
2    1  1/1/18       NaN
3    1  1/2/18  0.405003
4    1  1/2/18  0.130581
5    1  1/2/18  0.395113
6    2  1/1/18       NaN
7    2  1/1/18       NaN
8    2  1/1/18       NaN
9    2  1/2/18  0.837582
10   2  1/2/18  0.648163
11   2  1/2/18  0.639061

CodePudding user response：

You can try

m = (df.groupby('ID')
     .apply(lambda g: g['Date'].eq(pd.to_datetime(df['Date']).min().strftime('%-m/%-d/%y')))
     .reset_index(drop=True))

df.loc[m, 'X'] = pd.NA

print(df)

    index  ID    Date         X         Y Category
0       0   1  1/1/18      <NA>  0.954677        A
1       1   1  1/1/18      <NA>  0.976618        B
2       2   1  1/1/18      <NA>  0.551642        C
3       3   1  1/2/18  0.405003  0.343279        A
4       4   1  1/2/18  0.130581  0.144487        B
5       5   1  1/2/18  0.395113  0.113119        C
6       6   2  1/1/18      <NA>  0.725166        A
7       7   2  1/1/18      <NA>  0.386824        B
8       8   2  1/1/18      <NA>  0.363803        C
9       9   2  1/2/18  0.837582  0.769266        A
10     10   2  1/2/18  0.648163  0.075286        B
11     11   2  1/2/18  0.639061  0.791222        C