How to create a flag column indicating change based on the change in value from the previous value t-CodePudding

I have a data frame that is sorted by CLIENT_ID and ENCOUNTER_DATE as shown below:

CLIENT_ID	ENCOUNTER_DATE	STAGE
8222	2020-01-01	1
8222	2020-03-02	1
8222	2020-04-18	2
8222	2020-07-31	1
8300	2017-06-10	1
8300	2017-09-11	2
8300	2018-02-01	2
8300	2018-04-01	3
8300	2018-05-31	4
8400	2020-12-31	1
8401	2017-08-29	1
8401	2017-09-15	3
8500	2018-10-10	2

How can I create a new column (flag column) which indicates whether or not the STAGE from the previous DATE_ENCOUNTER for each CLIENT_ID is greater than the current DATE_ENCOUNTER which would result in the table below:

CLIENT_ID	ENCOUNTER_DATE	STAGE	STAGE_WORSENED
8222	2020-01-01	1	0
8222	2020-03-02	1	0
8222	2020-04-18	2	1
8222	2020-07-31	1	0
8300	2017-06-10	1	0
8300	2017-09-11	2	1
8300	2018-02-01	2	0
8300	2018-04-01	3	1
8300	2018-05-31	4	1
8400	2020-12-31	1	0
8401	2017-08-29	1	0
8401	2017-09-15	3	1
8500	2018-10-10	2	0

Here is the code to generate the df:

df = pd.DataFrame({"CLIENT_ID": [8222, 8222, 8222, 8222, 8300, 8300, 8300, 8300, 8300, 8400, 8401, 8401, 8500],
                   "ENCOUNTER_DATE": ['2020-01-01', '2020-03-02', '2020-04-18', '2020-07-31', '2017-06-10', '2017-09-11', '2018-02-01', '2018-04-01', '2018-05-31', '2020-12-31', '2017-08-29', '2017-09-15', '2018-10-10'],
                   "STAGE": [1, 1, 2, 1, 1, 2, 2, 3, 4, 1, 1, 3, 2]})

CodePudding user response：

Use DataFrameGroupBy.diff with compare for greater or equal by 1 and cast to integers:

df['STAGE_WORSENED'] = df.groupby('CLIENT_ID')['STAGE'].diff().ge(1).astype(int)
print (df)
    CLIENT_ID ENCOUNTER_DATE  STAGE  STAGE_WORSENED
0        8222     2020-01-01      1               0
1        8222     2020-03-02      1               0
2        8222     2020-04-18      2               1
3        8222     2020-07-31      1               0
4        8300     2017-06-10      1               0
5        8300     2017-09-11      2               1
6        8300     2018-02-01      2               0
7        8300     2018-04-01      3               1
8        8300     2018-05-31      4               1
9        8400     2020-12-31      1               0
10       8401     2017-08-29      1               0
11       8401     2017-09-15      3               1
12       8500     2018-10-10      2               0