Home > Enterprise >  How to create a flag column indicating change based on the change in value from the previous value t
How to create a flag column indicating change based on the change in value from the previous value t

Time:09-16

I have a data frame that is sorted by CLIENT_ID and ENCOUNTER_DATE as shown below:

CLIENT_ID ENCOUNTER_DATE STAGE
8222 2020-01-01 1
8222 2020-03-02 1
8222 2020-04-18 2
8222 2020-07-31 1
8300 2017-06-10 1
8300 2017-09-11 2
8300 2018-02-01 2
8300 2018-04-01 3
8300 2018-05-31 4
8400 2020-12-31 1
8401 2017-08-29 1
8401 2017-09-15 3
8500 2018-10-10 2

How can I create a new column (flag column) which indicates whether or not the STAGE from the previous DATE_ENCOUNTER for each CLIENT_ID is greater than the current DATE_ENCOUNTER which would result in the table below:

CLIENT_ID ENCOUNTER_DATE STAGE STAGE_WORSENED
8222 2020-01-01 1 0
8222 2020-03-02 1 0
8222 2020-04-18 2 1
8222 2020-07-31 1 0
8300 2017-06-10 1 0
8300 2017-09-11 2 1
8300 2018-02-01 2 0
8300 2018-04-01 3 1
8300 2018-05-31 4 1
8400 2020-12-31 1 0
8401 2017-08-29 1 0
8401 2017-09-15 3 1
8500 2018-10-10 2 0

Here is the code to generate the df:

df = pd.DataFrame({"CLIENT_ID": [8222, 8222, 8222, 8222, 8300, 8300, 8300, 8300, 8300, 8400, 8401, 8401, 8500],
                   "ENCOUNTER_DATE": ['2020-01-01', '2020-03-02', '2020-04-18', '2020-07-31', '2017-06-10', '2017-09-11', '2018-02-01', '2018-04-01', '2018-05-31', '2020-12-31', '2017-08-29', '2017-09-15', '2018-10-10'],
                   "STAGE": [1, 1, 2, 1, 1, 2, 2, 3, 4, 1, 1, 3, 2]})

CodePudding user response:

Use DataFrameGroupBy.diff with compare for greater or equal by 1 and cast to integers:

df['STAGE_WORSENED'] = df.groupby('CLIENT_ID')['STAGE'].diff().ge(1).astype(int)
print (df)
    CLIENT_ID ENCOUNTER_DATE  STAGE  STAGE_WORSENED
0        8222     2020-01-01      1               0
1        8222     2020-03-02      1               0
2        8222     2020-04-18      2               1
3        8222     2020-07-31      1               0
4        8300     2017-06-10      1               0
5        8300     2017-09-11      2               1
6        8300     2018-02-01      2               0
7        8300     2018-04-01      3               1
8        8300     2018-05-31      4               1
9        8400     2020-12-31      1               0
10       8401     2017-08-29      1               0
11       8401     2017-09-15      3               1
12       8500     2018-10-10      2               0
  • Related