replace the last 2 dates in one column by "2nd day" and "1st day" in a dataframe-CodePudding

I have this dataframe where the column date are the 2 latest working days(so they change every day when I run my code) in datetime format

               shortCode       date  ...       value TCE value
shortCode                            ...                      
A6TCE     4858     A6TCE 2022-12-19  ...         NaN   89857.0
          4859     A6TCE 2022-12-20  ...         NaN   80632.0
S2TCE     4370     S2TCE 2022-12-19  ...         NaN  103858.0
          4371     S2TCE 2022-12-20  ...         NaN   94453.0
TD1       242        TD1 2022-12-19  ...       56.44   27654.0
          243        TD1 2022-12-20  ...       54.89   24594.0

I wish to write a code which calculate dynamically the day-to-day value change for the column value and TCE value.

What is the easiest way? for the moment I have thought to pivot the df and calculate the difference with new columns, but then I wish to replace the dates first(let me know if you think this is not necessary because the aim is to append the day-to-day value change later)

desired look

               shortCode     date  ...       value TCE value
shortCode                            ...                      
A6TCE     4858     A6TCE 1st day  ...         NaN   89857.0
          4859     A6TCE 2nd day  ...         NaN   80632.0
S2TCE     4370     S2TCE 1st day  ...         NaN  103858.0
          4371     S2TCE 2nd day  ...         NaN   94453.0
TD1       242        TD1 1st day  ...       56.44   27654.0
          243        TD1 2nd day  ...       54.89   24594.0

CodePudding user response：

You can create a dictionary with the two values and their corresponding day mapping:

days= dict(zip(sorted(df["date"].unique()),["1st day","2nd day"]))

Then you can apply this dictionnary to your column:

df["date"] = df["date"].apply(lambda x: days[x])

CodePudding user response：

first_day=df["date"].min()
second_day=df["date"].max()

condition= [ df["date"]==first_day,df["date"]==second_day]
values= ['1st day','2nd day']

df['date']=np.select(condition,values)