I want to combine 3 diferent timestamp into one column of timestamp

I have 3 different column that log the time the job is created, rejected and completed. i want to combine it all into one Timestamp column.

This is an example data that i have

ID	j_date_created	j_date_completed	j_date_rejected
1	23.03.2021 07:34:34	23.03.2021 07:34:36	null
2	23.03.2021 07:34:34	23.03.2021 07:34:37	null
3	23.03.2021 07:34:34	null	04.04.2021 17:48:06

the end goal that i want is

ID	timestamp	Activity
1	23.03.2021 07:34:34	created
1	23.03.2021 07:34:36	completed
2	23.03.2021 07:34:34	created
2	23.03.2021 07:34:37	completed
3	23.03.2021 07:34:34	created
3	04.04.2021 17:48:06	rejected

I am writing the project with pandas

Thanks!!

CodePudding user response：

After renaming the columns, you can get the output you need using melt.

import numpy as np
import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3],
                   'j_date_created': ['23.03.2021 07:34:34', '23.03.2021 07:34:34', '23.03.2021 07:34:34'],
                   'j_date_completed': ['23.03.2021 07:34:36', '23.03.2021 07:34:37', np.NaN],
                   'j_date_rejected': [np.NaN, np.NaN, '04.04.2021 17:48:06']})

df.columns = ['ID', 'created', 'completed', 'rejected']

df2 = df.melt(id_vars=['ID'], value_vars=['created', 'completed', 'rejected']).dropna().sort_values(by='ID')
df2.columns = ['ID', 'Activity', 'Timestamp']
print(df2)

This stacks the timestamp values in the created, completed, and rejected columns and makes a new column with those old column names as values. It also drops rows with missing values in that column, then sorts by the ID column to get the correct grouping. Last is renaming the columns to replace the default column names provided by melt.

Output:

   ID   Activity            Timestamp
0   1    created  23.03.2021 07:34:34
3   1  completed  23.03.2021 07:34:36
1   2    created  23.03.2021 07:34:34
4   2  completed  23.03.2021 07:34:37
2   3    created  23.03.2021 07:34:34
8   3   rejected  04.04.2021 17:48:06