I have a log file that has large number of columns and I would like to merge timestamp columns into one as shown in the example below
start time End time process ID. . .
10:02 11:10 A 1
10:32 11:23 B 1
... .... A 2
... .... B 2
... .... C 2
Note: Single ID can have multiple similar processes e.g. multiple A in an ID
time process ID . . .
10:02 A_start 1
11:10 A_end 1
10:32 B_start 1
11:23 B_end 1
.... A_start 2
.... A_end 2
.... B_start 2
.... ... ..
Trying with melt on pandas but it seem I am missing something here
df.melt(id_vars=["ID", "process"],
var_name="End time",
)
CodePudding user response:
IIUC, you can use melt
with a bit of reshaping:
out = (df
.rename(columns={'start time': 'start', 'End time': 'end'})
.melt(['process', 'ID'], value_name='time')
.assign(process=lambda d: d['process'] '_' d.pop('variable'))
)
output:
process ID time
0 A_start 1 10:02
1 B_start 1 10:32
2 A_end 1 11:10
3 B_end 1 11:23
used input:
start time End time process ID
0 10:02 11:10 A 1
1 10:32 11:23 B 1