Home > Blockchain >  Pandas Time difference of each column
Pandas Time difference of each column

Time:03-24

I have this dataframe:

ID STEP 1 STEP 2 ... STEP 40
1 2022-03-08 09:23:35 2022-03-08 10:23:35 ... 2022-03-19 09:23:35
2 2022-03-08 09:23:35 2022-03-08 11:23:35 ... 2022-03-18 09:23:35

I want to turn it into this:

ID Time 1-2 Time 1-3 ... Time 39-40
1 296 days 08:08:05 297 days 08:08:05 ... 297 days 08:08:05
2 296 days 08:08:05 297 days 08:08:05 ... 297 days 08:08:05

I created a new dataframe:

index=[]
for item1 in range (1,40):
    for item2 in range (1,40):
        if item1  < item2:
            index.append("Time from Step "  str(item1) " to step "  str(item2))

df = pd.DataFrame(-1, index=np.arange(len(caseList)), columns=index)

But I don´t know how to fill in the times.

CodePudding user response:

Use vectorial code.

If you don't have datetime type, uncomment the second step:

out = (
 df.set_index('ID')
   #.apply(lambda c: pd.to_datetime(c))
   .diff(axis=1).iloc[:, 1:]
   .set_axis([f'Time {i 1}-{i 2}' for i in range(len(df.columns)-2)], axis=1)
   .reset_index()
)

output:

   ID        Time 1-2         Time 2-3
0   1 0 days 01:00:00 10 days 23:00:00
1   2 0 days 02:00:00  9 days 22:00:00

NB. I used "STEP 40" as "STEP 3" for the example here

  • Related