Home > Net >  Python: Pandas Pivot DataFrame
Python: Pandas Pivot DataFrame

Time:03-02

For the below DataFrame:

ID Reg Departure Arrival Date Time
1 ABC DUB LCY 22/0/22 15:23
2 ABC DUB LCY 22/0/22 15:27
3 CBA CPH HEL 22/0/22 12:21
4 CBA CPH HE 22/0/22 12:19

I would like to pivot so that it so that the resulting DataFrame is:

ID Reg Departure Arrival Date Time ID2 Time2
1 ABC DUB LCY 22/0/22 15:23 2 15:27
3 CBA CPH HEL 22/0/22 12:21 4 12:19

I've been playing around with GroupBy & Pivot but haven't managed to crack it, any help is appreciated!

CodePudding user response:

So this will work if you have just two of each Reg row. But if you have more duplicate rows you will need something more robust.

import data and pandas

import pandas as pd

output

    ID  Reg Departure   Arrival Date    Time
0   1   ABC DUB         LCY     22/0/22 15:23
1   2   ABC DUB         LCY     22/0/22 15:27
2   3   CBA CPH         HEL     22/0/22 12:21
3   4   CBA CPH         HE      22/0/22 12:19

make two frames with with 'firsts' and one with 'lasts'

final_df = df.drop_duplicates(keep='first', subset='Departure')
id_time_df = df.drop_duplicates(keep='last', subset='Departure')

Then merge the two frames

pd.merge(final_df, id_time_df[['ID', 'Reg', 'Time']], on='Reg')

output

    ID_x    Reg Departure   Arrival Date    Time_x  ID_y    Time_y
0   1       ABC DUB         LCY     22/0/22 15:23   2       15:27
1   3       CBA CPH         HEL     22/0/22 12:21   4       12:19

CodePudding user response:

import pandas as pd
import io
str = '''ID Reg Departure   Arrival Date    Time
1   ABC DUB LCY 22/0/22 15:23
2   ABC DUB LCY 22/0/22 15:27
3   CBA CPH HEL 22/0/22 12:21
4   CBA CPH HEL 22/0/22 12:19'''

df = pd.read_csv(io.StringIO(str), sep = '\s ')
df["grpseq"] = df.groupby(["Reg", "Departure", "Arrival", "Date"]).cumcount()

df_new = df.set_index(["Reg", "Departure", "Arrival", "Date", "grpseq"]).unstack(level = -1).reset_index()
df_new.columns = ["Reg", "Departure", "Arrival", "Date", "ID1", "ID2", "Time1", "Time2"]

df_new
    Reg  Departure  Arrival    Date    ID1  ID2 Time1   Time2
0   ABC      DUB       LCY     22/0/22  1   2   15:23   15:27
1   CBA      CPH       HEL     22/0/22  3   4   12:21   12:19

Where ID1 represent the ID column in your desire table. And a simple reorder of the columns will be done.

  • Related