I have a dataframe with datetime objects in columns 'start' . I want to sort these dates into new columns : evry time ID start to a new location with order
df = pd.DataFrame(data={'ID':['a1','a2','a1','a1','a2','a2'],
'location':['bali','mosta','road','joha','alabama','vinice'],
'start':[pd.to_datetime('2022-11-18 16:28:35'),
pd.to_datetime('2022-11-18 17:28:35'),
pd.to_datetime('2022-11-19 16:28:35'),
pd.to_datetime('2022-11-19 17:28:35'),
pd.to_datetime('2022-11-19 17:18:35'),
pd.to_datetime('2022-11-19 17:18:35')
]})
i want same yhinkg like that :
new_data = pd.DataFrame(data={'ID':['a1','a2',],
'location1':['bali','mosta'],
'start1':[pd.to_datetime('2022-11-18 16:28:35'),pd.to_datetime('2022-11-18 17:28:35') ],
'location2':['road','alabama'],
'start2': [pd.to_datetime('2022-11-19 16:28:35'),pd.to_datetime('2022-11-19 17:18:35')],
'location3':['joha','vinice'],
'start3': [pd.to_datetime('2022-11-19 17:28:35'),pd.to_datetime('2022-11-19 17:18:35')],
})
CodePudding user response:
Try:
df["tmp"] = df.groupby("ID").cumcount() 1
df = df.pivot(index="ID", columns="tmp")
df.columns = [f"{t}_{n}" for t, n in df.columns]
df = df[sorted(df, key=lambda k: ((int((i := k.split("_"))[1])), i[0]))]
print(df.reset_index())
Prints:
ID location_1 start_1 location_2 start_2 location_3 start_3
0 a1 bali 2022-11-18 16:28:35 road 2022-11-19 16:28:35 joha 2022-11-19 17:28:35
1 a2 mosta 2022-11-18 17:28:35 alabama 2022-11-19 17:18:35 vinice 2022-11-19 17:18:35