I have a data table that looks like this (note this is condensed, there are some ID with 2 TIMEs):
ID TIME
1 10:12
1 4:43
2 9:12
3 1:43
3 16:47
4 2:55
I would like to flatten it so it looks like this:
ID Time_1 Time_2
1 10:12 4:43
2 4:43
3 1:43 16:47
4 2:55
I'm reviewing 'flattening' options but have not yet found the correct answer.
https://pandas.pydata.org/pandas-docs/version/0.14.1/generated/pandas.Index.flatten.html
CodePudding user response:
Try to groupby and join your strings
df.groupby('ID')['TIME'].agg(' '.join).str.split(expand=True)
0 1
ID
1 10:12 4:43
2 9:12 None
3 1:43 16:47
4 2:55 None
CodePudding user response:
To format to exactly the required layout, you can use .pivot()
as follows:
(df.assign(serial='Time_' df.groupby('ID').cumcount().add(1).astype(str))
.pivot(index='ID', columns='serial', values='TIME')
.rename_axis(columns=None)
.reset_index()
)
Result:
ID Time_1 Time_2
0 1 10:12 4:43
1 2 9:12 NaN
2 3 1:43 16:47
3 4 2:55 NaN
If you want the missing values to show as blank, you can use:
(df.assign(serial='Time_' df.groupby('ID').cumcount().add(1).astype(str))
.pivot(index='ID', columns='serial', values='TIME').fillna('')
.rename_axis(columns=None)
.reset_index()
)
Result:
ID Time_1 Time_2
0 1 10:12 4:43
1 2 9:12
2 3 1:43 16:47
3 4 2:55