I have a large dataframe like this,
instance_id wake_speed_factor
wt_E70491_0.857 1,00001
wt_E70492_0.857 0,817721
wt_E70490_0.857 1,00054
wt_E70486_0.857 1
wt_E70493_0.857 0,926203
wt_E70484_0.857 0,865908
wt_E70487_0.857 1
wt_E70489_0.857 0,930648
wt_E70483_0.857 0,957561
wt_E70485_0.857 0,968135
wt_E70488_0.857 0,996474
wt_E70491_0.857 1,00001
wt_E70492_0.857 0,803226
wt_E70490_0.857 1,00029
wt_E70486_0.857 1
wt_E70493_0.857 0,852778
wt_E70484_0.857 0,828148
wt_E70487_0.857 1,00002
wt_E70489_0.857 0,881657
wt_E70483_0.857 0,896756
wt_E70485_0.857 0,921366
wt_E70488_0.857 0,999333
wt_E70491_0.857 1,00001
wt_E70492_0.857 0,878923
wt_E70490_0.857 0,999948
wt_E70486_0.857 1
wt_E70493_0.857 0,76646
wt_E70484_0.857 0,837149
wt_E70487_0.857 1,00003
wt_E70489_0.857 0,821626
wt_E70483_0.857 0,76071
wt_E70485_0.857 0,818493
wt_E70488_0.857 0,991048
Where this instance id repeats after every 11 cells. So, i wants to group them and arrange the data frame like given below.
I couldn't figure out how to convert this data frame to the data frame given below,
wt_E70491_0.857 wt_E70492_0.857 wt_E70490_0.857 wt_E70486_0.857 wt_E70493_0.857 wt_E70484_0.857 wt_E70487_0.857 wt_E70489_0.857 wt_E70483_0.857 wt_E70485_0.857 wt_E70488_0.857
1,00001 0,817721 1,00054 1 0,926203 0,865908 1 0,930648 0,957561 0,968135 0,996474
1,00001 0,803226 1,00029 1 0,852778 0,828148 1,00002 0,881657 0,896756 0,921366 0,999333
1,00001 0,878923 0,999948 1 0,76646 0,837149 1,00003 0,821626 0,76071 0,818493 0,991048
How can i do this?
CodePudding user response:
See the answer for question 10 on How can I pivot a dataframe for more info:
out = (
df
.assign(index=df.groupby("instance_id").cumcount())
.pivot(columns="instance_id", index="index", values="wake_speed_factor")
)
out:
instance_id wt_E70483_0.857 wt_E70484_0.857 wt_E70485_0.857 wt_E70486_0.857 ... wt_E70490_0.857 wt_E70491_0.857 wt_E70492_0.857 wt_E70493_0.857
index ...
0 0,957561 0,865908 0,968135 1 ... 1,00054 1,00001 0,817721 0,926203
1 0,896756 0,828148 0,921366 1 ... 1,00029 1,00001 0,803226 0,852778
2 0,76071 0,837149 0,818493 1 ... 0,999948 1,00001 0,878923 0,76646
[3 rows x 11 columns]
CodePudding user response:
You could use pivot_table and then explode:
pd.pivot_table(
df, values="wake_speed_factor", columns="instance_id", aggfunc=list
).reset_index(drop=True).explode(df["instance_id"].unique().tolist(), ignore_index=True)
instance_id wt_E70483_0.857 wt_E70484_0.857 wt_E70485_0.857 wt_E70486_0.857 wt_E70487_0.857 wt_E70488_0.857 wt_E70489_0.857 wt_E70490_0.857 wt_E70491_0.857 wt_E70492_0.857 wt_E70493_0.857
0 0,957561 0,865908 0,968135 1 1 0,996474 0,930648 1,00054 1,00001 0,817721 0,926203
1 0,896756 0,828148 0,921366 1 1,00002 0,999333 0,881657 1,00029 1,00001 0,803226 0,852778
2 0,76071 0,837149 0,818493 1 1,00003 0,991048 0,821626 0,999948 1,00001 0,878923 0,76646