The initial dataframe looks as follows:
>>> df
id param
1 4
1 15
1 3
2 2
2 7
4 8
4 6
4 11
How can I put only the first 2 values of each id into a new row? The resulting df should look as follows:
>>> df
col_a col_b
4 15
2 7
8 6
I tried to achieve this using transpose and iloc but did not succeed. The column names are just for clarification; it is sufficient if only an index (e.g. 0, 1, 2, ...) is displayed.
CodePudding user response:
You can use a double groupby on 'id': the first one keeps the first two rows of each group, and the second one collects each group's 'param' values into a list, which is then expanded into new columns. Lastly, rename the columns accordingly:
new = df.groupby('id').head(2).groupby('id',as_index=False).agg({'param':list}).param.apply(pd.Series)
new.columns = ['col_a', 'col_b']
Prints:
col_a col_b
0 4 15
1 2 7
2 8 6
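For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same double-groupby idea. The sample data is copied from the question. The intermediate names (first_two, lists) are only illustrative, and the final step uses the plain DataFrame constructor in place of apply(pd.Series), which is a substitution and not part of the answer above. It assumes every id has at least two rows.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 4, 4, 4],
    "param": [4, 15, 3, 2, 7, 8, 6, 11],
})

# First groupby: keep only the first two rows of each id
first_two = df.groupby("id").head(2)

# Second groupby: collect the remaining 'param' values of each id into a list
lists = first_two.groupby("id")["param"].agg(list)

# Expand the lists into columns; the constructor gives a fresh 0, 1, 2 index
new = pd.DataFrame(lists.tolist(), columns=["col_a", "col_b"])
print(new)
```

Building the frame from a list of lists avoids calling pd.Series once per row, which tends to matter only on larger inputs.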
CodePudding user response:
You can first take the first two rows of each group with groupby and head(2), and then split the resulting list into chunks of 2 elements:
a = df.groupby("id")['param'].head(2).tolist()
out = pd.DataFrame([a[i:i + 2] for i in range(0, len(a), 2)], columns=['col_a', 'col_b'])
print(out)
col_a col_b
0 4 15
1 2 7
2 8 6
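Note that chunking the flat list in pairs relies on every id contributing exactly two values. If some ids might have only one row, a variant based on cumcount and pivot may be safer. This is a sketch under that assumption, not something taken from the answers above; the helper column name pos is made up for illustration. An id with a single row would get NaN in col_b, and the column rename assumes at least one id has two rows.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 4, 4, 4],
    "param": [4, 15, 3, 2, 7, 8, 6, 11],
})

# 'pos' (hypothetical helper column) is the row's position within its id: 0, 1, 2, ...
tmp = df.assign(pos=df.groupby("id").cumcount())

# Keep only the first two positions of each id
tmp = tmp[tmp["pos"] < 2]

# One row per id, one column per position
out = tmp.pivot(index="id", columns="pos", values="param")
out.columns = ["col_a", "col_b"]
out = out.reset_index(drop=True)
print(out)
```

Because each value is placed by its position within the group, a short group cannot shift the values of the following ids into the wrong row.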