Combine every two rows of pandas dataframe into separate columns

Time:06-07

I have a dataframe with 3 columns (x, y and data). I need to combine every two rows so that I end up with two data columns. Essentially, rows with the same x and y coordinates should be merged into one while keeping the two data values separate, giving 4 columns (x, y, u and v).

Original data example:
x      y      data
-----------------------
1      1      0.2
1      1      0.5
1      2      0.7
1      2      0.2
...

Expected output:
x      y      u      v
----------------------------
1      1      0.2    0.5
1      2      0.7    0.2
...

CodePudding user response:

If sorting your dataframe by coordinates is not an issue, you can try the following. It assumes every (x, y) pair occurs exactly twice:


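import pandas as pd

df = pd.DataFrame(dict(
    x=[1,1,1,1],
    y=[1,1,2,2],
    data=[0.2,0.5,0.7,0.2]
)).sort_values(by=["x", "y"])

# Alternate "u"/"v" down the sorted rows (assumes an even number of rows)
df["label"] = ["u", "v"] * (len(df) // 2)

# Spread the labels into columns, one row per (x, y)
df = df.pivot(index=["x", "y"], columns="label", values="data").reset_index()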

print(df)

Output (index omitted):

x  y  u    v
1  1  0.2  0.5
1  2  0.7  0.2
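
The alternating label above depends on the row order after sorting. As a possible variant, the label can be derived from each row's position inside its (x, y) group with groupby(...).cumcount(), so no sort is needed. This is only a sketch: the names position, labelled and result are illustrative, it still assumes exactly two rows per (x, y) pair, and passing a list to pivot's index assumes pandas 1.1 or newer.

```python
import pandas as pd

df = pd.DataFrame({
    "x": [1, 1, 1, 1],
    "y": [1, 1, 2, 2],
    "data": [0.2, 0.5, 0.7, 0.2],
})

# Position of each row within its (x, y) group: 0 for the first row, 1 for the second
position = df.groupby(["x", "y"]).cumcount()

# Translate positions into the column names used in the question
labelled = df.assign(label=position.map({0: "u", 1: "v"}))

# One row per (x, y), with the labels spread into columns
result = (
    labelled.pivot(index=["x", "y"], columns="label", values="data")
    .reset_index()
    .rename_axis(columns=None)  # drop the leftover "label" name on the columns
)

print(result)
```

A third row for the same (x, y) pair would get a missing label here, so a group with more than two rows would need its own handling.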

CodePudding user response:

Group by x and y, collect each group's data values into a list, then build a dataframe from those lists and concatenate it column-wise with the grouped x and y columns:

tmp = df.groupby(['x','y'])['data'].apply(list).reset_index()
tmp = pd.concat([tmp.drop('data', axis=1), pd.DataFrame(tmp['data'].tolist(), columns=['u', 'v'])], axis=1)

Output:

>>> tmp
   x  y    u    v
0  1  1  0.2  0.5
1  1  2  0.7  0.2
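
A shorter variant of the same grouping idea, offered as a sketch rather than as part of the answer above: named aggregation can take the first and last value of each group directly. It assumes every (x, y) pair has exactly two rows, so that "first" and "last" are the two values wanted; the name pairs is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "x": [1, 1, 1, 1],
    "y": [1, 1, 2, 2],
    "data": [0.2, 0.5, 0.7, 0.2],
})

# For each (x, y) group, take the first data value as u and the last as v
pairs = (
    df.groupby(["x", "y"])["data"]
    .agg(u="first", v="last")
    .reset_index()
)

print(pairs)
```

If a group had only one row, u and v would both hold that single value, so the two-rows-per-pair assumption is worth checking first.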