Home > OS >  Changing the Structure of a Dataframe in Python
Changing the Structure of a Dataframe in Python

Time:10-15

I need help to change the structure to a pandas dataframe with many columns like the example:

original dataframe:

| xx | yy | zz | a | b | c | k |
|:---|:---|:---|:--|:--|:--|:--|
| x1 | y1 | z1 | 0 | 2 | 1 | 3 |
| x2 | y2 | z2 | 1 | 0 | 2 | 0 |

I need just the first 3 columns and change the rest

new dataframe:

| xx | yy | zz | valor | nueva columna| 
|:---|:---|:---|:--|:--|
| x1 | y1 | z1 | 0 | a |
| x1 | y1 | z1 | 2 | b |
| x1 | y1 | z1 | 1 | c |
| x1 | y1 | z1 | 3 | k |
| x2 | y2 | z2 | 1 | a |
| x2 | y2 | z2 | 0 | b |
| x2 | y2 | z2 | 2 | c |
| x2 | y2 | z2 | 0 | k |

I get a solution with a for loop, but in colab when the columns and rows are many the time is excessive

CodePudding user response:

df1 = df.set_index(['xx','yy','zz']).stack()
df1.reset_index()

enter image description here

CodePudding user response:

As suggested by @HenryYik, melt works and provide the expected result :

>>> pd.melt(df, id_vars=['xx', 'yy', 'zz'], value_vars=['a', 'b', 'c', 'k']).sort_values(by=['xx'])
    xx  yy  zz  variable    value
0   x1  y1  z1  a           0
2   x1  y1  z1  b           2
4   x1  y1  z1  c           1
6   x1  y1  z1  k           3
1   x2  y2  z2  a           1
3   x2  y2  z2  b           0
5   x2  y2  z2  c           2
7   x2  y2  z2  k           0

UPDATE : As suggested by @Nesha25, a sort_values is added to get the exact expected output. Thanks @Nesha25.

  • Related