Home > Back-end >  Rearranging Pandas Dataframe
Rearranging Pandas Dataframe

Time:10-12

I have a DataFrame as follows:

d = {'name': ['a', 'a','a','b','b','b'],
     'var': ['v1', 'v2', 'v3', 'v1', 'v2', 'v3'],
     'Yr1': [11, 21, 31, 41, 51, 61],
     'Yr2': [12, 22, 32, 42, 52, 62],
     'Yr3': [13, 23, 33, 43, 53, 63]}

df = pd.DataFrame(d)

name   var  Yr1 Yr2 Yr3
 a      v1  11  12  13
 a      v2  21  22  23
 a      v3  31  32  33
 b      v1  41  42  43
 b      v2  51  52  53
 b      v3  61  62  63

and I want to rearrange it to look like this:

name   Yr   v1  v2  v3
 a      1   11  21  31
 a      2   12  22  32
 a      3   13  23  33
 b      1   41  51  61
 b      2   42  52  62
 b      3   43  53  63

I am new to pandas and tried using other threads I found here but struggled to make it work. Any help would be much appreciated.

CodePudding user response:

Try this

import pandas as pd
d = {'name': ['a', 'a', 'a', 'b', 'b', 'b'],
     'var': ['v1', 'v2', 'v3', 'v1', 'v2', 'v3'],
     'Yr1': [11, 21, 31, 41, 51, 61],
     'Yr2': [12, 22, 32, 42, 52, 62],
     'Yr3': [13, 23, 33, 43, 53, 63]}

df = pd.DataFrame(d)


# Solution
df.set_index(['name', 'var'], inplace=True)
df = df.unstack().stack(0)

print(df.reset_index())

output:

var name level_1  v1  v2  v3
0      a     Yr1  11  21  31
1      a     Yr2  12  22  32
2      a     Yr3  13  23  33
3      b     Yr1  41  51  61
4      b     Yr2  42  52  62
5      b     Yr3  43  53  63

Reference: pandas.DataFrame.stack

CodePudding user response:

We can use pd.wide_to_long df.unstack here.

pd.wide_to_long doc:

With stubnames [‘A’, ‘B’], this function expects to find one or more groups of columns with format A-suffix1, A-suffix2,…, B-suffix1, B-suffix2,… You specify what you want to call this suffix in the resulting long format with j (for example j=’year’).

pd.wide_to_long(
    df, stubnames="Yr", i=["name", "var"], j="Y"
).squeeze().unstack(level=1).reset_index()

var name  Y  v1  v2  v3
0      a  1  11  21  31
1      a  2  12  22  32
2      a  3  13  23  33
3      b  1  41  51  61
4      b  2  42  52  62
5      b  3  43  53  63

We can use df.melt df.pivot here.

out = df.melt(id_vars=['name', 'var'], var_name='Yr')
out['Yr'] = out['Yr'].str.replace('Yr', '')
out.pivot(index=['name', 'Yr'], columns='var', values='value').reset_index()

var name Yr  v1  v2  v3
0      a  1  11  21  31
1      a  2  12  22  32
2      a  3  13  23  33
3      b  1  41  51  61
4      b  2  42  52  62
5      b  3  43  53  63

CodePudding user response:

Try groupby apply:

df.groupby("name").apply(
    lambda x: x.set_index("var").T.drop("name")
).reset_index().rename(columns={"level_1": "Yr"}).rename_axis(columns=None)

  name   Yr  v1  v2  v3
0    a  Yr1  11  21  31
1    a  Yr2  12  22  32
2    a  Yr3  13  23  33
3    b  Yr1  41  51  61
4    b  Yr2  42  52  62
5    b  Yr3  43  53  63

Or better:

df.pivot("var", "name", ["Yr1", "Yr2", "Yr3"]).T.sort_index(
    level=1
).reset_index().rename({"level_0": "Yr"}, axis=1).rename_axis(columns=None)

    Yr name  v1  v2  v3
0  Yr1    a  11  21  31
1  Yr2    a  12  22  32
2  Yr3    a  13  23  33
3  Yr1    b  41  51  61
4  Yr2    b  42  52  62
5  Yr3    b  43  53  63
  • Related