I have a DataFrame as follows:
d = {'name': ['a', 'a','a','b','b','b'],
'var': ['v1', 'v2', 'v3', 'v1', 'v2', 'v3'],
'Yr1': [11, 21, 31, 41, 51, 61],
'Yr2': [12, 22, 32, 42, 52, 62],
'Yr3': [13, 23, 33, 43, 53, 63]}
df = pd.DataFrame(d)
name var Yr1 Yr2 Yr3
a v1 11 12 13
a v2 21 22 23
a v3 31 32 33
b v1 41 42 43
b v2 51 52 53
b v3 61 62 63
and I want to rearrange it to look like this:
name Yr v1 v2 v3
a 1 11 21 31
a 2 12 22 32
a 3 13 23 33
b 1 41 51 61
b 2 42 52 62
b 3 43 53 63
I am new to pandas and tried using other threads I found here but struggled to make it work. Any help would be much appreciated.
CodePudding user response:
Try this
import pandas as pd
d = {'name': ['a', 'a', 'a', 'b', 'b', 'b'],
'var': ['v1', 'v2', 'v3', 'v1', 'v2', 'v3'],
'Yr1': [11, 21, 31, 41, 51, 61],
'Yr2': [12, 22, 32, 42, 52, 62],
'Yr3': [13, 23, 33, 43, 53, 63]}
df = pd.DataFrame(d)
# Solution
df.set_index(['name', 'var'], inplace=True)
df = df.unstack().stack(0)
print(df.reset_index())
output:
var name level_1 v1 v2 v3
0 a Yr1 11 21 31
1 a Yr2 12 22 32
2 a Yr3 13 23 33
3 b Yr1 41 51 61
4 b Yr2 42 52 62
5 b Yr3 43 53 63
Reference: pandas.DataFrame.stack
CodePudding user response:
We can use pd.wide_to_long
df.unstack
here.
With stubnames [‘A’, ‘B’], this function expects to find one or more groups of columns with format A-suffix1, A-suffix2,…, B-suffix1, B-suffix2,… You specify what you want to call this suffix in the resulting long format with j (for example j=’year’).
pd.wide_to_long(
df, stubnames="Yr", i=["name", "var"], j="Y"
).squeeze().unstack(level=1).reset_index()
var name Y v1 v2 v3
0 a 1 11 21 31
1 a 2 12 22 32
2 a 3 13 23 33
3 b 1 41 51 61
4 b 2 42 52 62
5 b 3 43 53 63
We can use df.melt
df.pivot
here.
out = df.melt(id_vars=['name', 'var'], var_name='Yr')
out['Yr'] = out['Yr'].str.replace('Yr', '')
out.pivot(index=['name', 'Yr'], columns='var', values='value').reset_index()
var name Yr v1 v2 v3
0 a 1 11 21 31
1 a 2 12 22 32
2 a 3 13 23 33
3 b 1 41 51 61
4 b 2 42 52 62
5 b 3 43 53 63
CodePudding user response:
Try groupby
apply:
df.groupby("name").apply(
lambda x: x.set_index("var").T.drop("name")
).reset_index().rename(columns={"level_1": "Yr"}).rename_axis(columns=None)
name Yr v1 v2 v3
0 a Yr1 11 21 31
1 a Yr2 12 22 32
2 a Yr3 13 23 33
3 b Yr1 41 51 61
4 b Yr2 42 52 62
5 b Yr3 43 53 63
Or better:
df.pivot("var", "name", ["Yr1", "Yr2", "Yr3"]).T.sort_index(
level=1
).reset_index().rename({"level_0": "Yr"}, axis=1).rename_axis(columns=None)
Yr name v1 v2 v3
0 Yr1 a 11 21 31
1 Yr2 a 12 22 32
2 Yr3 a 13 23 33
3 Yr1 b 41 51 61
4 Yr2 b 42 52 62
5 Yr3 b 43 53 63