I am trying to convert a dataframe
of structure:
ID ID2 ID3 R_u L_u R_sd L_sd
1 F G1 x y z t
2 M G2 x y z t
into
ID ID2 ID3 Side u sd
1 F G1 R x z
1 F G1 L y t
2 M G2 R x z
2 M G2 L y t
I used pandas.melt
function
df_melt = df(id_vars=[('ID')], value_vars=['R_u', 'L_u'],
var_name='Side', value_name = 'u')
but I couldn't find a way for more than four or six number of columns simultaneously. I guess I can start with melt and then feed each row using lambda
but I feel like I can do this automatically.
Any possible solution, please?
CodePudding user response:
Good use case for janitor
's pivot_longer
:
# pip install janitor
import janitor
out = df.pivot_longer(index="ID*", names_to=['Side', '.value'], names_sep='_',
sort_by_appearance=True # optional
)
Output:
ID ID2 ID3 Side u sd
0 1 F G1 R x z
1 1 F G1 L y t
2 2 M G2 R x z
3 2 M G2 L y t
With pure pandas, using reshaping and a MultiIndex:
cols = list(df.filter(like='ID'))
out = (df.set_index(cols)
.pipe(lambda d: d.set_axis(d.columns.str.split('_', expand=True), axis=1))
.rename_axis(columns=('Side', None))
.stack(0).reset_index()
)
Output:
ID ID2 ID3 Side sd u
0 1 F G1 L t y
1 1 F G1 R z x
2 2 M G2 L t y
3 2 M G2 R z x
CodePudding user response:
Set the ID column as index then split and expand the remaining columns to convert to multiindex then stack the level 0 of multiindex to reshape
s = df.set_index('ID')
s.columns = s.columns.str.split('_', expand=True)
s = s.stack(0)
Alternative approach with wide_to_long
pd.wide_to_long(df, i='ID', j='Side', stubnames=['R', 'L'], sep='_', suffix='\w ').stack().unstack(-2)
Result
sd u
ID
1 L t y
R z x
2 L t y
R z x
Another approach with wide_to_long
, which might be a bit longer:
temp = df.set_index(['ID','ID2','ID3'])
temp.columns = temp.columns.str.split("_").str[::-1].str.join("_")
(pd
.wide_to_long(
temp.reset_index(),
i= temp.index.names,
j = 'Side',
stubnames = ['u', 'sd'],
sep='_',
suffix='. ')
.reset_index()
)
ID ID2 ID3 Side u sd
0 1 F G1 R x z
1 1 F G1 L y t
2 2 M G2 R x z
3 2 M G2 L y t