Melt four (or more) dataframe columns into two rows-CodePudding

I am trying to convert a dataframe of structure:

ID ID2 ID3 R_u L_u R_sd L_sd
1  F   G1   x   y   z    t
2  M   G2   x   y   z    t

into

ID ID2 ID3 Side u sd
1  F   G1   R    x z
1  F   G1   L    y t 
2  M   G2   R    x z
2  M   G2   L    y t

I used pandas.melt function

df_melt = df(id_vars=[('ID')], value_vars=['R_u', 'L_u'], 
                                  var_name='Side', value_name = 'u')

but I couldn't find a way for more than four or six number of columns simultaneously. I guess I can start with melt and then feed each row using lambda but I feel like I can do this automatically.

Any possible solution, please?

CodePudding user response：

Good use case for janitor's pivot_longer:

# pip install janitor
import janitor

out = df.pivot_longer(index="ID*", names_to=['Side', '.value'], names_sep='_',
                      sort_by_appearance=True # optional
                      )

Output:

   ID ID2 ID3 Side  u sd
0   1   F  G1    R  x  z
1   1   F  G1    L  y  t
2   2   M  G2    R  x  z
3   2   M  G2    L  y  t

With pure pandas, using reshaping and a MultiIndex:

cols = list(df.filter(like='ID'))

out = (df.set_index(cols)
         .pipe(lambda d: d.set_axis(d.columns.str.split('_', expand=True), axis=1))
         .rename_axis(columns=('Side', None))
         .stack(0).reset_index()
      )

Output:

   ID ID2 ID3 Side sd  u
0   1   F  G1    L  t  y
1   1   F  G1    R  z  x
2   2   M  G2    L  t  y
3   2   M  G2    R  z  x

CodePudding user response：

Set the ID column as index then split and expand the remaining columns to convert to multiindex then stack the level 0 of multiindex to reshape

s = df.set_index('ID')
s.columns = s.columns.str.split('_', expand=True)
s = s.stack(0)

Alternative approach with wide_to_long

pd.wide_to_long(df, i='ID', j='Side', stubnames=['R', 'L'], sep='_', suffix='\w ').stack().unstack(-2)

Result

     sd  u
ID        
1  L  t  y
   R  z  x
2  L  t  y
   R  z  x

Another approach with wide_to_long, which might be a bit longer:

temp = df.set_index(['ID','ID2','ID3'])
temp.columns = temp.columns.str.split("_").str[::-1].str.join("_")
(pd
.wide_to_long(
    temp.reset_index(), 
    i= temp.index.names, 
    j = 'Side', 
    stubnames = ['u', 'sd'], 
    sep='_', 
    suffix='. ')
.reset_index()
)

   ID ID2 ID3 Side  u sd
0   1   F  G1    R  x  z
1   1   F  G1    L  y  t
2   2   M  G2    R  x  z
3   2   M  G2    L  y  t