Home > front end >  python dataframe melt with compound index
python dataframe melt with compound index

Time:05-13

I have a dataframe like:

index sample_1 sample_2
A/1 23 33
A/2 24 34
A/3 25 35

I want to convert to:

index letter number value sample
1 A 1 23 sample_1
2 A 1 33 sample_2
3 A 2 24 sample_1
4 A 2 34 sample_2
5 A 3 25 sample_1
6 A 3 35 sample_2

I have tried to use melt function, but the compounded index columns are loss.


CodePudding user response:

You can do that with str.split the column and then df.melt.

df[['letter','index']] = df['index'].str.split('/', expand=True)
res = (df.melt(id_vars=['letter','index'],var_name='sample')
       .sort_values(by='index')
       .rename(columns={ 'index' : 'number'})
      .reset_index(drop=True)
      )
print(res)

  letter number    sample  value
0      A      1  sample_1     23
1      A      1  sample_2     33
2      A      2  sample_1     24
3      A      2  sample_2     34
4      A      3  sample_1     25
5      A      3  sample_2     35

CodePudding user response:

using melt :

df = df.melt(value_vars=['sample_1','sample_2'], id_vars='index',var_name='sample')
df[['letter','number']] = df['index'].str.split('/',expand=True)

output:

>>>
  index    sample  value letter number
0   A/1  sample_1     23      A      1
1   A/2  sample_1     24      A      2
2   A/3  sample_1     25      A      3
3   A/1  sample_2     33      A      1
4   A/2  sample_2     34      A      2
5   A/3  sample_2     35      A      3

CodePudding user response:

You can try pd.wide_to_long then split and rename columns

df_ = pd.wide_to_long(df, ["sample_"], i="index", j="sample").reset_index()
df_[['index', 'number']] = df_['index'].str.split('/', expand=True)
df_ = df_.rename(columns={'index': 'letter', 'sample_': 'value'})
df_ = df_.assign(sample='sample_'   df_['sample'].astype(str), index=df_.index 1)
print(df_)

  letter    sample  value number  index
0      A  sample_1     23      1      1
1      A  sample_1     24      2      2
2      A  sample_1     25      3      3
3      A  sample_2     33      1      4
4      A  sample_2     34      2      5
5      A  sample_2     35      3      6

For a more nested version, you can use

df[['index', 'number']] = df['index'].str.split('/', expand=True)
df = (pd.wide_to_long(df, ["sample_"], i=["index", 'number'], j="sample").reset_index()
      .rename(columns={'index': 'letter', 'sample_': 'value'})
      .pipe(lambda df: df.assign(sample='sample_'   df['sample'].astype(str), index=df.index 1))
      )
  • Related