I have a dataframe like:
index | sample_1 | sample_2 |
---|---|---|
A/1 | 23 | 33 |
A/2 | 24 | 34 |
A/3 | 25 | 35 |
I want to convert to:
index | letter | number | value | sample |
---|---|---|---|---|
1 | A | 1 | 23 | sample_1 |
2 | A | 1 | 33 | sample_2 |
3 | A | 2 | 24 | sample_1 |
4 | A | 2 | 34 | sample_2 |
5 | A | 3 | 25 | sample_1 |
6 | A | 3 | 35 | sample_2 |
I have tried to use melt function, but the compounded index columns are loss.
CodePudding user response:
You can do that with str.split
the column and then df.melt
.
df[['letter','index']] = df['index'].str.split('/', expand=True)
res = (df.melt(id_vars=['letter','index'],var_name='sample')
.sort_values(by='index')
.rename(columns={ 'index' : 'number'})
.reset_index(drop=True)
)
print(res)
letter number sample value
0 A 1 sample_1 23
1 A 1 sample_2 33
2 A 2 sample_1 24
3 A 2 sample_2 34
4 A 3 sample_1 25
5 A 3 sample_2 35
CodePudding user response:
using melt :
df = df.melt(value_vars=['sample_1','sample_2'], id_vars='index',var_name='sample')
df[['letter','number']] = df['index'].str.split('/',expand=True)
output:
>>>
index sample value letter number
0 A/1 sample_1 23 A 1
1 A/2 sample_1 24 A 2
2 A/3 sample_1 25 A 3
3 A/1 sample_2 33 A 1
4 A/2 sample_2 34 A 2
5 A/3 sample_2 35 A 3
CodePudding user response:
You can try pd.wide_to_long
then split and rename columns
df_ = pd.wide_to_long(df, ["sample_"], i="index", j="sample").reset_index()
df_[['index', 'number']] = df_['index'].str.split('/', expand=True)
df_ = df_.rename(columns={'index': 'letter', 'sample_': 'value'})
df_ = df_.assign(sample='sample_' df_['sample'].astype(str), index=df_.index 1)
print(df_)
letter sample value number index
0 A sample_1 23 1 1
1 A sample_1 24 2 2
2 A sample_1 25 3 3
3 A sample_2 33 1 4
4 A sample_2 34 2 5
5 A sample_2 35 3 6
For a more nested version, you can use
df[['index', 'number']] = df['index'].str.split('/', expand=True)
df = (pd.wide_to_long(df, ["sample_"], i=["index", 'number'], j="sample").reset_index()
.rename(columns={'index': 'letter', 'sample_': 'value'})
.pipe(lambda df: df.assign(sample='sample_' df['sample'].astype(str), index=df.index 1))
)