i have the following dataframe
id a_1_1, a_1_2, a_1_3, a_1_4, b_1_1, b_1_2, b_1_3, c_1_1, c_1_2, c_1_3
1 10 20 30 40 90 80 70 Nan Nan Nan
2 33 34 35 36 nan nan nan 11 12 13
and i want my result to be as follow
id col_name 1 2 3
1 a 10 20 30
1 b 90 80 70
2 a 33 34 35
2 c 11 12 13
I am trying to use pd.melt
function, but not yielding correct result ?
CodePudding user response:
IIUC, you can reshape using an intermediate MultiIndex
after extracting the letter and last digit from the original column names:
(df.set_index('id')
.pipe(lambda d: d.set_axis(pd.MultiIndex.from_frame(
d.columns.str.extract(r'^([^_] ).*(\d )'),
names=['col_name', None]
), axis=1))
.stack('col_name')
.dropna(axis=1) # assuming you don't want columns with NaNs
.reset_index()
)
Variant using janitor
's pivot_longer
:
# pip install janitor
import janitor
(df
.pivot_longer(index='id', names_to=('col name', '.value'),
names_pattern=r'([^_] ).*(\d )')
.pipe(lambda d: d.dropna(thresh=d.shape[1]-2))
.dropna(axis=1)
)
output:
id col_name 1 2 3
0 1 a 10.0 20.0 30.0
1 1 b 90.0 80.0 70.0
2 2 a 33.0 34.0 35.0
3 2 c 11.0 12.0 13.0
CodePudding user response:
Code:
df = df1.melt(id_vars=["id"],
var_name="Col_name",
value_name="Value").dropna()
df['Num'] = df['Col_name'].apply(lambda x: x[-1])
df['Col_name'] = df['Col_name'].apply(lambda x: x[0])
df = df.pivot(index=['id','Col_name'], columns='Num', values='Value').reset_index().dropna(axis=1)
df
Output:
Num id Col_name 1 2 3
0 1 a 10.0 20.0 30.0
1 1 b 90.0 80.0 70.0
2 2 a 33.0 34.0 35.0
3 2 c 11.0 12.0 13.0