I would like to modified my dataframe using melt and table pivot.
I have this dataframe :
df = pd.DataFrame({'Pays': {0: 'France', 1: 'France', 2: 'France'},
'Indicateur': {0: 'Internet', 1: 'Pop', 2: 'Eco'},
'1990': {0: 1, 1: 2, 2: 3},
'1995': {0: 7, 1: 8, 2: 9}})
df
I gather columns date into rows with melt :
df = pd.melt(df, id_vars =['Pays','Indicateur'], value_vars =['1990','1995'],
var_name ='Years', value_name ='valeur')
df
I spread 'Indicateur' rows into columns :
df2 = df.pivot(columns='Indicateur',values='valeur')
df2
I would like this result :
You know how to do it ?
Thank you !
CodePudding user response:
First is not necessary defined years in value_vars
in DataFrame.melt
, if need all of them:
df = df.melt(id_vars =['Pays','Indicateur'], var_name ='Years', value_name ='valeur')
Also is necessary defined index
parameter in DataFrame.pivot
, last convert MultiIndex
to columns by DataFrame.reset_index
and then remove column name
by DataFrame.rename_axis
:
df2 = (df.pivot(index=['Pays', 'Years'], columns='Indicateur',values='valeur')
.reset_index()
.rename_axis(columns=None))
print(df2)
Pays Years Eco Internet Pop
0 France 1990 3 1 2
1 France 1995 9 7 8
EDIT: If need same format like melted DataFrame:
df1 = df[['Pays','Years']].join(df.set_index('Indicateur', append=True)['valeur'].unstack())
#or using your solution
df1 = df[['Pays','Years']].join(df.pivot(columns='Indicateur',values='valeur'))
print(df1)
Pays Years Eco Internet Pop
0 France 1990 NaN 1.0 NaN
1 France 1990 NaN NaN 2.0
2 France 1990 3.0 NaN NaN
3 France 1995 NaN 7.0 NaN
4 France 1995 NaN NaN 8.0
5 France 1995 9.0 NaN NaN