beginner here! I have a dataframe similar to this:
df = pd.DataFrame({'Country_Code':['FR','FR','FR','USA','USA','USA','BR','BR','BR'],'Indicator_Name':['GPD','Pop','birth','GPD','Pop','birth','GPD','Pop','birth'],'2005':[14,34,56, 25, 67, 68, 55, 8,99], '2006':[23, 34, 34, 43,34,34, 65, 34,45]})
Index Country_Code Inndicator_Name 2005 2006 0 FR GPD 14 23 1 FR Pop 34 34 2 FR birth 56 34 3 USA GPD 25 43 4 USA Pop 67 34 5 USA birth 68 34 6 BR GPD 55 65 7 BR Pop 8 34 8 BR birth 99 45
I need to pivot or transpose it, keeping the Country Code, the years, and the indicators names as columns, like this:
index Country_Code year GPD Pop Birth
0 FR 2005 14 34 56
1 FR 2006 23 34 34
3 USA 2005 25 67 68
4 USA 2006 43 34 34
...
I used the transposed function like this:
df.set_index(['Indicator Name']).transpose()
The result is nice, but I have the Countries as a row like this:
Inndicator_Name GPD Pop birth GPD Pop birth GPD Pop birth
Country_Code FR FR FR USA USA USA BR BR BR
2005 14 34 56 25 67 68 55 8 99
2006 23 34 34 43 34 34 65 34 45
I also tried to use the "pivot" and the "pivot table" function, but the result is not satisfactory. Could you please give me some advice?
CodePudding user response:
import pandas as pd
df = pd.DataFrame({'Country_Code':['FR','FR','FR','USA','USA','USA','BR','BR','BR'],'Indicator_Name':['GPD','Pop','birth','GPD','Pop','birth','GPD','Pop','birth'],'2005':[14,34,56, 25, 67, 68, 55, 8,99], '2006':[23, 34, 34, 43,34,34, 65, 34,45]})
df
#%% Pivot longer columns `'2005'` and `'2006'` to `'Year'`
df1 = df.melt(id_vars=["Country_Code", "Indicator_Name"],
var_name="Year",
value_name="Value")
#%% Pivot wider by values in `'Indicator_Name'`
df2 = (df1.pivot_table(index=['Country_Code', 'Year'],
columns=['Indicator_Name'],
values=['Value'],
aggfunc='first'))
Output:
Value
Indicator_Name GPD Pop birth
Country_Code Year
BR 2005 55 8 99
2006 65 34 45
FR 2005 14 34 56
2006 23 34 34
USA 2005 25 67 68
2006 43 34 34
CodePudding user response:
The simplest in my opinion, you can pivot
stack
:
(df.pivot(index='Country_Code', columns='Indicator_Name')
.rename_axis(columns=['year', None]).stack(0).reset_index()
)
output:
Country_Code year GPD Pop birth
0 BR 2005 55 8 99
1 BR 2006 65 34 45
2 FR 2005 14 34 56
3 FR 2006 23 34 34
4 USA 2005 25 67 68
5 USA 2006 43 34 34