How to transpose or pivote a table? Selecting specific columns-CodePudding

beginner here! I have a dataframe similar to this:

df = pd.DataFrame({'Country_Code':['FR','FR','FR','USA','USA','USA','BR','BR','BR'],'Indicator_Name':['GPD','Pop','birth','GPD','Pop','birth','GPD','Pop','birth'],'2005':[14,34,56, 25, 67, 68, 55, 8,99], '2006':[23, 34, 34, 43,34,34, 65, 34,45]})

Index Country_Code Inndicator_Name 2005 2006
0     FR                       GPD    14  23
1     FR                       Pop    34  34
2     FR                      birth   56  34
3     USA                      GPD    25  43
4     USA                      Pop    67  34
5     USA                     birth   68  34
6     BR                       GPD    55  65
7     BR                       Pop    8   34
8     BR                      birth   99  45

I need to pivot or transpose it, keeping the Country Code, the years, and the indicators names as columns, like this:

index Country_Code year GPD Pop Birth
0     FR           2005 14  34  56
1     FR           2006 23  34  34
3     USA          2005 25  67  68
4     USA          2006 43  34  34

...

I used the transposed function like this:

df.set_index(['Indicator Name']).transpose()

The result is nice, but I have the Countries as a row like this:

Inndicator_Name GPD Pop birth   GPD Pop birth   GPD Pop birth
Country_Code    FR  FR  FR      USA USA USA     BR  BR  BR
2005            14  34  56      25  67  68      55  8   99
2006            23  34  34      43  34  34      65  34  45

I also tried to use the "pivot" and the "pivot table" function, but the result is not satisfactory. Could you please give me some advice?

CodePudding user response：

import pandas as pd
df = pd.DataFrame({'Country_Code':['FR','FR','FR','USA','USA','USA','BR','BR','BR'],'Indicator_Name':['GPD','Pop','birth','GPD','Pop','birth','GPD','Pop','birth'],'2005':[14,34,56, 25, 67, 68, 55, 8,99], '2006':[23, 34, 34, 43,34,34, 65, 34,45]})
df
#%% Pivot longer columns `'2005'` and `'2006'` to `'Year'`
df1 = df.melt(id_vars=["Country_Code", "Indicator_Name"], 
        var_name="Year", 
        value_name="Value")
#%% Pivot wider by values in `'Indicator_Name'`
df2 = (df1.pivot_table(index=['Country_Code', 'Year'], 
                      columns=['Indicator_Name'], 
                      values=['Value'], 
                      aggfunc='first'))

Output:

                  Value          
Indicator_Name      GPD Pop birth
Country_Code Year                
BR           2005    55   8    99
             2006    65  34    45
FR           2005    14  34    56
             2006    23  34    34
USA          2005    25  67    68
             2006    43  34    34

CodePudding user response：

The simplest in my opinion, you can pivot stack:

(df.pivot(index='Country_Code', columns='Indicator_Name')
   .rename_axis(columns=['year', None]).stack(0).reset_index()
)

output:

  Country_Code  year  GPD  Pop  birth
0           BR  2005   55    8     99
1           BR  2006   65   34     45
2           FR  2005   14   34     56
3           FR  2006   23   34     34
4          USA  2005   25   67     68
5          USA  2006   43   34     34