Home > Net >  How to transpose or pivote a table? Selecting specific columns
How to transpose or pivote a table? Selecting specific columns

Time:12-05

beginner here! I have a dataframe similar to this:

df = pd.DataFrame({'Country_Code':['FR','FR','FR','USA','USA','USA','BR','BR','BR'],'Indicator_Name':['GPD','Pop','birth','GPD','Pop','birth','GPD','Pop','birth'],'2005':[14,34,56, 25, 67, 68, 55, 8,99], '2006':[23, 34, 34, 43,34,34, 65, 34,45]})
Index Country_Code Inndicator_Name 2005 2006
0     FR                       GPD    14  23
1     FR                       Pop    34  34
2     FR                      birth   56  34
3     USA                      GPD    25  43
4     USA                      Pop    67  34
5     USA                     birth   68  34
6     BR                       GPD    55  65
7     BR                       Pop    8   34
8     BR                      birth   99  45

I need to pivot or transpose it, keeping the Country Code, the years, and the indicators names as columns, like this:

index Country_Code year GPD Pop Birth
0     FR           2005 14  34  56
1     FR           2006 23  34  34
3     USA          2005 25  67  68
4     USA          2006 43  34  34

...

I used the transposed function like this:

df.set_index(['Indicator Name']).transpose()

The result is nice, but I have the Countries as a row like this:

Inndicator_Name GPD Pop birth   GPD Pop birth   GPD Pop birth
Country_Code    FR  FR  FR      USA USA USA     BR  BR  BR
2005            14  34  56      25  67  68      55  8   99
2006            23  34  34      43  34  34      65  34  45

I also tried to use the "pivot" and the "pivot table" function, but the result is not satisfactory. Could you please give me some advice?

CodePudding user response:

import pandas as pd
df = pd.DataFrame({'Country_Code':['FR','FR','FR','USA','USA','USA','BR','BR','BR'],'Indicator_Name':['GPD','Pop','birth','GPD','Pop','birth','GPD','Pop','birth'],'2005':[14,34,56, 25, 67, 68, 55, 8,99], '2006':[23, 34, 34, 43,34,34, 65, 34,45]})
df
#%% Pivot longer columns `'2005'` and `'2006'` to `'Year'`
df1 = df.melt(id_vars=["Country_Code", "Indicator_Name"], 
        var_name="Year", 
        value_name="Value")
#%% Pivot wider by values in `'Indicator_Name'`
df2 = (df1.pivot_table(index=['Country_Code', 'Year'], 
                      columns=['Indicator_Name'], 
                      values=['Value'], 
                      aggfunc='first'))

Output:

                  Value          
Indicator_Name      GPD Pop birth
Country_Code Year                
BR           2005    55   8    99
             2006    65  34    45
FR           2005    14  34    56
             2006    23  34    34
USA          2005    25  67    68
             2006    43  34    34

CodePudding user response:

The simplest in my opinion, you can pivot stack:

(df.pivot(index='Country_Code', columns='Indicator_Name')
   .rename_axis(columns=['year', None]).stack(0).reset_index()
)

output:

  Country_Code  year  GPD  Pop  birth
0           BR  2005   55    8     99
1           BR  2006   65   34     45
2           FR  2005   14   34     56
3           FR  2006   23   34     34
4          USA  2005   25   67     68
5          USA  2006   43   34     34
  • Related