Home > database >  Pandas. Convert table structure
Pandas. Convert table structure

Time:11-11

How can I convert this table:

iso pref_name pref_score
AFG pref_01 1
AFG pref_02 2
AFG pref_03 3
ZMB pref_01 3
ZMB pref_02 2
ZMB pref_03 1
ARM NaN NaN
ARM NaN NaN
ARM NaN NaN

to this table:

iso pref_01 pref_02 pref_03
AFG 1 2 3
ZMB 3 2 1
ARM NaN NaN NaN

and vice-versa

I've tried to pivot it (How can I pivot a dataframe?) but it wasn't sucessful.

CodePudding user response:

You cannot pivot directly as there is no information to pivot the ARM values of iso (all pref_name are NaN).

You can pivot without it (using dropna) and reindex with the unique values of iso afterwards:

(df.dropna()
   .pivot(index='iso', columns='pref_name', values='pref_score')
   .reindex(df['iso'].unique())
)

output:

pref_name  pref_01  pref_02  pref_03
iso                                 
AFG            1.0      2.0      3.0
ZMB            3.0      2.0      1.0
ARM            NaN      NaN      NaN

CodePudding user response:

I think this is a bug with pandas. I'm using version:

In [334]: pd.__version__
Out[334]: '1.2.3'

As per pivot_table docs:

dropna bool, default True. Do not include columns whose entries are all NaN.

So ideally using dropna=False should solve your issue. But it doesn't.

In [333]: pd.pivot_table(df, 'pref_score', 'iso', 'pref_name', dropna=False)
Out[333]: 
pref_name  pref_01  pref_02  pref_03
iso                                 
AFG            1.0      2.0      3.0
ZMB            3.0      2.0      1.0

It's better to open a bug with pandas.

CodePudding user response:

Stack and unstack()

df.set_index(['iso', 'pref_name']).stack().unstack(level=1)

    pref_name     pref_01  pref_02  pref_03
iso                                      
AFG pref_score      1.0      2.0      3.0
ZMB pref_score      3.0      2.0      1.0

CodePudding user response:

As a work around, you can convert your iso column to a categorical dtype first and then run pivot_table. By using a categorical dtype with the default argument of pd.pivot_table(..., observed=False) you can essentially force all values of your groupers ("iso" and/or "pref_name" in this case) to be present in the output.

df.astype({"iso": "category"}).pivot_table('pref_score', 'iso', 'pref_name', dropna=False)

pref_name  pref_01  pref_02  pref_03
iso
AFG            1.0      2.0      3.0
ARM            NaN      NaN      NaN
ZMB            3.0      2.0      1.0
  • Related