Home > Software engineering >  pandas groupby() and pivot_table() .style.format() from float to currency for specific column
pandas groupby() and pivot_table() .style.format() from float to currency for specific column

Time:05-05

I've got a Pandas DataFrame looking like this:

                           payment_method   amount  status
datetime_local          
2022-04-27 00:05:39 02:00   RatePay Invoice  40.0   failed
2022-04-27 00:07:22 02:00   RatePay Invoice 104.0   failed
2022-04-27 00:07:38 02:00   RatePay Invoice 778.0   failed
2022-04-27 00:12:51 02:00   RatePay Invoice 356.0   failed
2022-04-27 00:15:58 02:00   PayPal          136.0   open

df.groupby(['payment_method']).agg({'amount':['sum']}).style.format("{:,.2f}€")

yields this result:


                   amount
                   sum
payment_method   
Credit Card       3,995.00€
PayPal           22,704.86€
RatePay Invoice  35,064.77€

now doing the same groupby but with an added aggregate like count results in:

df.groupby(['payment_method']).agg({'amount':['sum','count']}).style.format("{:,.2f}€")

                                   amount
                                sum count
payment_method       
Credit Card              3,995.00€   16.00€
PayPal                  22,704.86€  115.00€
RatePay Invoice         35,064.77€  143.00€

What I would like to have is a result like this:

                                   amount
                                sum count
payment_method       
Credit Card              3,995.00€   16
PayPal                  22,704.86€  115
RatePay Invoice         35,064.77€  143

I tried variations of these:

pd.pivot_table(
    df,
    index=['payment_method'],
    values=['amount'],
    aggfunc=['sum', 'count'],
    dropna=False,
    fill_value=0, margins=True).style.format({"amount.sum":"{:,.2f}€"})

df.groupby(['payment_method']).agg({'amount':['sum','count']}).style.format({"sum":"{:,.2f}€"})

but they only reformatted the sum column

                           amount
                        sum count
payment_method       
Credit Card      3995.000000     16
PayPal          22704.860000    115
RatePay Invoice 35064.770000    143

CodePudding user response:

What you have is a column MultiIndex. To access a single column in a MultiIndex you can use a tuple. Here I formatted both the sum and count column separately to highlight this idea.

(
    df.groupby(['payment_method'])
    .agg({'amount':['sum', 'count']})
    .style.format({
        ('amount', 'sum'): "{:,.2f}€",
        ('amount', 'count'): "{:03d}"
    })
)

enter image description here

Alternatively, you can avoid a MultiIndex by performing a single column selection on your groupby and passing your agg methods as a list (instead of a dictionary). Then you can apply formats on each column without a tuple:

(
    df.groupby(['payment_method'])
    ['amount'].agg(['sum', 'count'])
    .style.format({
        'sum': "{:,.2f}€",
        'count': "{:03d}"
    })
)

enter image description here

CodePudding user response:

You can try reset_index(), so the count column is acessible

df.groupby(['payment_method']).agg({'amount':['sum','count']}).reset_index().style.format({"count":"{:,.2f}€"})

If you insist, on having the amount as a column index.

df.groupby(['payment_method']).agg({'amount':['sum','count']}).stack().style.format({"count":"{:,.2f}€"})

After that, unstack the resulting df

  • Related