Python one pivot table for different type of aggregation-CodePudding

I have an input df like:

category	name	cost	cost2
Tier1	Tina	10	5
Tier1	Jessy	20	15
Tier1	Tina	30	10

Need output:

category	count distinct name	sum of cost	avg of cost2
Tier1	2	60	10

It is similar to SQL: select category, count(distinct name), sum(cost), avg(cost2) group by category. I am looking for the equivalent easy way to do it in python, preferably in one statement, if not two. My actual use case has 30 columns so need a way I can simply parse the code.

CodePudding user response：

Try:

x = df.groupby("category", as_index=False).agg(
    count_distinct_name=("name", "nunique"),
    sum_of_cost=("cost", "sum"),
    avg_of_cost2=("cost2", "mean"),
)
x.columns = x.columns.str.replace("_", " ")
print(x)

Prints:

  category  count distinct name  sum of cost  avg of cost2
0    Tier1                    2           60          10.0