I am new to Python and I am trying to use pandas to find, for every numeric column, which group name has the highest grouped average.
Using the Pokemon dataset as an example, the code below loads the data.
import pandas as pd
url = "https://raw.githubusercontent.com/UofGAnalyticsData/DPIP/main/assesment_datasets/assessment3/Pokemon.csv"
df4 = pd.read_csv(url)
The next line groups by Type 1 and returns the mean of every numeric attribute for each group.
df4.groupby("Type 1")[["Total", "HP", "Attack", "Defense", "Sp. Atk", "Sp. Def", "Speed"]].agg("mean")
I want to modify this code so that, from the resulting table, it shows the name of the "Type 1" which has the highest average Total, HP, Attack and so on.
The code below gives me the numeric maximums, but I also want to return the "Type 1" name that each maximum belongs to.
df4.groupby("Type 1")[["Total", "HP", "Attack", "Defense", "Sp. Atk", "Sp. Def", "Speed"]].agg("mean").agg("max")
How would I do this succinctly using pandas? Thanks.
CodePudding user response:
You can just add idxmax in the agg() method:
df4.groupby("Type 1")[["Total", "HP", "Attack", "Defense", "Sp. Atk", "Sp. Def", "Speed"]].agg("mean").agg(["max", "idxmax"])
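To see what this kind of result looks like without downloading the dataset, here is a minimal sketch on a small made-up table (the values in toy are invented for illustration and are not the real Pokemon numbers). It builds the same two rows by calling max() and idxmax() separately on the grouped means, which should give the same information as passing both names to agg().

```python
import pandas as pd

# Made-up toy data (NOT the real Pokemon values), only to show the shape of the result.
toy = pd.DataFrame({
    "Type 1": ["Fire", "Fire", "Water", "Water", "Grass"],
    "HP": [40, 60, 80, 100, 50],
    "Attack": [90, 110, 50, 70, 60],
})

# Mean of each numeric column per "Type 1" group; the index is the group name.
means = toy.groupby("Type 1")[["HP", "Attack"]].mean()

# One row per numeric column: the highest group mean and the group it belongs to.
summary = pd.DataFrame({
    "max": means.max(),        # highest group mean in each column
    "idxmax": means.idxmax(),  # "Type 1" label where that mean occurs
})
print(summary)
```

Note that idxmax returns the first label when two groups tie for the maximum, so a tie is not reported explicitly.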