I have a table like this:
Hotel Earning
Abu 1000
Zain 400
Show 500
Zint 300
Abu 500
Zain 700
Abu 500
Abu 500
Abu 800
Abu 1600
Show 1300
Zint 600
Using pandas, how can I group by hotel and calculate the min, median, and max of the earnings for each hotel, and then print the aggregated values for the hotel named "Abu"?
Expected output:
[500.0, 650.0, 1600.0]
CodePudding user response:
The pandas DataFrame aggregate() method applies a function, or a list of function names, along one axis of the DataFrame. The default is axis 0, the index (row) axis. Note that agg() is an alias of aggregate(). Called on a grouped object, it computes the statistics per group.
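As a minimal sketch of what that means for the table in the question, the snippet below builds the data inline (the variable names df and summary are only illustrative) and applies a list of function names down the rows of the Earning column, without any grouping yet:

```python
import pandas as pd

# Data copied from the question; built inline so the sketch is self-contained.
df = pd.DataFrame({
    "Hotel": ["Abu", "Zain", "Show", "Zint", "Abu", "Zain",
              "Abu", "Abu", "Abu", "Abu", "Show", "Zint"],
    "Earning": [1000, 400, 500, 300, 500, 700,
                500, 500, 800, 1600, 1300, 600],
})

# agg() with a list of names runs each function along axis 0 (down the rows).
# The result has one row per function and one column per input column.
summary = df[["Earning"]].agg(["min", "median", "max"])
print(summary)
```

This gives the statistics over all hotels together; grouping by Hotel first is what splits them per hotel.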
CodePudding user response:
import pandas as pd
# Read the data into a Pandas DataFrame
df = pd.read_csv('hotel_earnings.csv')
# Group the data by hotel
hotels = df.groupby('Hotel')
# Calculate the min, median, and max of the earning for each hotel
earnings = hotels['Earning'].agg(['min', 'median', 'max'])
# Print the aggregated values for the hotel named "Abu"
print(earnings.loc['Abu'])
This code reads the data from the hotel_earnings.csv file into a pandas DataFrame, groups the data by hotel, and calculates the minimum, median, and maximum earning for each hotel. It then prints the aggregated values for the hotel named "Abu".
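Note that printing earnings.loc['Abu'] shows a labelled Series rather than the plain list from the question. The sketch below is one way to get the list form. It assumes the data is built inline instead of read from hotel_earnings.csv (a file the question does not provide), and abu_stats is just an illustrative name:

```python
import pandas as pd

# Data copied from the question, used in place of the CSV file.
df = pd.DataFrame({
    "Hotel": ["Abu", "Zain", "Show", "Zint", "Abu", "Zain",
              "Abu", "Abu", "Abu", "Abu", "Show", "Zint"],
    "Earning": [1000, 400, 500, 300, 500, 700,
                500, 500, 800, 1600, 1300, 600],
})

# One row per hotel, one column per statistic.
earnings = df.groupby("Hotel")["Earning"].agg(["min", "median", "max"])

# Selecting the "Abu" row gives a Series; tolist() turns it into a plain list.
abu_stats = earnings.loc["Abu"].tolist()
print(abu_stats)  # [500.0, 650.0, 1600.0]
```

The values come out as floats because the row mixes integer columns (min, max) with a float column (median), so pandas upcasts the whole row.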
CodePudding user response:
Use agg:
df.groupby('Hotel').agg([('Min', 'min'), ('Max', 'max'), ('Median', 'median')])
# Out:
#       Earning
#           Min   Max Median
# Hotel
# Abu       500  1600  650.0
# Show      500  1300  900.0
# Zain      400   700  550.0
# Zint      300   600  450.0
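The list-of-tuples form produces a two-level column header (Earning on top, then Min/Max/Median). If flat column names are preferred, a possible alternative is keyword-style named aggregation on the selected column. This is a sketch reusing the same labels, with the data built inline and stats as an illustrative name:

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "Hotel": ["Abu", "Zain", "Show", "Zint", "Abu", "Zain",
              "Abu", "Abu", "Abu", "Abu", "Show", "Zint"],
    "Earning": [1000, 400, 500, 300, 500, 700,
                500, 500, 800, 1600, 1300, 600],
})

# Named aggregation: each keyword becomes an output column name,
# and its value is the function applied to the Earning column.
stats = df.groupby("Hotel")["Earning"].agg(Min="min", Max="max", Median="median")
print(stats)

# A single hotel's row can then be read by label.
print(stats.loc["Abu", ["Min", "Median", "Max"]].tolist())
```

Selecting the column before calling agg keeps the result to a single header row, which makes lookups like stats.loc["Abu", "Median"] straightforward.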
For more statistics, you can also use describe():
df.groupby('Hotel').describe()
# Out:
#       Earning
#         count        mean         std    min    25%    50%     75%     max
# Hotel
# Abu       6.0  816.666667  435.507367  500.0  500.0  650.0   950.0  1600.0
# Show      2.0  900.000000  565.685425  500.0  700.0  900.0  1100.0  1300.0
# Zain      2.0  550.000000  212.132034  400.0  475.0  550.0   625.0   700.0
# Zint      2.0  450.000000  212.132034  300.0  375.0  450.0   525.0   600.0