how to aggregate columns based on the value of others

Time:05-11

If I had a dataframe such as this, how would I create aggregates such as min, max and mean for each Port in each given Year?

import pandas as pd

df1 = pd.DataFrame({
    'Year': {0: 2019, 1: 2019, 2: 2019, 3: 2019, 4: 2019},
    'Port': {0: 'NORTH SHIELDS', 1: 'NORTH SHIELDS', 2: 'NORTH SHIELDS', 3: 'NORTH SHIELDS', 4: 'NORTH SHIELDS'},
    'Vessel capacity units': {0: 760.5, 1: 760.5, 2: 760.5, 3: 760.5, 4: 760.5},
    'Engine power': {0: 790.0, 1: 790.0, 2: 790.0, 3: 790.0, 4: 790.0},
    'Registered tonnage': {0: 516.0, 1: 516.0, 2: 516.0, 3: 516.0, 4: 516.0},
    'Overall length': {0: 45.0, 1: 45.0, 2: 45.0, 3: 45.0, 4: 45.0},
    'Value(£)': {0: 2675.81, 1: 62.98, 2: 9.67, 3: 527.02, 4: 2079.0},
    'Landed Weight (tonnes)': {0: 0.978, 1: 0.0135, 2: 0.001, 3: 0.3198, 4: 3.832},
})

df1

CodePudding user response:

IIUC

# Group by 'Port' and 'Year', then compute count, min, max and mean of the chosen column.
# Column names are case-sensitive, so they must match the dataframe ('Port', 'Year').
df1.groupby(['Port', 'Year'])['<WHATEVER COLUMN HERE>'].agg(['count', 'min', 'max', 'mean'])
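As a minimal sketch of the grouped aggregation above, the following runs it over two of the question's numeric columns at once. The rows are made up for illustration (a second port, BLYTH, and a second year are assumptions, not the asker's data); only the column names come from the question.

```python
import pandas as pd

# Made-up sample rows (not the asker's data), reusing the question's column names.
df1 = pd.DataFrame({
    "Year": [2019, 2019, 2019, 2020, 2020],
    "Port": ["NORTH SHIELDS", "NORTH SHIELDS", "BLYTH", "BLYTH", "BLYTH"],
    "Value(£)": [10.0, 30.0, 5.0, 2.0, 4.0],
    "Landed Weight (tonnes)": [1.0, 3.0, 0.5, 0.2, 0.4],
})

# Columns to summarise; any numeric columns could be listed here.
value_cols = ["Value(£)", "Landed Weight (tonnes)"]

# One row per (Port, Year) pair; the result has two-level column labels
# of the form (original column, statistic).
summary = df1.groupby(["Port", "Year"])[value_cols].agg(["min", "max", "mean"])
print(summary)
```

A single statistic can then be read with a row key and a column key, for example summary.loc[("NORTH SHIELDS", 2019), ("Value(£)", "mean")].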

CodePudding user response:

Without any background information this question is tricky. Do you want the aggregates for every year, or just for some given years?

Extracting the min, max, mean, etc. is quite straightforward. I assume that you have some kind of data file and have loaded a df from it:

import pandas as pd

file = 'my-data.csv'  # the data file
df = pd.read_csv(file)

VALUE_I_WANT_TO_EXTRAXT = df['Column name']

Then for each port you can extract the min/max/mean data like this.

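# Split the selected column by the values in df['Port'] and print each port's minimum.
for port, values in VALUE_I_WANT_TO_EXTRAXT.groupby(df['Port']):
    print(port, values.min())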
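If only some given years are wanted, one option is to filter the rows first and then aggregate per port and year. This is a sketch under assumptions: the rows, the years_wanted list and the output names value_min, value_max and value_mean are all hypothetical; only the column names come from the question.

```python
import pandas as pd

# Hypothetical rows standing in for the CSV contents.
df = pd.DataFrame({
    "Year": [2018, 2019, 2019, 2020],
    "Port": ["BLYTH", "BLYTH", "NORTH SHIELDS", "NORTH SHIELDS"],
    "Value(£)": [1.0, 2.0, 3.0, 4.0],
})

years_wanted = [2019, 2020]  # assumed selection of years

# Keep only the rows whose Year is in the selection.
subset = df[df["Year"].isin(years_wanted)]

# Named aggregation: each keyword is an output column,
# each tuple is (source column, statistic).
stats = subset.groupby(["Port", "Year"], as_index=False).agg(
    value_min=("Value(£)", "min"),
    value_max=("Value(£)", "max"),
    value_mean=("Value(£)", "mean"),
)
print(stats)
```

Passing as_index=False keeps Port and Year as ordinary columns, which tends to be easier to export or merge than a grouped index.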

But, as I said, without more specific knowledge about the problem it is hard to provide a solution.
