How to remove additional index when using .mean(), .median(), .mode() in python on a pandas dataframe

Time:04-22

I am calculating the mode/median/mean of pandas df columns using .mean(), .median(), .mode() but when doing so an index appears in some of the results:

def largeStats(dataframe):
    dataframe.drop(dataframe.index[dataframe['large_airport'] != 'Y'], inplace=True)
    mean = dataframe['frequency_mhz'].mean()
    mode = dataframe['frequency_mhz'].mode()
    median = dataframe['frequency_mhz'].median()

    print("The mean freq of large airports is", mean)
    print("The most common freq of large airports is", mode)
    print("The middle freq of large airports is", median)

print(largeStats(df))

returns:

The mean freq of large airports is 120.00752293577986
The most common freq of large airports is 0    121.75
1    122.10
dtype: float64
The middle freq of large airports is 121.85
None

I want it to simply return the number for each:

The mean freq of large airports is 120.00752293577986

The most common freq of large airports is 121.75 & 122.10

The middle freq of large airports is 121.85

I know the indexing is in place due to 2 mode values but how would I remove that indexing?

CodePudding user response:

This would fix it:

mode = dataframe['frequency_mhz'].mode().values[0]

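The mode() method returns a pandas Series, because a column can have more than one mode. Indexing with .values[0] extracts the first item of that Series as a plain number. Note that if there are several modes, only the first one is kept.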
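To make the difference concrete, here is a minimal sketch, assuming a small made-up Series (the values are hypothetical, chosen so that two frequencies tie for most common). It shows that mode() returns a Series, while .values[0] picks out a single number.

```python
import pandas as pd

# Hypothetical sample data: 121.75 and 122.10 each appear twice.
freqs = pd.Series([121.75, 121.75, 122.10, 122.10, 118.00], name="frequency_mhz")

modes = freqs.mode()          # a Series holding every mode, sorted ascending
first_mode = modes.values[0]  # the first mode only, as a NumPy float

print(type(modes).__name__)   # Series
print(len(modes))             # 2
print(first_mode)             # 121.75
```

With a tie like this one, the second mode (122.10) is silently discarded, which may or may not be what you want.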

CodePudding user response:

You can turn a pandas Series into a NumPy array using the .values property:

mode = dataframe['frequency_mhz'].mode().values

This gives an array containing all the mode values, which prints without the index and dtype lines.
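If the goal is the exact "121.75 & 122.10" style of output from the question, one possible approach is to join the mode values into a string before printing. This is only a sketch: the sample data and the two-decimal formatting are assumptions, not part of the original code.

```python
import pandas as pd

# Hypothetical sample data with two tied modes.
freqs = pd.Series([121.75, 121.75, 122.10, 122.10, 118.00], name="frequency_mhz")

modes = freqs.mode()

# Format each mode with two decimals (assumed) and join them with " & ".
mode_text = " & ".join(f"{value:.2f}" for value in modes)

print("The most common freq of large airports is", mode_text)
# The most common freq of large airports is 121.75 & 122.10
```

This works the same way whether there is one mode or several, so no special case is needed for ties.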

CodePudding user response:

Because Series.mode can return one or more values, you need to select the first value to get a scalar:

The mode is the value that appears most often. There can be multiple modes.

print("The most common freq of large airports is", mode.iat[0])
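As a side note, the trailing None in the question's output comes from print(largeStats(df)): the function prints its results but returns nothing, so the outer print shows None. A minimal sketch of an alternative follows, assuming the same column names as the question. The function name large_stats and the sample rows are hypothetical. It returns the values instead of printing them and filters with a boolean mask rather than dropping rows in place.

```python
import pandas as pd


def large_stats(dataframe):
    """Return (mean, list of modes, median) of frequency_mhz for large airports."""
    # Boolean filtering leaves the caller's DataFrame unchanged,
    # unlike drop(..., inplace=True).
    freqs = dataframe.loc[dataframe["large_airport"] == "Y", "frequency_mhz"]
    return freqs.mean(), freqs.mode().tolist(), freqs.median()


# Hypothetical sample rows for illustration only.
df = pd.DataFrame({
    "large_airport": ["Y", "Y", "Y", "Y", "N"],
    "frequency_mhz": [121.75, 121.75, 122.10, 122.10, 118.00],
})

mean, modes, median = large_stats(df)
print("The mean freq of large airports is", mean)
print("The most common freq of large airports is", modes)
print("The middle freq of large airports is", median)
```

Calling the function without wrapping it in print() avoids the stray None, and df still contains all of its original rows afterwards.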