How to calculate the minimum and maximum frequency from a series with pandas

Time:10-21

I am working on a method to determine the minimum and maximum frequencies for a dataset. The method value_counts() returns the distinct values and their frequencies. I reviewed the documentation listed here, but it did not solve my problem. My goal is to:

  1. Determine the maximum value in the set of distinct values.
  2. Determine the frequency associated with the maximum value from the dataset.
  3. Determine the minimum value in the set of distinct values.
  4. Determine the frequency associated with the minimum value from the dataset.

For example,

Sample input data

A1,A2,A3,Class
2,0.4631338,1.5,3
8,0.7460648,3.0,3
6,0.264391038,2.5,2
5,0.4406713,2.3,1
2,0.410438159,1.5,3
2,0.302901816,1.5,2
6,0.275869396,2.5,3
8,0.084782428,3.0,3
2,0.53226533,1.5,2
8,0.070034818,2.9,1
2,0.668631847,1.5,2

Output of value_counts() for column A1 of the full dataset

2    42
8    24
5    20
6    10
7     2
4     1
3     1

maxValue = 8, maxF = 24
minValue = 2, minF = 42

Expected: maxf returns the maximum frequency for the dataset, and minf returns the minimum frequency for the dataset.

Actual: I'm hung up on extracting the frequencies from the value_counts() result.

I've written a program to process the dataset:

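import pandas as pd

def main():
    s = pd.read_csv('A1-dm.csv')
    print("******************************************************")
    print("Entropy Discretization                         STARTED")
    s = entropy_discretization(s)
    print("Entropy Discretization                         COMPLETED")

def entropy_discretization(s):

    I = {}
    i = 0
    n = s.nunique()['A1']
    print("******************")
    print("calculating maxf")
    maxf(s['A1'])
    print("******************")
    return s  # return the frame so main() does not rebind s to None

def maxf(s):
    # So far this only prints the counts; it should return the maximum frequency
    print(s.value_counts())


def minf(s):
    # So far this only prints the counts; it should return the minimum frequency
    print(s.value_counts())

if __name__ == "__main__":
    main()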

Any help with this would be greatly appreciated.

CodePudding user response:

Use Series.idxmax and Series.idxmin; if you need the output as a Series, use Series.agg:

s = df['Class'].value_counts()
print(s)
3    5
2    4
1    2
Name: Class, dtype: int64

print(s.agg(['max','idxmax','min','idxmin']))
max       5
idxmax    3
min       2
idxmin    1
Name: Class, dtype: int64

Separately:

print(s.max(), s.idxmax(), s.min(), s.idxmin())
5 3 2 1
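
The calls above return the most and least frequent values. The question's example (maxValue = 8, maxF = 24) seems to ask for something slightly different: the largest and smallest distinct values, together with their frequencies. A minimal sketch covering both readings follows. It assumes the sample rows from the question rather than the full A1-dm.csv, so the A1 frequencies differ from the 24 and 42 shown there. The return values chosen for maxf and minf are one interpretation of the question, and most_frequent, least_frequent and CSV are hypothetical names introduced only for illustration.

```python
import io

import pandas as pd

# Sample rows copied from the question; the full A1-dm.csv is not shown there.
CSV = """A1,A2,A3,Class
2,0.4631338,1.5,3
8,0.7460648,3.0,3
6,0.264391038,2.5,2
5,0.4406713,2.3,1
2,0.410438159,1.5,3
2,0.302901816,1.5,2
6,0.275869396,2.5,3
8,0.084782428,3.0,3
2,0.53226533,1.5,2
8,0.070034818,2.9,1
2,0.668631847,1.5,2
"""

df = pd.read_csv(io.StringIO(CSV))


def maxf(s):
    """Return (largest distinct value, its frequency)."""
    counts = s.value_counts()
    value = counts.index.max()          # largest distinct value
    return value, int(counts.loc[value])


def minf(s):
    """Return (smallest distinct value, its frequency)."""
    counts = s.value_counts()
    value = counts.index.min()          # smallest distinct value
    return value, int(counts.loc[value])


def most_frequent(s):
    """Return (most frequent value, its frequency), as in the answer above."""
    counts = s.value_counts()
    return counts.idxmax(), int(counts.max())


def least_frequent(s):
    """Return (least frequent value, its frequency), as in the answer above."""
    counts = s.value_counts()
    return counts.idxmin(), int(counts.min())


max_value, max_f = maxf(df["A1"])
min_value, min_f = minf(df["A1"])
print(f"maxValue = {max_value}, maxF = {max_f}")   # maxValue = 8, maxF = 3
print(f"minValue = {min_value}, minF = {min_f}")   # minValue = 2, minF = 5
```

Because value_counts() sorts by frequency rather than by value, the sketch looks up the largest and smallest distinct values through counts.index.max() and counts.index.min() instead of relying on the first or last row.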