How to apply a function over a list of unique ids?

Time:08-18

I am trying to apply a function that calculates a max value over a list of ids, and to collect the results in one DataFrame using a second function. Is this the right way to do it? I am getting duplicated results.

data1

import pandas as pd

animals_age1 = pd.DataFrame({'Animal': ['Falcon', 'Falcon', 'Falcon', 'Falcon', 'Falcon'],
                             'Age': [10, 20, 30, 40, 50]})

function1 (calculates max)

def function_1(df):
    df = df[df.Age >= 0]
    return (df.groupby(['Animal'])
              .apply(lambda x: pd.Series({'Age_max': x.Age.max()}))
              .reset_index())

data2

animals_age2 = pd.DataFrame({'Animal': ['Falcon', 'Falcon','Falcon', 'Falcon', 'Falcon',
                                      'Parrot', 'Parrot','Parrot', 'Parrot', 'Parrot'],
                   'Age': [10, 20, 30, 40, 50, 10, 20, 30, 40, 60]})

function2 (calculates max for a list of unique ids)

def function_2(df):
    
    results = []
    
    for id in df['Animal'].unique():
        results.append(function_1(df))
        
    results = pd.concat(results, axis=0)
    
    return results

CodePudding user response:

Call the function on each DataFrame separately. The function already aggregates by Animal, so there is no need to loop over the unique values of the Animal column:

def function_1(df):
    return df[df.Age >= 0].groupby('Animal', as_index=False).agg(Age_max=('Age', 'max'))

df1 = function_1(animals_age1)
print(df1)
   Animal  Age_max
0  Falcon       50

df1 = function_1(animals_age2)
print(df1)
   Animal  Age_max
0  Falcon       50
1  Parrot       60
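
As for why the loop in the question returns redundant rows: the loop variable is never used, so each iteration aggregates the whole DataFrame again and the full result is appended once per unique animal. A minimal sketch that reproduces this, where `function_2_original` is just an illustrative name for the question's `function_2`:

```python
import pandas as pd

animals_age2 = pd.DataFrame({
    'Animal': ['Falcon'] * 5 + ['Parrot'] * 5,
    'Age': [10, 20, 30, 40, 50, 10, 20, 30, 40, 60],
})

def function_1(df):
    # Max age per animal, ignoring negative ages
    return df[df.Age >= 0].groupby('Animal', as_index=False).agg(Age_max=('Age', 'max'))

def function_2_original(df):
    # Same loop as in the question: animal_id is never used,
    # so the full DataFrame is aggregated once per unique animal.
    results = []
    for animal_id in df['Animal'].unique():
        results.append(function_1(df))
    return pd.concat(results, axis=0)

duplicated = function_2_original(animals_age2)
print(duplicated)  # two unique animals -> the 2-row result appears twice (4 rows)
```

With two unique animals the two-row result is concatenated twice, which is the redundancy described in the question.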

EDIT:

If you really need the second function, filter the Animal column by each unique id inside the loop, so that function_1 only sees the rows for that id:

def function_2(df):
    results = []

    # Pass only the rows for the current animal, not the whole DataFrame
    for animal_id in df['Animal'].unique():
        results.append(function_1(df[df['Animal'].eq(animal_id)]))

    return pd.concat(results, axis=0)

df2 = function_2(animals_age2)
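
To check that the filtered loop gives the same rows as calling function_1 directly, something like the sketch below should work. Passing `ignore_index=True` to `pd.concat` is my own addition here (not part of the code above), used so the combined result gets a fresh 0..n-1 index instead of repeating 0 for every piece:

```python
import pandas as pd

animals_age2 = pd.DataFrame({
    'Animal': ['Falcon'] * 5 + ['Parrot'] * 5,
    'Age': [10, 20, 30, 40, 50, 10, 20, 30, 40, 60],
})

def function_1(df):
    # Max age per animal, ignoring negative ages
    return df[df.Age >= 0].groupby('Animal', as_index=False).agg(Age_max=('Age', 'max'))

def function_2(df):
    # One call to function_1 per animal, each on that animal's rows only
    results = [
        function_1(df[df['Animal'].eq(animal_id)])
        for animal_id in df['Animal'].unique()
    ]
    # ignore_index=True is an assumption of this sketch: it renumbers the rows
    return pd.concat(results, axis=0, ignore_index=True)

looped = function_2(animals_age2)
direct = function_1(animals_age2)

print(looped)
print(direct)
```

Both frames should hold one row per animal with the same maxima, which is why the single groupby call is usually enough.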