Applying a function for multiple dataframes

Time:11-20

I am trying to apply a function to multiple data frames, which I have collected in a list. If the ranking is less than 100, the High Performance column should be assigned the value copied from the ranking column. If the ranking is between 100 and 200, the Average column should get the value copied from the ranking column. If the ranking is between 200 and 300, the Low Performance column should get the value copied from the ranking column. I do not get any error messages when I run the script, but the function does not get applied to the data frames. Any suggestions would be helpful.


for file in tests: #tests would be a list of data frame
    def func (file):
    
        if (file['ranking']) < 100:
            (file['ranking']) == (file['High Performance'])
        elif (file['ranking']) > 100 & (file['ranking'] < 200):
            (file['ranking'])== (file['Average'])
        elif (file ['ranking']) > 200& (file['ranking'] < 300):
            (file['ranking']) == (file ['Low Performance'])
        else: 
            return ''

file['High Performance'] = file.apply(func, axis=1)
file['Average'] = file.apply(func, axis=1)
file['Low Performance'] = file.apply(func, axis=1)

CodePudding user response:

You get no error because your code is syntactically correct, but watch out for the logic. I hope the code change below helps:

def func(row, low, high):
    # Return the ranking when low <= ranking < high, otherwise an empty string.
    # A bare `a == b` only compares and assigns nothing, so the value is returned instead.
    # The half-open bounds are an assumption; the question does not say where exactly 100, 200 and 300 belong.
    if low <= row['ranking'] < high:
        return row['ranking']
    return ''

for file in tests:  # tests is a list of data frames
    # The assignments sit inside the loop so that every frame is updated
    file['High Performance'] = file.apply(func, axis=1, args=(float('-inf'), 100))
    file['Average'] = file.apply(func, axis=1, args=(100, 200))
    file['Low Performance'] = file.apply(func, axis=1, args=(200, 300))
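One detail worth checking in the question's conditions is operator precedence: in Python, `&` binds more tightly than `>`, so `ranking > 100 & (ranking < 200)` is not a range check. The minimal sketch below uses plain integers and two hypothetical helper names (`wrong_band`, `right_band`) to show the difference; it is only an illustration, not part of the original code.

```python
def wrong_band(ranking):
    # Parsed as: ranking > (100 & (ranking < 200)), because & binds tighter than >
    return ranking > 100 & (ranking < 200)


def right_band(ranking):
    # Chained comparison: true only when 100 <= ranking < 200
    return 100 <= ranking < 200


rankings = [50, 150, 250]
wrong = [wrong_band(r) for r in rankings]
right = [right_band(r) for r in rankings]

print(wrong)  # every value passes the "wrong" check
print(right)  # only 150 is inside the band
```

Wrapping both comparisons in parentheses, as in `(ranking > 100) & (ranking < 200)`, also works and is the form needed when the operands are whole pandas columns rather than single values.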

CodePudding user response:

I would suggest processing the data without apply: the whole frame is passed to the function, and the entire processed column is returned.

import numpy as np
import pandas as pd

def func(file):
    # Start from an all-empty object column, then fill one ranking band at a time
    result = pd.Series('', index=file.index, dtype=object)
    mask = file['ranking'].lt(100)
    result.loc[mask] = file.loc[mask, 'High Performance']
    mask = file['ranking'].between(100, 200, inclusive='left')
    result.loc[mask] = file.loc[mask, 'Average']
    mask = file['ranking'].between(200, 300, inclusive='both')
    result.loc[mask] = file.loc[mask, 'Low Performance']
    return result


print('\nOriginal frames:\n')
lst = [] # Data preparation
for _ in range(2): # adjust
    df = pd.DataFrame(
        {'ranking': np.random.randint(0, 400, 100), 'High Performance': np.random.randint(1000, 10000, 100),
         'Average': np.random.randint(10000, 100000, 100), 'Low Performance': np.random.randint(100000, 1000000, 100)})
    lst.append(df)
    print(df.head(5))

print('\nProcessed frames:\n')
for i, file in enumerate(lst):
    lst[i]['ranking'] = func(file)
    print(lst[i].head(5))

Output (the values vary between runs, since the data is random):

Original frames:

   ranking  High Performance  Average  Low Performance
0      340              7674    53049           893702
1       58              6838    38181           653512
2      313              2383    66811           794135
3      260              3930    24911           968317
4      377              6543    80905           599571
   ranking  High Performance  Average  Low Performance
0      223              6044    77461           237517
1      250              6128    24633           112060
2      396              3701    26695           767052
3      261              9031    64877           415611
4      313              1298    52726           782069

Processed frames:

  ranking  High Performance  Average  Low Performance
0                      7674    53049           893702
1    6838              6838    38181           653512
2                      2383    66811           794135
3  968317              3930    24911           968317
4                      6543    80905           599571
  ranking  High Performance  Average  Low Performance
0  237517              6044    77461           237517
1  112060              6128    24633           112060
2                      3701    26695           767052
3  415611              9031    64877           415611
4                      1298    52726           782069
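The same band-by-band selection could also be sketched with `numpy.select`, which picks from a list of choices according to a list of conditions. This is an alternative under stated assumptions, not the method used above: the helper name `pick_by_band` is made up, `-1` is an arbitrary marker for "no band" (chosen so the column stays numeric), the first four rows are taken from the first frame printed above, and the last row is invented to exercise the Average band.

```python
import numpy as np
import pandas as pd


def pick_by_band(frame):
    r = frame['ranking']
    # One boolean array per band, in the same order as the choices below
    conditions = [
        r.lt(100).to_numpy(),
        r.between(100, 200, inclusive='left').to_numpy(),
        r.between(200, 300, inclusive='both').to_numpy(),
    ]
    choices = [
        frame['High Performance'].to_numpy(),
        frame['Average'].to_numpy(),
        frame['Low Performance'].to_numpy(),
    ]
    # Rows that match no band get the hypothetical marker -1
    picked = np.select(conditions, choices, default=-1)
    return pd.Series(picked, index=frame.index)


demo = pd.DataFrame({
    'ranking': [340, 58, 313, 260, 150],
    'High Performance': [7674, 6838, 2383, 3930, 1111],
    'Average': [53049, 38181, 66811, 24911, 22222],
    'Low Performance': [893702, 653512, 794135, 968317, 333333],
})
demo['ranking'] = pick_by_band(demo)
print(demo)
```

The first matching condition wins, so the order of the lists matters if bands ever overlap.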

CodePudding user response:

I think you just needed to indent your code correctly.

And I would use a map.
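One possible reading of "use a map" is Python's built-in `map`, which applies one function to every frame in the list. The sketch below follows the question's description (copy the ranking into the column for its band); the helper name `add_band_columns`, the half-open band boundaries, the two tiny frames, and leaving non-matching rows as NaN are all assumptions made for illustration.

```python
import pandas as pd


def add_band_columns(frame):
    # Work on a copy so the input frame is left untouched
    out = frame.copy()
    r = out['ranking']
    # Series.where keeps the value where the condition holds and puts NaN elsewhere
    out['High Performance'] = r.where(r < 100)
    out['Average'] = r.where((r >= 100) & (r < 200))
    out['Low Performance'] = r.where((r >= 200) & (r < 300))
    return out


# Hypothetical stand-in for the list of data frames from the question
tests = [
    pd.DataFrame({'ranking': [50, 150]}),
    pd.DataFrame({'ranking': [250, 350]}),
]

# map applies the function to each frame; list() collects the processed frames
tests = list(map(add_band_columns, tests))

for frame in tests:
    print(frame)
```

Because the function returns a new frame, the result of `map` has to be stored; the original list is not modified in place.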
