I am trying to apply a function to multiple data frames, which I have collected in a list. If the ranking is less than 100, the High Performance column should be assigned the value copied from the ranking column; if the ranking is between 100 and 200, the Average column should get the value copied from the ranking column; and if the ranking is between 200 and 300, the Low Performance column should get it. I do not get any error messages when I run the script, but the function does not get applied to the data frames. Any suggestions would be helpful.
for file in tests:  # tests would be a list of data frames
    def func(file):
        if (file['ranking']) < 100:
            (file['ranking']) == (file['High Performance'])
        elif (file['ranking']) > 100 & (file['ranking'] < 200):
            (file['ranking']) == (file['Average'])
        elif (file['ranking']) > 200 & (file['ranking'] < 300):
            (file['ranking']) == (file['Low Performance'])
        else:
            return ''
    file['High Performance'] = file.apply(func, axis=1)
    file['Average'] = file.apply(func, axis=1)
    file['Low Performance'] = file.apply(func, axis=1)
CodePudding user response:
You get no error because your code is syntactically correct, but watch out for the logic. I hope the code change below helps:
def func(row, low, high):
    # Return the ranking when it falls in [low, high), otherwise an empty string.
    # (The original branches used ==, which only compares and never returns a value,
    # and `x > 100 & (x < 200)` is parsed as `x > (100 & (x < 200))`.)
    if low <= row['ranking'] < high:
        return row['ranking']
    return ''

for file in tests:  # tests would be a list of data frames
    # Define func once, outside the loop, and pass each band's bounds via args.
    file['High Performance'] = file.apply(func, axis=1, args=(float('-inf'), 100))
    file['Average'] = file.apply(func, axis=1, args=(100, 200))
    file['Low Performance'] = file.apply(func, axis=1, args=(200, 300))
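To spell out why the code in the question runs without changing anything, here is a minimal sketch. It assumes integer rankings and half-open bands (100 <= ranking < 200, and so on), which the question does not state exactly; add_band_columns and the sample frame are hypothetical names made up for illustration.

```python
import pandas as pd

# Pitfall: & binds tighter than >, so the line below is parsed as
# 250 > (100 & False), i.e. 250 > 0, which is True although 250 is not below 200.
ranking_value = 250
buggy = ranking_value > 100 & (ranking_value < 200)
fixed = 100 <= ranking_value < 200  # chained comparison does what was intended


def add_band_columns(df):
    """Hypothetical helper: copy 'ranking' into the column for its band."""
    out = df.copy()
    # Work on an object column so numbers and '' can sit side by side.
    ranking = df['ranking']
    as_object = ranking.astype(object)
    out['High Performance'] = as_object.where(ranking < 100, '')
    out['Average'] = as_object.where((ranking >= 100) & (ranking < 200), '')
    out['Low Performance'] = as_object.where((ranking >= 200) & (ranking < 300), '')
    return out


# Assumed sample data standing in for the list of data frames in the question.
tests = [pd.DataFrame({'ranking': [58, 150, 260, 340]})]
tests = [add_band_columns(df) for df in tests]
print(buggy, fixed)
print(tests[0])
```

Note that with whole columns the comparisons need explicit parentheses around each side of &, whereas inside a row-wise function a chained comparison or `and` is the safer choice.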
CodePudding user response:
I would suggest considering an approach without apply. The whole frame is passed to the function, and the entire processed column is returned:
import numpy as np
import pandas as pd

def func(file):
    # Start from an all-empty object column so rows outside every band stay ''.
    result = pd.Series('', index=file.index, dtype=object)
    ranking = file['ranking']
    mask = ranking.lt(100)  # ranking < 100
    result.loc[mask] = file.loc[mask, 'High Performance']
    mask = ranking.between(100, 200, inclusive='left')  # 100 <= ranking < 200
    result.loc[mask] = file.loc[mask, 'Average']
    mask = ranking.between(200, 300, inclusive='both')  # 200 <= ranking <= 300
    result.loc[mask] = file.loc[mask, 'Low Performance']
    return result

print('\nOriginal frames:\n')
lst = []  # Data preparation
for _ in range(2):  # adjust the number of frames as needed
    df = pd.DataFrame(
        {'ranking': np.random.randint(0, 400, 100), 'High Performance': np.random.randint(1000, 10000, 100),
         'Average': np.random.randint(10000, 100000, 100), 'Low Performance': np.random.randint(100000, 1000000, 100)})
    lst.append(df)
    print(df.head(5))

print('\nProcessed frames:\n')
for i, file in enumerate(lst):
    # Overwrite 'ranking' with the value picked from the matching band column.
    lst[i]['ranking'] = func(file)
    print(lst[i].head(5))
Original frames:

   ranking  High Performance  Average  Low Performance
0      340              7674    53049           893702
1       58              6838    38181           653512
2      313              2383    66811           794135
3      260              3930    24911           968317
4      377              6543    80905           599571
   ranking  High Performance  Average  Low Performance
0      223              6044    77461           237517
1      250              6128    24633           112060
2      396              3701    26695           767052
3      261              9031    64877           415611
4      313              1298    52726           782069

Processed frames:

   ranking  High Performance  Average  Low Performance
0                       7674    53049           893702
1     6838              6838    38181           653512
2                       2383    66811           794135
3   968317              3930    24911           968317
4                       6543    80905           599571
   ranking  High Performance  Average  Low Performance
0   237517              6044    77461           237517
1   112060              6128    24633           112060
2                       3701    26695           767052
3   415611              9031    64877           415611
4                       1298    52726           782069
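Because the frames above are random, the behaviour at the band edges is hard to see. The sketch below restates the same mask-based idea on a small hand-made frame; pick_by_band and the edge frame are hypothetical names and data chosen only to make the boundaries visible, and the band limits are the ones used in func above.

```python
import pandas as pd


def pick_by_band(df):
    """Restatement of the mask-based approach so this block runs on its own."""
    ranking = df['ranking']
    result = pd.Series('', index=df.index, dtype=object)
    bands = [
        (ranking.lt(100), 'High Performance'),                             # ranking < 100
        (ranking.between(100, 200, inclusive='left'), 'Average'),          # 100 <= ranking < 200
        (ranking.between(200, 300, inclusive='both'), 'Low Performance'),  # 200 <= ranking <= 300
    ]
    for mask, column in bands:
        # Copy the band's column into the result only where the mask holds.
        result.loc[mask] = df.loc[mask, column]
    return result


# Made-up rankings sitting exactly on and around the band limits.
edge = pd.DataFrame({
    'ranking': [99, 100, 199, 200, 300, 301],
    'High Performance': [1, 1, 1, 1, 1, 1],
    'Average': [2, 2, 2, 2, 2, 2],
    'Low Performance': [3, 3, 3, 3, 3, 3],
})
picked = pick_by_band(edge)
print(picked.tolist())
```

A ranking of exactly 100 or 200 lands in the upper of the two neighbouring bands, 300 still counts as Low Performance, and anything above 300 is left empty. Whether those edges match what the question intends is an open point, since the question only says "between".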
CodePudding user response:
I think you just needed to indent your code correctly.
I would also use a map.
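As a rough illustration of the map idea, assuming "a map" here means pandas Series.map applied to the ranking column: band is a hypothetical helper, the sample frame is made up, and the exact band edges are an assumption, since the question only says "between".

```python
import pandas as pd


def band(ranking):
    """Hypothetical helper: name the performance band for one ranking value."""
    if ranking < 100:
        return 'High Performance'
    if ranking < 200:
        return 'Average'
    if ranking <= 300:
        return 'Low Performance'
    return ''  # outside every band


# Made-up sample data; in the question this would be each frame in the list.
df = pd.DataFrame({'ranking': [58, 150, 260, 340]})
df['band'] = df['ranking'].map(band)
print(df)
```

The resulting band column can then be used to decide which performance column each row belongs to, without redefining the function inside the loop.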