I am learning python right now and I need help to complete a project. So, I have an excel file of 1M rows and I am trying to write a script that will fill out the "Account type" column based on the account number.
Here is my script but it really doesn't work. Please Help
for i in df_data['Account Number']:
if i <= 99 :
df_data['Account Type']='Default Value'
elif 100 <= i <= 1999 :
df_data['Account Type']='Assets'
elif 2000 <= i <= 2899 :
df_data['Account Type']='Liabilities'
elif 2900<= i <=2999 :
df_data['Account Type']='Shareholder’s Equity'
elif 3000<= i <=3999 :
df_data['Account Type']='Allocated charge out of expenses'
elif 4000<= i <=4999 :
df_data['Account Type']='Operating Expenses'
elif 5000<= i <=5199 :
df_data['Account Type']='Revenues-Interest Income'
elif 5200<= i <=5399 :
df_data['Account Type']='Expenses - Interest Expense'
elif 5400<= i <=5999 :
df_data['Account Type']='Revenues - Trading Gains & Losses'
elif 6000<= i <=6699 :
df_data['Account Type']='Revenues -Other (Not Subject to GST)'
elif 6700<= i <=6799 :
df_data['Account Type']='Revenue - Contra- Insurance Claims'
else:
print(N/A)
CodePudding user response:
You can use pandas.cut
like this:
df['Account Type'] = pd.cut(
df_data['Account Number'],
bins=[0, 99, 1999, 2899, 2999, 3999, 4999, 5199, 5399, 5999, 6699, 6799],
labels=[
'Default Value',
'Assets',
'Liabilities',
'Shareholder’s Equity',
'Allocated charge out of expenses',
'Operating Expenses',
'Revenues-Interest Income',
'Expenses - Interest Expense',
'Revenues - Trading Gains & Losses',
'Revenues -Other (Not Subject to GST)',
'Revenue - Contra- Insurance Claims',
]
)
CodePudding user response:
You can use pd.cut
:
import pandas as pd
import numpy as np
bins = [0, 100, 2000, 2900, 3000, 4000, 5000, 5200, 5400, 6000, 6700, np.inf]
labels = ['Default Value', 'Assets', 'Liabilities', 'Shareholder’s Equity',
'Allocated charge out of expenses', 'Operating Expenses',
'Revenues-Interest Income', 'Expenses - Interest Expense',
'Revenues - Trading Gains & Losses', 'Revenues -Other (Not Subject to GST)',
'Revenue - Contra- Insurance Claims', 'N/A']
df_data['Account Type'] = pd.cut(df_data['Account Number'], bins=bins, labels=labels,
right=True, include_lowest=True)
Sample output:
>>> df_data
Account Number Account Type
0 0 Default Value
1 1000 Assets
2 2000 Assets
3 3000 Shareholder’s Equity
4 4000 Allocated charge out of expenses
5 5000 Operating Expenses
6 6000 Revenues - Trading Gains & Losses
7 7000 N/A