Home > other >  VLOOKUP FOR loop in python
VLOOKUP FOR loop in python

Time:07-18

I am learning python right now and I need help to complete a project. So, I have an excel file of 1M rows and I am trying to write a script that will fill out the "Account type" column based on the account number.

SCREENSHOT

Here is my script but it really doesn't work. Please Help

for i in df_data['Account Number']:
if i <= 99 :
    df_data['Account Type']='Default Value'
elif 100 <= i <= 1999 :
    df_data['Account Type']='Assets'
elif 2000 <= i <= 2899 :
    df_data['Account Type']='Liabilities'
elif 2900<= i <=2999 :                              
    df_data['Account Type']='Shareholder’s Equity'
elif 3000<= i <=3999 :                   
    df_data['Account Type']='Allocated charge out of expenses'
elif 4000<= i <=4999 :                                
    df_data['Account Type']='Operating Expenses'
elif 5000<= i <=5199 :                          
    df_data['Account Type']='Revenues-Interest Income'
elif 5200<= i <=5399 :                        
    df_data['Account Type']='Expenses - Interest Expense'
elif 5400<= i <=5999 :                  
    df_data['Account Type']='Revenues - Trading Gains & Losses'
elif 6000<= i <=6699 :               
    df_data['Account Type']='Revenues -Other (Not Subject to GST)'
elif 6700<= i <=6799 :                 
    df_data['Account Type']='Revenue - Contra- Insurance Claims'
else:
    print(N/A)

CodePudding user response:

You can use pandas.cut like this:

df['Account Type'] = pd.cut(
    df_data['Account Number'],
    bins=[0, 99, 1999, 2899, 2999, 3999, 4999, 5199, 5399, 5999, 6699, 6799],
    labels=[
        'Default Value',
        'Assets',
        'Liabilities',                       
        'Shareholder’s Equity',
        'Allocated charge out of expenses',
        'Operating Expenses',
        'Revenues-Interest Income',
        'Expenses - Interest Expense',
        'Revenues - Trading Gains & Losses',
        'Revenues -Other (Not Subject to GST)',
        'Revenue - Contra- Insurance Claims',
    ]
)

CodePudding user response:

You can use pd.cut:

import pandas as pd
import numpy as np

bins = [0, 100, 2000, 2900, 3000, 4000, 5000, 5200, 5400, 6000, 6700, np.inf]

labels = ['Default Value', 'Assets', 'Liabilities', 'Shareholder’s Equity',
          'Allocated charge out of expenses', 'Operating Expenses',
          'Revenues-Interest Income', 'Expenses - Interest Expense',
          'Revenues - Trading Gains & Losses', 'Revenues -Other (Not Subject to GST)',
          'Revenue - Contra- Insurance Claims', 'N/A']

df_data['Account Type'] = pd.cut(df_data['Account Number'], bins=bins, labels=labels,
                                 right=True, include_lowest=True)

Sample output:

>>> df_data
   Account Number                       Account Type
0               0                      Default Value
1            1000                             Assets
2            2000                             Assets
3            3000               Shareholder’s Equity
4            4000   Allocated charge out of expenses
5            5000                 Operating Expenses
6            6000  Revenues - Trading Gains & Losses
7            7000                                N/A
  • Related