Pandas: Check if value exists in another table

Time:11-03

I'm working on an Ethereum fraud detection dataset, where 0 denotes a normal account and 1 denotes a fraudulent one.

I have train_account.csv as

account flag
a17249 0
a03683 1
a22146 0

transactions.csv as

from_account to_account
a00996 b31499
a07890 a22146
a22146 b31504

I want to make a test_account.csv that contains only accounts; the task is to find whether each one is fraudulent (1) or normal (0).

account
a27890
a03683
a22146

Rules I'm following to build the table below:

  • If an account is present in the account column of train_account.csv, add the flag value it has in train_account.csv.
  • Otherwise, check whether the account appears in transactions['from_account'] or transactions['to_account']. If it does, set the flag to 0; if not, set it to 1.

account flag
a27890 1
a03683 1
a22146 0

I only want to add a flag column based on the rules above, without adding or removing any rows.

PS: I'm a beginner and have no clue how to do this. Thanks in advance.

I tried looping over the column, but I'm not sure how to do the check and add the result. Something like this:

for i in test_account['account']:
    if i in train_account['account']:
        test_account[i]['flag'] = train_account[i]['flag']
    elif i in transactions['from_account'] or transactions['to_account']:
        test_account[i]['flag'] = 0
    else:
        test_account[i]['flag'] = 1

CodePudding user response:

Try this:

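import numpy as np
# step 1: look up each test account's flag in train_account (NaN if the account is absent)
train_flags = train_account.set_index('account')['flag']
test_account['flag'] = test_account['account'].map(train_flags)
# step 2: for accounts missing from train_account, use 0 if the account
# appears in transactions (as sender or receiver), else 1
accounts = np.append(transactions['from_account'].values, transactions['to_account'].values)
fallback = np.where(test_account['account'].isin(accounts), 0, 1)
test_account['flag'] = np.where(test_account['flag'].isna(), fallback, test_account['flag']).astype(int)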
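
If you want to check the rules end to end, here is a minimal self-contained sketch. It rebuilds the three tables from the sample rows in the question instead of reading the CSV files, and it assumes each account appears only once in train_account (map would fail on duplicate accounts). The names seen, known and fallback are just illustrative.

```python
import pandas as pd

# Sample data copied from the question (stand-ins for the CSV files).
train_account = pd.DataFrame(
    {"account": ["a17249", "a03683", "a22146"], "flag": [0, 1, 0]}
)
transactions = pd.DataFrame(
    {
        "from_account": ["a00996", "a07890", "a22146"],
        "to_account": ["b31499", "a22146", "b31504"],
    }
)
test_account = pd.DataFrame({"account": ["a27890", "a03683", "a22146"]})

# Every account that appears on either side of a transaction.
seen = set(transactions["from_account"]) | set(transactions["to_account"])

# Rule 1: take the flag from train_account where the account is known.
# Accounts that are not in train_account come back as NaN.
known = test_account["account"].map(train_account.set_index("account")["flag"])

# Rule 2: for the remaining accounts, 0 if seen in transactions, else 1.
fallback = (~test_account["account"].isin(seen)).astype(int)

# Fill only the NaN gaps left by rule 1; the row count never changes.
test_account["flag"] = known.fillna(fallback).astype(int)
print(test_account)
```

On the sample data this gives the flags 1, 1, 0, matching the expected table in the question. As for the loop in the question: `i in some_series` tests the index labels of the Series, not its values, which is why isin (or a set of the values) is used here.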