Home > Software design >  Compare elements in 2 columns in a dataframe to 2 input values
Compare elements in 2 columns in a dataframe to 2 input values

Time:04-17

I am trying to compare values in 2 columns of a dataframe from the input and check for equality:

My input is a list like:

ABC BCD
QWE XYZ
MNO PQR

My dataframe looks like:

St1  St2

ABC  BCD
PQR  XYZ
MNO  PQR

I want to check of the input is present in my dataframe. And if yes, then corresponding to each of the values, I append a "Yes" or "No" on the basis of whether it is present or not.

I am trying to do this:

for i in range(len(pairs)):
    if df_final['Stock1']==pairs[i][0]:
        df["corr"] = "Yes"

This doesnt work and give the following error:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

How can it be rectified?

I have tried this as well:

k = pairs[0]
df[(df[['St1','St2']].values == k).all(axis=1)]

where pairs is the input

CodePudding user response:

This could be a solution for you based on this post. However, you need to make the column names consistent in both data sets:

import pandas as pd
import numpy as np


df = pd.merge(df1, df2, on=['St1','St2'], how='left', indicator='Exist')
df1['corr'] = np.where(df['Exist'] == 'both', True, False)
df1

   St1  St2   corr
0  ABC  BCD   True
1  QWE  DAS  False
2  MNO  PQR   True

Data

df1 = pd.DataFrame({'St1': ['ABC', 'QWE', 'MNO'], 'St2': ['BCD', 'DAS', 'PQR']})
df2 = pd.DataFrame({'St1': ['ABC', 'PQR', 'MNO'], 'St2': ['BCD', 'XYZ', 'PQR']})
  • Related