Home > Back-end >  Pandas- for each row in Data frame check if value exist in excel file
Pandas- for each row in Data frame check if value exist in excel file

Time:11-14

I have the Data frame DF1:

Value Path Result
val1 C:\file1.xlsx True
val2 C:\file2.xlsx False

I need to check for each row if value is in the excel file at the specific row path and update Result accordingly.

CodePudding user response:

Your data table

import pandas as pd
import numpy as np

def create_dataframe():
    data = {'Value': [1,2,3,4,5],
            'Path': ['C:\file1.xlsx','C:\file2.xlsx','C:\file3.xlsx','C:\file4.xlsx','C:\file5.xlsx'],
            'Result': [True, False, True, False, True]}
    df = pd.DataFrame(data)
    return df

#checking if the value exists in the excel file
#if it does, return True, else return False
def check_value(df):
    for index, row in df.iterrows():
        try:
            df.loc[index, 'Result'] = pd.read_excel(row['Path'], header=None).isin([row['Value']]).any()
        except:
            df.loc[index, 'Result'] = False
    return df

CodePudding user response:

This will check each excel file if the Value is in a column named Value:

import pandas as pd

data = {'Value': ['val1','val2'],
        'Path': ['/home/bera/Documents/test1.xlsx','/home/bera/Documents/test2.xlsx']}
df = pd.DataFrame(data)


yesno =[]
for i in df.index:
    tempdf = pd.read_excel(df.iloc[i].Path)
    yesno.append(df.iloc[i].Value in set(tempdf.Value))

df['Result'] = yesno
  • Related