I have the Data frame DF1:
Value | Path | Result |
---|---|---|
val1 | C:\file1.xlsx | True |
val2 | C:\file2.xlsx | False |
I need to check for each row if value is in the excel file at the specific row path and update Result accordingly.
CodePudding user response:
Your data table
import pandas as pd
import numpy as np
def create_dataframe():
data = {'Value': [1,2,3,4,5],
'Path': ['C:\file1.xlsx','C:\file2.xlsx','C:\file3.xlsx','C:\file4.xlsx','C:\file5.xlsx'],
'Result': [True, False, True, False, True]}
df = pd.DataFrame(data)
return df
#checking if the value exists in the excel file
#if it does, return True, else return False
def check_value(df):
for index, row in df.iterrows():
try:
df.loc[index, 'Result'] = pd.read_excel(row['Path'], header=None).isin([row['Value']]).any()
except:
df.loc[index, 'Result'] = False
return df
CodePudding user response:
This will check each excel file if the Value is in a column named Value:
import pandas as pd
data = {'Value': ['val1','val2'],
'Path': ['/home/bera/Documents/test1.xlsx','/home/bera/Documents/test2.xlsx']}
df = pd.DataFrame(data)
yesno =[]
for i in df.index:
tempdf = pd.read_excel(df.iloc[i].Path)
yesno.append(df.iloc[i].Value in set(tempdf.Value))
df['Result'] = yesno