I am trying to run a pandas merge on multiple files while ignoring the file-not-found error for any file that is missing, but no data is printed after running the code.
import pandas as pd
try:
    df_t = pd.read_csv('C:\\...\\aa_kk.CSV',dtype=str)
    df_u = pd.read_csv('C:\\....\\bb_jj.CSV',dtype=str)
    df_t_e = pd.DataFrame(df_t,columns=['CO','MD','PS','PE','PO'])
    df_u_e = pd.DataFrame(df_u,columns=['CO','MD','PS','PE','PO'])
    merge_tu = [df_t_e,df_u_e]
    result_tu = pd.concat(merge_tu)
    print(result_tu)
except Exception:
    print('not found')
I am expecting the data from df_u to be printed, since no file exists for df_t. But nothing is printed after execution except "not found".
CodePudding user response:
pd.concat stacks DataFrames on top of each other; it does not join them on a key. If a key-based merge is what you want, use DataFrame.merge(). Here's an example:
import pandas as pd
df1 = pd.DataFrame({'df1name': ['foo', 'bar', 'baz', 'foo'], 'value': [1, 2, 3, 5]})
df2 = pd.DataFrame({'df2name': ['foo', 'bar', 'baz', 'foo'], 'value': [5, 6, 7, 8]})
df1.merge(df2, left_on='df1name', right_on='df2name')
However, in your case both pd.read_csv calls are inside the same try block. Because the first file does not exist, the first call raises an exception, the rest of the block (including reading df_u) is skipped, and only "not found" is printed.
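The control flow can be reproduced without any CSV files. A minimal sketch, in which `load` and the file names `a.csv` and `b.csv` are hypothetical stand-ins for `pd.read_csv` and the real paths:

```python
def load(name):
    # Hypothetical stand-in for pd.read_csv: only 'b.csv' "exists".
    if name != 'b.csv':
        raise FileNotFoundError(name)
    return f'data from {name}'

steps = []
try:
    steps.append(load('a.csv'))   # raises here, like the missing df_t file
    steps.append(load('b.csv'))   # never reached, like the read of df_u
except FileNotFoundError:
    steps.append('not found')

print(steps)
```

Only the handler runs, so the second file is never read even though it exists. Giving each read its own try block avoids this.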
CodePudding user response:
From a list of files, create a list of DataFrames. When there's an error, it simply won't be added to the list. At the end, concat them together.
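import pandas as pd

files = ['C:\\...\\aa_kk.CSV', 'C:\\....\\bb_jj.CSV']
cols = ['CO','MD','PS','PE','PO']
dfs = []
for file in files:
    try:
        df = pd.read_csv(file, dtype=str)
        # Select columns after reading; passing names= to read_csv would
        # treat an existing header row as data (assumes the files have headers)
        dfs.append(df.reindex(columns=cols))
    except FileNotFoundError:
        print(f'{file} not found')
# pd.concat raises ValueError on an empty list, so handle the case where every file is missing
df = pd.concat(dfs) if dfs else pd.DataFrame(columns=cols)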
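The same idea can be wrapped in a small helper and tried out on throwaway files. This is only a sketch: `read_existing` is a hypothetical name, the temporary file and its contents are made up for illustration, and it assumes the CSV files carry a header row.

```python
import os
import tempfile

import pandas as pd

COLS = ['CO', 'MD', 'PS', 'PE', 'PO']

def read_existing(paths, cols=COLS):
    """Read every CSV that exists and stack the results; skip missing files."""
    frames = []
    missing = []
    for path in paths:
        try:
            frame = pd.read_csv(path, dtype=str)
        except FileNotFoundError:
            missing.append(path)
            continue
        # reindex keeps only the wanted columns, filling absent ones with NaN
        frames.append(frame.reindex(columns=cols))
    if not frames:
        # pd.concat raises ValueError on an empty list
        return pd.DataFrame(columns=cols), missing
    return pd.concat(frames, ignore_index=True), missing

# Illustration with made-up data: one file exists, the other does not.
with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, 'bb_jj.CSV')
    absent = os.path.join(tmp, 'aa_kk.CSV')   # deliberately never created
    with open(present, 'w') as fh:
        fh.write('CO,MD,PS,PE,PO,EXTRA\n1,2,3,4,5,6\n')
    result, missing = read_existing([absent, present])

print(result)
print('missing:', missing)
```

Returning the list of missing paths alongside the combined frame lets the caller decide whether a skipped file is acceptable, instead of only printing a message.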