Ignore the processing errors in pandas data manipulations

Time:07-16

I am trying to run a pandas merge on multiple files while ignoring any file-not-found error, but the data is not printed when I run the code.

import pandas as pd
try:
    df_t = pd.read_csv('C:\\...\\aa_kk.CSV',dtype=str)
    df_u = pd.read_csv('C:\\....\\bb_jj.CSV',dtype=str)
    df_t_e = pd.DataFrame(df_t,columns=['CO','MD','PS','PE','PO'])
    df_u_e = pd.DataFrame(df_u,columns=['CO','MD','PS','PE','PO'])
    merge_tu = [df_t_e,df_u_e]
    result_tu = pd.concat(merge_tu)
    print(result_tu)
except Exception:
    print('not found')

I expected the data from df_u to be printed, since only the file for df_t is missing. However, nothing is printed after execution except "not found".

CodePudding user response:

Note that pd.concat does not merge DataFrames on a key; it stacks them on top of each other.

If you want to combine rows by matching values in a column, use DataFrame.merge(). Here's an example:

import pandas as pd

df1 = pd.DataFrame({'df1name': ['foo', 'bar', 'baz', 'foo'], 'value': [1, 2, 3, 5]})

df2 = pd.DataFrame({'df2name': ['foo', 'bar', 'baz', 'foo'], 'value': [5, 6, 7, 8]})

df1.merge(df2, left_on='df1name', right_on='df2name')

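In your case, however, both pd.read_csv calls are inside the same try block. Because the first file does not exist, the first call raises an exception, the rest of the block (including the read of df_u) is skipped, and only the message "not found" is printed.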
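To make the control flow concrete, here is a minimal sketch that mirrors the structure of the question's code. The helper name read_both and the temporary file names are made up for illustration; the point is only that the second read never runs once the first one raises.

```python
import os
import tempfile

import pandas as pd


def read_both(path_t, path_u):
    """Hypothetical helper: both reads share one try block, as in the question."""
    loaded = []
    try:
        loaded.append(pd.read_csv(path_t, dtype=str))  # raises if path_t is missing
        loaded.append(pd.read_csv(path_u, dtype=str))  # skipped once the line above raises
    except FileNotFoundError:
        print('not found')
    return loaded


with tempfile.TemporaryDirectory() as tmp:
    # Only the second file exists, matching the situation in the question.
    existing = os.path.join(tmp, 'bb_jj.csv')
    pd.DataFrame({'CO': ['1'], 'MD': ['2']}).to_csv(existing, index=False)
    missing = os.path.join(tmp, 'aa_kk.csv')

    loaded = read_both(missing, existing)

print(len(loaded))
```

Even though the second file is present, the list stays empty, which is why each read needs its own try block.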

CodePudding user response:

From a list of files, create a list of DataFrames. If a file cannot be found, it is simply not added to the list. At the end, concat the DataFrames together.

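import pandas as pd

files = ['C:\\...\\aa_kk.CSV', 'C:\\....\\bb_jj.CSV']
cols = ['CO','MD','PS','PE','PO']
dfs = []
for file in files:
    try:
        # Read with the file's own header row, then keep only the wanted columns
        # (names= would relabel columns by position and treat the header as data).
        df = pd.read_csv(file, dtype=str).reindex(columns=cols)
        dfs.append(df)
    except FileNotFoundError:
        # Skip the missing file instead of aborting the whole run.
        print(f'{file} not found')

df = pd.concat(dfs)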
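
One edge case the loop above leaves open: if none of the files exist, pd.concat receives an empty list and raises ValueError. The sketch below wraps the same idea in a helper and guards against that case. The name concat_existing and the temporary files are hypothetical, and it assumes the CSVs have a header row, as the question's code implies.

```python
import os
import tempfile

import pandas as pd


def concat_existing(files, columns):
    """Hypothetical helper: concatenate the CSV files that exist, skipping the rest."""
    dfs = []
    for file in files:
        try:
            # Select the wanted columns by name; absent ones become NaN.
            dfs.append(pd.read_csv(file, dtype=str).reindex(columns=columns))
        except FileNotFoundError:
            print(f'{file} not found')
    if not dfs:
        # pd.concat([]) raises ValueError, so return an empty frame instead.
        return pd.DataFrame(columns=columns)
    return pd.concat(dfs, ignore_index=True)


wanted = ['CO', 'MD', 'PS', 'PE', 'PO']

with tempfile.TemporaryDirectory() as tmp:
    # One file exists, one does not.
    present = os.path.join(tmp, 'bb_jj.csv')
    pd.DataFrame({'CO': ['x'], 'MD': ['y']}).to_csv(present, index=False)
    missing = os.path.join(tmp, 'aa_kk.csv')

    result = concat_existing([missing, present], wanted)
    nothing = concat_existing([missing], wanted)

print(result)
print(nothing.empty)
```

Returning an empty DataFrame with the expected columns is one possible choice; raising a clearer error would be equally reasonable if a run with no input files should count as a failure.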