Python Pandas DataFrame User Defined Function Transformations

Time:01-04

I have several DataFrames whose data I am cleaning. The following code works on its own (outside of a function), but I have to apply it to many DataFrames and want to streamline the process with a user-defined function. Can you please help me fix the following so that it can be used for all of my DataFrames?

def format_df(df):
    df.columns = df.columns.str
    df.dropna(thresh=1, axis='columns',inplace = True)
    df.dropna(thresh=80,axis=0,inplace = True)    
    df.columns = df.iloc[0]
    df = df.iloc[1:].reset_index(drop=True)
    df.columns = df.columns.str.replace(' ','',regex=False)
    df.columns = df.columns.str.replace('($)','',regex=False)
    df.columns = df.columns.str.replace('(Y/N)','Flag',regex=False)
    df.columns = df.columns.str.replace('(x)','',regex=False)
    df.columns = df.columns.str.replace('-','',regex=False)
    return df

CodePudding user response:

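The line df.columns = df.columns.str will not do what you want: df.columns.str is the string accessor (a namespace for string methods such as .replace), not a converted copy of the column labels, so it cannot be assigned back to df.columns, which must be an Index. If the goal is to convert the column labels to strings, use the astype method instead: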

df.columns = df.columns.astype(str)
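
With that line fixed, the whole function might look something like the sketch below. It is only one possible arrangement, not a definitive version: the row_thresh parameter and the replacements list are names introduced here for illustration, and the function works on a copy and returns the result rather than using inplace=True, on the assumption that you do not want the caller's DataFrame modified as a side effect.

```python
import pandas as pd


def format_df(df, row_thresh=80):
    """Clean one raw DataFrame and return the cleaned copy.

    row_thresh is a hypothetical parameter (default taken from the
    question's thresh=80): rows need at least this many non-NA values.
    """
    # Work on a copy so the caller's DataFrame is left untouched.
    df = df.copy()

    # Convert the column labels to strings (the fix from the answer).
    df.columns = df.columns.astype(str)

    # Drop columns that are entirely empty, then rows that are too sparse.
    df = df.dropna(thresh=1, axis="columns")
    df = df.dropna(thresh=row_thresh, axis="index")

    # Promote the first remaining row to the header and drop it from the data.
    df.columns = df.iloc[0]
    df = df.iloc[1:].reset_index(drop=True)

    # Apply the literal (non-regex) replacements from the question in order.
    replacements = [(" ", ""), ("($)", ""), ("(Y/N)", "Flag"), ("(x)", ""), ("-", "")]
    cols = df.columns.astype(str)
    for old, new in replacements:
        cols = cols.str.replace(old, new, regex=False)
    df.columns = cols
    return df
```

Because the function returns a new DataFrame, the result has to be assigned, for example df1 = format_df(df1), or cleaned = [format_df(d) for d in frames] for a list of DataFrames (frames being a hypothetical name). Note that df.iloc[0] raises an IndexError if no row survives the row_thresh filter.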