What is the appropriate way of outputting dataframe results from a function?

Time:12-20

I am pretty new to Python and pandas, and I am trying to create a function that will read four datasets and combine them into one dataframe. I can get the results I need if I do not wrap this all in a function, but I plan to create a similar dataframe for another four datasets, so I believe a function will clean things up a bit.

Using the code below, I get the following error: NameError: name 'crime' is not defined

# function to import datasets and combine them for grouped analysis
def Crime2020():

    # import datasets from 2020
    mayCrime=pd.read_csv('C://datasets/summer_comp/2020-05.csv')
    junCrime=pd.read_csv('C://datasets/summer_comp/2020-06.csv')
    julCrime=pd.read_csv('C://datasets/summer_comp/2020-07.csv')
    augCrime=pd.read_csv('C://datasets/summer_comp/2020-08.csv')

    # combine dataframes using concatenation
    frames = [mayCrime, junCrime, julCrime, augCrime]
    crime = pd.concat(frames)
    
    return crime

crime = Crime2020(crime)
crime.head()

It seems as though I am not calling the function correctly, but since I am new I don't quite understand why. I've tried a few different methods I've seen elsewhere, but nothing seems to be working.

Any help will be greatly appreciated. No doubt I am just missing something simple.

CodePudding user response:

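You're trying to pass a variable that doesn't exist yet to the function. Python evaluates the argument crime before it calls Crime2020, and at that point nothing named crime has been defined, hence the NameError. crime = Crime2020(crime) should be crime = Crime2020(), because the function takes no parameters: the dataframe is created inside the function and handed back by return, not passed in from outside.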

On a side note, the usual convention (PEP 8) is CapWords naming for classes and snake_case for functions, so crime_2020 would be a more conventional name than Crime2020.
