Need to store my function's results as a dictionary value in Python Pandas

Time:11-29

I have 2 functions that read a csv file and count the following as checks:

  1. number of rows in that csv
  2. number of rows that have a null value in the 'ID' column

I am trying to create a dataframe that looks like this

Checks   | Summary                         | Findings
Check #1 | Number of records on file       | function #1 results (Number of records on file: 10)
Check #2 | Number of records missing an ID | function #2 results (Number of records missing an ID: 2)

function 1 looks like this:

def function1():
    with open('data.csv') as file:
        record_number = len(list(file))
        print("Number of records on file:",record_number)
function1()

and outputs "Number of records on file: 10"

function 2 looks like this:

def function2():
    df = pd.read_csv('data.csv', low_memory=False)
    missing_id = df["IDs"].isna().sum()
    print("Number of records missing an ID:", missing_id)
function2()

and outputs "Number of records missing an ID: 2"

I first build a dictionary and then create the dataframe from it:

table = {
   'Checks' : ['Check #1', 'Check #2'],
    'Summary' : ['Number of records on file', 'Number of records missing an ID'],
    'Findings' : [function1, function2]
}
df = pd.DataFrame(table)
df

However, this is what the dataframe looks like:

Checks   | Summary                         | Findings
Check #1 | Number of records on file       | <function function1 at 0x7efd2d76a730>
Check #2 | Number of records missing an ID | <function function2 at 0x7efd25cd0b70>

Is there any way to make it so that my Findings column outputs the actual results as seen above?

CodePudding user response:

The reason is that you're storing the function objects themselves in the list, not the results of calling them:

function1 != function1()

So for your case you need:

table = {
   'Checks' : ['Check #1', 'Check #2'],
    'Summary' : ['Number of records on file', 'Number of records missing an ID'],
    'Findings' : [function1(), function2()]
}
df = pd.DataFrame(table)
df

Edit: Oh damn, I also missed what the other user pointed out. Your functions only print and never return anything, so calling them gives you None. You definitely need to return a value from your functions as well :)
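
To make both points concrete, here is a minimal sketch. The names `prints_only` and `returns_value` are made up for illustration and the count of 10 is hard-coded instead of being read from a file:

```python
def prints_only():
    # Prints the message but has no return statement,
    # so calling it evaluates to None.
    print("Number of records on file:", 10)


def returns_value():
    # Returns the number so the caller can store it.
    return 10


stored_object = prints_only      # no parentheses: the function object itself
stored_none = prints_only()      # called, but nothing is returned
stored_result = returns_value()  # called, and the value is returned

print(callable(stored_object))  # True
print(stored_none)              # None
print(stored_result)            # 10
```

Only the last form puts the actual number into the list that ends up in the Findings column.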

CodePudding user response:

You need to change your functions so that they return their values instead of printing them, that is:

def function1():
    with open('data.csv') as file:
        record_number = len(list(file))
        return record_number

and

import pandas as pd

def function2():
    df = pd.read_csv('data.csv', low_memory=False)
    return df["IDs"].isna().sum()

and call these functions like so

table = {
   'Checks' : ['Check #1', 'Check #2'],
    'Summary' : ['Number of records on file', 'Number of records missing an ID'],
    'Findings' : [function1(), function2()]
}
df = pd.DataFrame(table)
df
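
For anyone who wants to try this without the original file, here is a self-contained sketch. The sample CSV content, the `path` parameter and the names `count_lines`, `count_missing_ids` and `build_summary` are assumptions made for the example, not part of the question:

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data: a header line plus four records, two without an ID.
SAMPLE_CSV = "IDs,name\n1,a\n,b\n3,c\n,d\n"


def count_lines(path):
    # Counts every line in the file, including the header line.
    with open(path) as file:
        return len(list(file))


def count_missing_ids(path):
    # Counts rows whose "IDs" cell is empty (pandas parses it as NaN).
    df = pd.read_csv(path, low_memory=False)
    return int(df["IDs"].isna().sum())


def build_summary(path):
    # The functions are called here, so the list holds their return values.
    table = {
        'Checks': ['Check #1', 'Check #2'],
        'Summary': ['Number of records on file', 'Number of records missing an ID'],
        'Findings': [count_lines(path), count_missing_ids(path)],
    }
    return pd.DataFrame(table)


with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "data.csv")
    with open(csv_path, "w") as f:
        f.write(SAMPLE_CSV)
    summary = build_summary(csv_path)

print(summary)
```

Note that `len(list(file))` counts the header line too, so with this sample the first finding is 5 even though there are only four data rows.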

CodePudding user response:

For the expected output, return f-strings from both functions, and call the functions (with parentheses) when building the DataFrame:

def function1():
    with open('data.csv') as file:
        record_number = len(list(file))
        return f"function #1 results (Number of records on file: {record_number})"


def function2():
    df = pd.read_csv('data.csv', low_memory=False)
    missing_id = df["IDs"].isna().sum()
    return f"function #2 results (Number of records missing an ID: {missing_id})"


table = {
   'Checks' : ['Check #1', 'Check #2'],
    'Summary' : ['Number of records on file', 'Number of records missing an ID'],
    'Findings' : [function1(), function2()]
}
df = pd.DataFrame(table)

Solution with one function:

def function():
    with open('data.csv') as file:
        record_number = len(list(file))
    df = pd.read_csv('data.csv', low_memory=False)
    missing_id = df["IDs"].isna().sum()

    return [f"function #1 results (Number of records on file: {record_number})",
            f"function #2 results (Number of records missing an ID: {missing_id})"]


table = {
   'Checks' : ['Check #1', 'Check #2'],
    'Summary' : ['Number of records on file', 'Number of records missing an ID'],
    'Findings' : function()
}
df = pd.DataFrame(table)
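
If reading the file twice is a concern, both counts can probably come from a single `pd.read_csv` call. This is only a sketch under assumptions: `run_checks`, its `source` parameter and the in-memory sample data are invented for the example, and `len(df)` counts data rows only, so it will differ from `len(list(file))`, which also counts the header line:

```python
import io

import pandas as pd


def run_checks(source):
    # source: a path or file-like object that pd.read_csv accepts.
    df = pd.read_csv(source, low_memory=False)
    record_number = len(df)                    # data rows only, header excluded
    missing_id = int(df["IDs"].isna().sum())   # rows with an empty "IDs" cell
    return [
        f"Number of records on file: {record_number}",
        f"Number of records missing an ID: {missing_id}",
    ]


# Hypothetical in-memory sample: three records, one without an ID.
sample = io.StringIO("IDs,name\n1,a\n,b\n3,c\n")

table = {
    'Checks': ['Check #1', 'Check #2'],
    'Summary': ['Number of records on file', 'Number of records missing an ID'],
    'Findings': run_checks(sample),
}
df = pd.DataFrame(table)
print(df)
```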