How to return appended data frame using a function in Python?

Time:09-16

I would like to combine the data frames read from each URL into one single data frame and return it. When I print inside the function, I get the result I want. The problem is that when I try to assign the result to a variable, it only contains the final data frame. Running this function prints my desired result:

import pandas as pd
urllist = ['https://basketball.realgm.com/nba/boxscore/2022-04-09/Indiana-at-Philadelphia/388705', 'https://basketball.realgm.com/nba/boxscore/2022-04-09/New-Orleans-at-Memphis/388704', 'https://basketball.realgm.com/nba/boxscore/2022-04-09/Golden-State-at-San-Antonio/388706', 'https://basketball.realgm.com/nba/boxscore/2022-04-09/Sacramento-at-LA-Clippers/388703']
def Boxscore(URL):
    for x in URL:
        box_list = pd.read_html(x)
        box1 = box_list[3]
        box2 = box_list[4]
        fullbox = pd.concat([box1, box2])
        print(fullbox)

Boxscore(urllist)

But when I return the data frame and assign it to a variable, I only get the final data frame instead of all of them together.

import pandas as pd
urllist = ['https://basketball.realgm.com/nba/boxscore/2022-04-09/Indiana-at-Philadelphia/388705', 'https://basketball.realgm.com/nba/boxscore/2022-04-09/New-Orleans-at-Memphis/388704', 'https://basketball.realgm.com/nba/boxscore/2022-04-09/Golden-State-at-San-Antonio/388706', 'https://basketball.realgm.com/nba/boxscore/2022-04-09/Sacramento-at-LA-Clippers/388703']
def Boxscore(URL):
    for x in URL:
        box_list = pd.read_html(x)
        box1 = box_list[3]
        box2 = box_list[4]
        fullbox = pd.concat([box1, box2])
    return fullbox

fullboxscore = Boxscore(urllist)
print(fullboxscore)

How can I append each data frame into one, and name that new data frame as a variable? Please help, thanks!

CodePudding user response:

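Each pass through the loop overwrites fullbox, so the return after the loop only sees the last frame. Try creating an empty list, appending each frame to it inside the loop, and then calling concat once on the whole list: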

import pandas as pd

def Boxscore(URL: list) -> pd.DataFrame:
    dfs = []  # empty list that collects one frame per URL
    for x in URL:
        box_list = pd.read_html(x)
        box1 = box_list[3]
        box2 = box_list[4]
        fullbox = pd.concat([box1, box2])
        dfs.append(fullbox)  # append this URL's frame to the list

    # concat all collected frames and return them with a fresh 0..n-1 index
    return pd.concat(dfs).reset_index(drop=True)

# call your function and assign the result to a variable
fullboxscore = Boxscore(urllist)
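
To see the difference without hitting the network, here is a minimal sketch that swaps pd.read_html for small in-memory frames. The names fake_pages, last_only and accumulate, and the Player/PTS columns, are made up for illustration and are not part of the original code. It only shows that returning after the loop keeps the last frame, while collecting into a list keeps them all.

```python
import pandas as pd

# Hypothetical stand-ins for what pd.read_html would return per URL:
# each "page" is a list holding two small tables.
fake_pages = [
    [pd.DataFrame({"Player": ["A"], "PTS": [10]}),
     pd.DataFrame({"Player": ["B"], "PTS": [12]})],
    [pd.DataFrame({"Player": ["C"], "PTS": [7]}),
     pd.DataFrame({"Player": ["D"], "PTS": [20]})],
]

def last_only(pages):
    # Mirrors the question: fullbox is overwritten on every iteration,
    # so only the frame from the final page survives the loop.
    for tables in pages:
        fullbox = pd.concat([tables[0], tables[1]])
    return fullbox

def accumulate(pages):
    # Mirrors the answer: collect each page's frame, then concat once.
    dfs = []
    for tables in pages:
        dfs.append(pd.concat([tables[0], tables[1]]))
    return pd.concat(dfs).reset_index(drop=True)

print(len(last_only(fake_pages)))   # rows from the last page only
print(len(accumulate(fake_pages)))  # rows from every page
```

Calling pd.concat once on the list is also generally cheaper than concatenating inside the loop, since each concat copies the data it combines.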