Appending dataframe extracted from Page#2,3,4,5 ...to dataframe extracted from Page#1 of coingecko i

Time:03-27

I'm practicing Python pandas and trying to extract the first 500 coins from CoinGecko into one dataframe. Each CoinGecko page lists 100 coins. I can extract the dataframe from each page by changing the page number in the URL. I tried calling df1.append(df2) in a for loop over the page numbers, but that didn't work. What am I doing wrong?

I need one dataframe with all 500 coins.

def get_coingecko_data(): 
    page_numbers = [1,2,3,4,5] 
    for n in page_numbers: 
        r = requests.get(f"https://www.coingecko.com/?page={n}") 
        df = pd.read_html(r.text)[0] 
        df2 = pd.DataFrame(df) 
        df2 = df2[["#", "Coin", "Price", "Mkt Cap"]] 
        if n == 1: 
            df_500 = df2.copy() 
        else:
            df_500.append(df4, ignore_index = True)

CodePudding user response:

You're missing the assignment that updates the df_500 variable; append returns a new dataframe instead of modifying df_500 in place.

df_500 = df_500.append(df2, ignore_index=True)

By the way, here is a cleaner version:

import pandas as pd
import requests

def get_coingecko_data():
    page_numbers = [1, 2, 3, 4, 5]
    df500 = pd.DataFrame()

    for n in page_numbers:
        r = requests.get(f"https://www.coingecko.com/?page={n}")
        dfPre = pd.read_html(r.text)[0]
        # DataFrame.append was removed in pandas 2.0; use pd.concat there.
        df500 = df500.append(dfPre[["#", "Coin", "Price", "Mkt Cap"]], ignore_index=True)

    return df500

CodePudding user response:

You have a couple of issues. First, the last line has a typo: it should be df2, not df4.

The second problem is that, unlike Python's list.append, pandas.DataFrame.append does not change the object it is called on. It returns the result of the append as a new dataframe, so you would have to rewrite the last line as

df_500 = df_500.append(df2, ignore_index = True)

This confusion is common, and it is one of the reasons DataFrame.append has been deprecated since pandas 1.4 (and was removed in 2.0). The recommended replacement is pandas.concat.
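To illustrate the difference, here is a minimal sketch with made-up data. The frames `page1` and `page2` are placeholders, not real CoinGecko rows.

```python
import pandas as pd

# Hypothetical stand-ins for two scraped pages (made-up values).
page1 = pd.DataFrame({"Coin": ["Bitcoin", "Ethereum"], "Price": [1.0, 2.0]})
page2 = pd.DataFrame({"Coin": ["Tether"], "Price": [3.0]})

# A Python list is modified in place by append().
coins = ["Bitcoin"]
coins.append("Ethereum")

# pd.concat returns a new DataFrame and leaves its inputs untouched,
# so a call whose result is not assigned has no lasting effect.
pd.concat([page1, page2], ignore_index=True)  # result discarded, page1 unchanged

# Assigning the result keeps the combined frame.
combined = pd.concat([page1, page2], ignore_index=True)
```

After this runs, `page1` still has two rows, while `combined` has three rows with a fresh 0..2 index.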

Last but not least, your function lacks a return statement.

Fixing all of the above results in the function

import pandas as pd
import requests

def get_coingecko_data():
    page_numbers = [1, 2, 3, 4, 5]
    for n in page_numbers:
        r = requests.get(f"https://www.coingecko.com/?page={n}")
        # read_html returns a list of tables; the coin table is the first one
        df = pd.read_html(r.text)[0]
        df2 = df[["#", "Coin", "Price", "Mkt Cap"]]
        if n == 1:
            df_500 = df2.copy()
        else:
            # concat returns a new dataframe, so assign the result back
            df_500 = pd.concat([df_500, df2], ignore_index=True)
    return df_500
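
A common variation is to collect the per-page dataframes in a list and call pd.concat once at the end, which removes the `if n == 1` special case. The sketch below is only an illustration under some assumptions: `combine_pages`, `fetch_page` and `fake_page` are hypothetical names, and `fake_page` generates made-up rows so the example runs without network access.

```python
import pandas as pd

COLUMNS = ["#", "Coin", "Price", "Mkt Cap"]


def combine_pages(fetch_page, page_numbers):
    """Fetch each page as a DataFrame and stack them into one.

    fetch_page is a hypothetical callable: page number -> DataFrame.
    """
    frames = [fetch_page(n)[COLUMNS] for n in page_numbers]
    # A single concat at the end avoids re-copying the growing frame
    # on every loop iteration.
    return pd.concat(frames, ignore_index=True)


def fake_page(n, rows=100):
    """Stand-in for the real scrape so the sketch runs offline (made-up data)."""
    start = (n - 1) * rows + 1
    ranks = list(range(start, start + rows))
    return pd.DataFrame({
        "#": ranks,
        "Coin": [f"coin{i}" for i in ranks],
        "Price": [0.0] * rows,
        "Mkt Cap": [0.0] * rows,
        "Extra": ["unused"] * rows,  # a column that gets dropped
    })


df_500 = combine_pages(fake_page, range(1, 6))
print(df_500.shape)
```

For the real site, `fetch_page` could wrap the `requests.get` and `pd.read_html(r.text)[0]` calls from the function above.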