Python Beautiful Soup: get data from 2 URLs

Time:03-16

Good day. I am a student taking Python classes. We are currently learning Beautiful Soup, and I am having trouble extracting data from two tables, as you can see in the code below:

import pandas as pd
import requests

list_of_urls = ['https://tradingeconomics.com/albania/gdp-growth-annual',
                'https://tradingeconomics.com/south-africa/gdp-growth-annual']

final_df = pd.DataFrame()

for i in list_of_urls:
    table = pd.read_html(i, match='Related')
    for row in table:
        if row.loc['Related'] == 'GDP Annual Growth Rate':
            final_df.append(row)
        else:
            pass

CodePudding user response:

You don't need to use requests or bs4 yourself here; pd.read_html does the job.

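import pandas as pd

list_of_urls = ['https://tradingeconomics.com/albania/gdp-growth-annual',
                'https://tradingeconomics.com/south-africa/gdp-growth-annual']

# Collect one filtered table per country, keyed by the country slug in the URL
data = {}
for url in list_of_urls:
    country = url.split('/')[3]  # e.g. 'albania'
    # read_html returns a list of tables containing the text 'Related'; take the first
    df = pd.read_html(url, match='Related')[0]
    # Keep only the rows whose 'Related' column equals the indicator we want
    data[country] = df.loc[df['Related'] == 'GDP Annual Growth Rate']

# Stack the per-country frames; the dict keys become the outer index level
df = pd.concat(data)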

Output:

>>> df
                               Related  Last  Previous     Unit Reference
albania      1  GDP Annual Growth Rate  6.99     18.38  percent  Sep 2021
south-africa 1  GDP Annual Growth Rate  1.70      2.90  percent  Dec 2021
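
The filter-then-concat pattern used above can also be tried without any network access. The following is a minimal sketch, not part of the original answer: the `tables` dict is a hypothetical stand-in for what pd.read_html would return for each URL, and its numbers are made up purely for illustration. It also shows one possible way to turn the outer index level into an ordinary `country` column.

```python
import pandas as pd

# Hypothetical stand-ins for the tables pd.read_html would return per URL.
# The values are invented for illustration only.
tables = {
    "albania": pd.DataFrame({
        "Related": ["GDP Growth Rate", "GDP Annual Growth Rate"],
        "Last": [1.0, 2.0],
    }),
    "south-africa": pd.DataFrame({
        "Related": ["GDP Growth Rate", "GDP Annual Growth Rate"],
        "Last": [3.0, 4.0],
    }),
}

# Keep only the matching row of each table, using a boolean mask on the column.
data = {
    country: table.loc[table["Related"] == "GDP Annual Growth Rate"]
    for country, table in tables.items()
}

# Passing a dict to pd.concat makes its keys the outer level of a MultiIndex.
combined = pd.concat(data)

# Optionally flatten: drop the inner (original row) level and expose the
# country key as a regular column.
flat = combined.droplevel(1).rename_axis("country").reset_index()
print(flat)
```

Building the pieces first and calling pd.concat once avoids growing a DataFrame inside the loop, which is what the `final_df.append(row)` call in the question was trying to do. That call never modified `final_df` in place, and `DataFrame.append` was removed in pandas 2.0.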