Good day, I am a student taking Python classes. We are now learning about Beautiful Soup, and I am having trouble extracting data from two tables, as you will see in the code below:
import pandas as pd
import requests

list_of_urls = ['https://tradingeconomics.com/albania/gdp-growth-annual',
                'https://tradingeconomics.com/south-africa/gdp-growth-annual']
final_df = pd.DataFrame()
for i in list_of_urls:
    table = pd.read_html(i, match='Related')
    for row in table:
        if row.loc['Related'] == 'GDP Annual Growth Rate':
            final_df.append(row)
        else:
            pass
CodePudding user response:
You need neither requests nor bs4 here: pd.read_html does the job.
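import pandas as pd

list_of_urls = ['https://tradingeconomics.com/albania/gdp-growth-annual',
                'https://tradingeconomics.com/south-africa/gdp-growth-annual']
data = {}
for i in list_of_urls:
    # The country slug is the fourth piece of the URL, e.g. 'albania'
    country = i.split('/')[3]
    # read_html returns a list of tables; take the first one that contains 'Related'
    df = pd.read_html(i, match='Related')[0]
    # Keep only the row for the indicator of interest
    data[country] = df.loc[df['Related'] == 'GDP Annual Growth Rate']
# Concatenate the per-country frames; the dict keys become the outer index level
df = pd.concat(data)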
Output:
>>> df
                               Related  Last  Previous     Unit  Reference
albania      1  GDP Annual Growth Rate  6.99     18.38  percent   Sep 2021
south-africa 1  GDP Annual Growth Rate  1.70      2.90  percent   Dec 2021
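
If it helps to see the filter-and-concat step without hitting the website, here is a minimal offline sketch. The two small tables and the names `country-a` and `country-b` are made up for illustration and simply stand in for what pd.read_html would return. It also shows why the `final_df.append(row)` call in the question had no effect: `DataFrame.append` returned a new frame instead of modifying `final_df` in place, and it was removed in pandas 2.0, so collecting the pieces and calling `pd.concat` once is the usual replacement.

```python
import pandas as pd

# Made-up stand-ins for the tables pd.read_html would return (not real figures).
tables = {
    "country-a": pd.DataFrame({
        "Related": ["GDP Growth Rate", "GDP Annual Growth Rate"],
        "Last": [1.5, 3.0],
    }),
    "country-b": pd.DataFrame({
        "Related": ["GDP Growth Rate", "GDP Annual Growth Rate"],
        "Last": [0.5, 2.0],
    }),
}

# A boolean mask keeps only the matching row of each table.
selected = {
    name: df.loc[df["Related"] == "GDP Annual Growth Rate"]
    for name, df in tables.items()
}

# Passing a dict to pd.concat uses its keys as the outer index level.
result = pd.concat(selected)

# Look up one country by its outer index label.
print(result.loc["country-a"])
```

The inner index value (1 in this sketch) is just the row's position in its original table, which is also where the 1 in the output above comes from.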