I want to merge tables from multiple web pages using Pandas. I managed to create the table from one page, but I need to scrape the table from multiple pages.
I ran this code and it showed "in http_error_default raise HTTPError(req.full_url, code, msg, hdrs, fp) urllib.error.HTTPError: HTTP Error 400:"
Any help will be much appreciated, thanks!
import pandas as pd
import ssl
ssl._create_default_https_context = ssl._create_unverified_context
dfs = []
url = 'https://www.emporis.com/city/100422/singapore-singapore/status/all-buildings/{}'
for i in range(1,3):
    df = pd.read_html(url).format(i)
    dfs.append(df)
print(dfs[0].head(15))
CodePudding user response:
I am not sure about the error but I can get the pages as below:
url = 'https://www.emporis.com/city/100422/singapore-singapore/status/all-buildings/'
for i in range(1,3):
    df = pd.read_html(url + str(i))
if that helps...
CodePudding user response:
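str.format needs to be called on the url template, not on the return value of pd.read_html (which is a list of DataFrames, not a string). In your code, pd.read_html(url) also requests the URL with the literal {} still in it, which is probably what triggers the HTTP Error 400: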
url = 'https://www.emporis.com/city/100422/singapore-singapore/status/all-buildings/{}'
for i in range(1,3):
    df = pd.read_html(url.format(i))
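To actually merge the pages into one table, as the question asks, something like the following might work. It is only a sketch: collect_tables and its reader parameter are names made up for illustration, and it assumes the table you want is the first one that pd.read_html finds on each page (that may not hold for this site).

```python
import pandas as pd


def collect_tables(url_template, pages, reader=pd.read_html):
    """Read one table per page and stack them into a single DataFrame.

    `reader` is a hypothetical hook (defaulting to pd.read_html) so the
    function can be tried out without hitting the network.
    """
    frames = []
    for page in pages:
        page_url = url_template.format(page)  # fill in the page number first
        tables = reader(page_url)             # read_html returns a list of DataFrames
        frames.append(tables[0])              # assumption: the first table is the wanted one
    # Stack the per-page tables and renumber the rows 0..n-1
    return pd.concat(frames, ignore_index=True)
```

With the template from the question, merged = collect_tables(url, range(1, 3)) followed by print(merged.head(15)) would then print the first rows of the combined table.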