I want to read a table form Wikipedia:
import pandas as pd
caption="Edit section: 2019 inequality-adjusted HDI (IHDI) (2020 report)"
df = pd.read_html('https://en.wikipedia.org/wiki/List_of_countries_by_inequality-adjusted_Human_Development_Index',match=caption)
df
But I got this errore: "ValueError: No tables found matching pattern 'Edit section: 2019 inequality-adjusted HDI (IHDI) (2020 report)'"
This method worked for table like below table:
caption = "Average daily maximum and minimum temperatures for selected cities in Minnesota"
df = pd.read_html('https://en.wikipedia.org/wiki/Minnesota', match=caption)
df
But I get confused for this one, how can I solved this problem?
CodePudding user response:
You have multiple problems here.
pandas
doesn't support https
, and there's no such caption that you're looking for.
Try this:
import pandas as pd
import requests
caption = "Table of countries by IHDI"
df = pd.read_html(
requests.get("https://en.wikipedia.org/wiki/List_of_countries_by_inequality-adjusted_Human_Development_Index").text,
match=caption,
)
print(df[0].head())
Output:
Rank Country ... 2019 estimates (2020 report)[4][5][6]
Rank Country ... Overall loss (%) Growth since 2010
0 1 Norway ... 6.1 0.021
1 2 Iceland ... 5.8 0.055
2 3 Switzerland ... 6.9 0.015
3 4 Finland ... 5.3 0.040
4 5 Ireland ... 7.3 0.066
[5 rows x 6 columns]
CodePudding user response:
import pandas as pd
df = pd.read_html('https://en.wikipedia.org/wiki/List_of_countries_by_inequality-adjusted_Human_Development_Index')
df[2]
Or if you wish to use match
argument
import pandas as pd
caption="Table of countries by IHDI"
df = pd.read_html('https://en.wikipedia.org/wiki/List_of_countries_by_inequality-adjusted_Human_Development_Index',match=caption)
df[0]
Returns