I'm web scraping data and want to put it into a data frame for analysis.
I have a three-dimensional list that comes out of my scrape and I can't figure out how to get it into a data frame. I know I need to make it two-dimensional (249, 4) from the three-dimensional list (1, 249, 4).
import numpy as np
import pandas as pd
table_countryCodes = pd.read_html("https://www.iban.com/country-codes")
np.reshape(table_countryCodes, (249,4))
df_countryCodes = pd.DataFrame(table_countryCodes)
print(df_countryCodes)
Error: ValueError: Must pass 2-d input. shape=(1, 249, 4)
How can I fix this?
Here is a sample of the three-dimensional list from the web scrape for context:
Country         Alpha-2 code  Alpha-3 code  Numeric
American Samoa  AS            ASM           16
Andorra         AD            AND           20
Angola          AO            AGO           24
Anguilla        AI            AIA           660
CodePudding user response:
pd.read_html reads all HTML tables on a page into a list of DataFrame objects. Since the page in your use case has only one table, you can extract it using
df = table_countryCodes[0]
print(df)
which gives us
               Country Alpha-2 code Alpha-3 code  Numeric
0          Afghanistan           AF          AFG        4
1        Åland Islands           AX          ALA      248
2              Albania           AL          ALB        8
3              Algeria           DZ          DZA       12
4       American Samoa           AS          ASM       16
..                 ...          ...          ...      ...
244  Wallis and Futuna           WF          WLF      876
245     Western Sahara           EH          ESH      732
246              Yemen           YE          YEM      887
247             Zambia           ZM          ZMB      894
248           Zimbabwe           ZW          ZWE      716

[249 rows x 4 columns]
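To see where the shape in the error message comes from, here is a minimal sketch that runs without network access. It assumes a hand-built list holding one small DataFrame as a stand-in for what pd.read_html returns; the four rows are copied from the sample in the question, and the variable names outer_type, n_tables and shape_as_array are made up for illustration.

```python
import numpy as np
import pandas as pd

# Stand-in (assumption) for the return value of pd.read_html(...):
# a plain list holding one DataFrame. The real page has 249 rows.
table_countryCodes = [
    pd.DataFrame(
        {
            "Country": ["American Samoa", "Andorra", "Angola", "Anguilla"],
            "Alpha-2 code": ["AS", "AD", "AO", "AI"],
            "Alpha-3 code": ["ASM", "AND", "AGO", "AIA"],
            "Numeric": [16, 20, 24, 660],
        }
    )
]

# The outer object is a Python list with a single element.
outer_type = type(table_countryCodes).__name__
n_tables = len(table_countryCodes)

# Converting the whole list to an array adds a leading axis of length 1,
# which is the extra dimension reported in the ValueError.
shape_as_array = np.asarray(table_countryCodes).shape

# Indexing with [0] returns the DataFrame itself, which is already 2-d.
df_countryCodes = table_countryCodes[0]

print(outer_type, n_tables)
print(shape_as_array)
print(df_countryCodes.shape)
```

The array built from the whole list has three axes, while the element at index 0 has two, so no reshape is needed once the list is indexed.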
CodePudding user response:
You simply need:
pd.DataFrame(table_countryCodes[0])
i.e. add [0] to select the first and only element in table_countryCodes, which is already a DataFrame with the shape you need.
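Indexing with [0] relies on the page having exactly one table. If a page ever returns several, a small helper can pick the right one by its column names instead of its position. The sketch below is only an illustration: pick_table is a hypothetical helper, not part of pandas, and the two-element list is a made-up stand-in for a pd.read_html result.

```python
import pandas as pd


def pick_table(tables, required_columns):
    """Return the first DataFrame in `tables` that has all `required_columns`.

    Hypothetical helper for illustration; not a pandas function.
    """
    for table in tables:
        if set(required_columns).issubset(table.columns):
            return table
    raise ValueError(f"no table contains columns {list(required_columns)}")


# Made-up stand-in for a pd.read_html result that holds two tables.
tables = [
    pd.DataFrame({"Menu": ["Home", "About"]}),
    pd.DataFrame(
        {
            "Country": ["Andorra", "Angola"],
            "Alpha-2 code": ["AD", "AO"],
            "Alpha-3 code": ["AND", "AGO"],
            "Numeric": [20, 24],
        }
    ),
]

# Select the table by the columns it must contain, not by its index.
df_countryCodes = pick_table(tables, ["Country", "Numeric"])
print(df_countryCodes)
```

pd.read_html also accepts a match argument that keeps only tables containing the given text, which may be enough on its own for simple pages.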