I am trying unsuccessfully to rename a dataframe's columns. The column names are tuples which I am trying to convert to concatenated strings.
import requests
import pandas as pd
url = 'https://en.wikipedia.org/wiki/List_of_countries_by_length_of_coastline'
html = requests.get(url).content
df_list = pd.read_html(html)
coast_df1 = df_list[-2]
print(coast_df1.columns)
rename_dict = {(a, b): (a + '-' + b) for a, b in coast_df1.columns}
print(rename_dict)
coast_df2 = coast_df1.rename(columns = rename_dict)
print(coast_df2.columns)
The dict comprehension that builds rename_dict works as expected, but the rename has no effect: coast_df2 still has the original MultiIndex columns.
Output:
MultiIndex([( 'Country', 'Country'),
( 'The World Factbook[2]', 'Rank'),
( 'The World Factbook[2]', 'km'),
('World Resources Institute[1]', 'Rank'),
('World Resources Institute[1]', 'km'),
( 'Land area km2 (TWF)[3]', 'Land area km2 (TWF)[3]'),
( 'Coast/area ratio (m/km2)', '(TWF)'),
( 'Coast/area ratio (m/km2)', '(WRI)'),
( 'Coast/area0.5 ratio', '(TWF)')],
)
{('Country', 'Country'): 'Country-Country', ('The World Factbook[2]', 'Rank'): 'The World Factbook[2]-Rank', ('The World Factbook[2]', 'km'): 'The World Factbook[2]-km', ('World Resources Institute[1]', 'Rank'): 'World Resources Institute[1]-Rank', ('World Resources Institute[1]', 'km'): 'World Resources Institute[1]-km', ('Land area km2 (TWF)[3]', 'Land area km2 (TWF)[3]'): 'Land area km2 (TWF)[3]-Land area km2 (TWF)[3]', ('Coast/area ratio (m/km2)', '(TWF)'): 'Coast/area ratio (m/km2)-(TWF)', ('Coast/area ratio (m/km2)', '(WRI)'): 'Coast/area ratio (m/km2)-(WRI)', ('Coast/area0.5 ratio', '(TWF)'): 'Coast/area0.5 ratio-(TWF)'}
MultiIndex([( 'Country', 'Country'),
( 'The World Factbook[2]', 'Rank'),
( 'The World Factbook[2]', 'km'),
('World Resources Institute[1]', 'Rank'),
('World Resources Institute[1]', 'km'),
( 'Land area km2 (TWF)[3]', 'Land area km2 (TWF)[3]'),
( 'Coast/area ratio (m/km2)', '(TWF)'),
( 'Coast/area ratio (m/km2)', '(WRI)'),
( 'Coast/area0.5 ratio', '(TWF)')],)
CodePudding user response:
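See Rename MultiIndex columns in Pandas. When the columns are a MultiIndex, rename applies the mapping to the individual level labels rather than to whole tuples, so the tuple keys in rename_dict never match and the columns come back unchanged. For coast_df2 = coast_df1.rename(columns = rename_dict) to work as desired, first replace the MultiIndex with a flat index of tuples by assigning the column values back to the columns (note that this modifies coast_df1 in place). Try: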
coast_df1.columns = coast_df1.columns.values
coast_df2 = coast_df1.rename(columns = rename_dict)
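To see why the original call does nothing, here is a minimal sketch on a small made-up frame instead of the Wikipedia table (the column labels and the single data row are illustrative assumptions). It uses to_flat_index() in place of the .values assignment; both give a flat index of tuples, and working on a copy leaves the original frame untouched.

```python
import pandas as pd

# Hypothetical stand-in for the scraped table: two-level column labels.
cols = pd.MultiIndex.from_tuples(
    [("Country", "Country"), ("Factbook", "Rank"), ("Factbook", "km")]
)
df = pd.DataFrame([["Canada", 1, 202080]], columns=cols)

rename_dict = {(a, b): a + "-" + b for a, b in df.columns}

# With MultiIndex columns, the mapping is looked up per level label,
# so the tuple keys match nothing and the columns stay as they were.
unchanged = df.rename(columns=rename_dict)
print(unchanged.columns.equals(df.columns))

# Flatten to a plain Index of tuples; each whole tuple is now one label.
flat = df.copy()
flat.columns = flat.columns.to_flat_index()
renamed = flat.rename(columns=rename_dict)
print(list(renamed.columns))
```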
Alternatively, build a list of the concatenated strings instead of a dictionary, and assign that list to the columns of a copy of the df:
rename_list = ['-'.join(x) for x in coast_df1.columns]
coast_df2 = coast_df1.copy()
coast_df2.columns = rename_list
CodePudding user response:
coast_df2 = coast_df1.copy()
coast_df2.columns = coast_df2.columns.map('-'.join)
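A variation on the map approach, sketched on a made-up frame (the labels and the helper name join_levels are assumptions for illustration, not part of the original answer). If labels such as 'Country-Country' are unwanted, the function passed to map can drop repeated level labels before joining:

```python
import pandas as pd

# Hypothetical stand-in for the scraped table: two-level column labels.
cols = pd.MultiIndex.from_tuples(
    [("Country", "Country"), ("Factbook", "Rank"), ("Factbook", "km")]
)
df = pd.DataFrame([["Canada", 1, 202080]], columns=cols)


def join_levels(labels):
    # Hypothetical helper: remove repeated level labels (keeping order),
    # then join what remains with '-'.
    unique = list(dict.fromkeys(labels))
    return "-".join(unique)


out = df.copy()
out.columns = out.columns.map(join_levels)
print(list(out.columns))  # ['Country', 'Factbook-Rank', 'Factbook-km']
```

Passing '-'.join directly, as above, keeps the repeated label; the helper only changes how each tuple is turned into a string.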