Pandas MultiIndex Columns fails to rename

Time:06-11

I am trying unsuccessfully to rename a dataframe's columns. The column names are tuples which I am trying to convert to concatenated strings.

import requests
import pandas as pd

url = 'https://en.wikipedia.org/wiki/List_of_countries_by_length_of_coastline'
html = requests.get(url).content
df_list = pd.read_html(html)
coast_df1 = df_list[-2]
print(coast_df1.columns)

rename_dict = {(a, b): a + '-' + b for a, b in coast_df1.columns}
print(rename_dict)

coast_df2 = coast_df1.rename(columns = rename_dict)
print(coast_df2.columns)

The dict comprehension that builds rename_dict works as expected, but the rename has no effect: the columns of coast_df2 are still the original MultiIndex.

Output:

MultiIndex([(                     'Country',                'Country'),
            (       'The World Factbook[2]',                   'Rank'),
            (       'The World Factbook[2]',                     'km'),
            ('World Resources Institute[1]',                   'Rank'),
            ('World Resources Institute[1]',                     'km'),
            (      'Land area km2 (TWF)[3]', 'Land area km2 (TWF)[3]'),
            (    'Coast/area ratio (m/km2)',                  '(TWF)'),
            (    'Coast/area ratio (m/km2)',                  '(WRI)'),
            (         'Coast/area0.5 ratio',                  '(TWF)')],
           )

{('Country', 'Country'): 'Country-Country', ('The World Factbook[2]', 'Rank'): 'The World Factbook[2]-Rank', ('The World Factbook[2]', 'km'): 'The World Factbook[2]-km', ('World Resources Institute[1]', 'Rank'): 'World Resources Institute[1]-Rank', ('World Resources Institute[1]', 'km'): 'World Resources Institute[1]-km', ('Land area km2 (TWF)[3]', 'Land area km2 (TWF)[3]'): 'Land area km2 (TWF)[3]-Land area km2 (TWF)[3]', ('Coast/area ratio (m/km2)', '(TWF)'): 'Coast/area ratio (m/km2)-(TWF)', ('Coast/area ratio (m/km2)', '(WRI)'): 'Coast/area ratio (m/km2)-(WRI)', ('Coast/area0.5 ratio', '(TWF)'): 'Coast/area0.5 ratio-(TWF)'}

MultiIndex([(                     'Country',                'Country'),
            (       'The World Factbook[2]',                   'Rank'),
            (       'The World Factbook[2]',                     'km'),
            ('World Resources Institute[1]',                   'Rank'),
            ('World Resources Institute[1]',                     'km'),
            (      'Land area km2 (TWF)[3]', 'Land area km2 (TWF)[3]'),
            (    'Coast/area ratio (m/km2)',                  '(TWF)'),
            (    'Coast/area ratio (m/km2)',                  '(WRI)'),
            (         'Coast/area0.5 ratio',                  '(TWF)')],)

CodePudding user response:

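See Rename MultiIndex columns in Pandas. coast_df1.rename(columns = rename_dict) has no effect because, on MultiIndex columns, rename looks up each level's label separately (for example 'Country' or 'Rank'), so the tuple keys in rename_dict never match. For the rename to work as desired, first replace the MultiIndex with a flat index of tuples by assigning the columns' values back to the columns (note that this modifies coast_df1 in place). Try: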

coast_df1.columns = coast_df1.columns.values
coast_df2 = coast_df1.rename(columns = rename_dict)
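
The difference between the two cases can be reproduced without downloading the Wikipedia table. The following is a minimal sketch, assuming a small made-up frame (the column labels and the single data row are hypothetical stand-ins, not the real table):

```python
import pandas as pd

# Hypothetical two-level columns standing in for the Wikipedia table.
cols = pd.MultiIndex.from_tuples(
    [("Country", "Country"), ("Factbook", "Rank"), ("Factbook", "km")]
)
df = pd.DataFrame([["Canada", 1, 202080]], columns=cols)

rename_dict = {(a, b): a + "-" + b for a, b in df.columns}

# With MultiIndex columns the tuple keys are never matched,
# so the result still has the original MultiIndex.
unchanged = df.rename(columns=rename_dict)
print(type(unchanged.columns).__name__)

# Flatten to a plain Index whose elements are tuples; now the keys match.
flat = df.copy()
flat.columns = flat.columns.values
renamed = flat.rename(columns=rename_dict)
print(list(renamed.columns))
```

Working on a copy (flat) rather than on df itself keeps the original MultiIndex available, which the two-line version above does not.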

Alternatively, you can build a list of the concatenated strings instead of a dictionary and use that list to overwrite the columns of a copy of the df:

rename_list = ['-'.join(x) for x in coast_df1.columns]
coast_df2 = coast_df1.copy()
coast_df2.columns = rename_list

CodePudding user response:

From here:

coast_df2 = coast_df1.copy()
coast_df2.columns = coast_df2.columns.map('-'.join)
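
As a quick check that this one-liner and the list-based approach give the same flat names, here is a small sketch on a made-up frame (the labels and values are hypothetical, chosen only to mirror the shape of the real columns):

```python
import pandas as pd

# Hypothetical two-level columns; not the real Wikipedia data.
cols = pd.MultiIndex.from_tuples(
    [("Country", "Country"), ("Factbook", "Rank"), ("Factbook", "km")]
)
df1 = pd.DataFrame([["Canada", 1, 202080]], columns=cols)

# Approach 1: build the names with a list comprehension.
names_from_list = ["-".join(t) for t in df1.columns]

# Approach 2: map the join over the MultiIndex directly.
df2 = df1.copy()
df2.columns = df2.columns.map("-".join)

print(list(df2.columns))
print(names_from_list == list(df2.columns))
```

Both approaches assume every level label is a string; "-".join raises a TypeError if a level contains non-string labels such as integers.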