How to change the index of a Pandas DataFrame

Time:04-22

In the last line of this code I want to set the index to 'Country', but when I look at the columns of the data frame, it is still called 'index'. I have tried it without inplace, assigning the result to a new df, and with the option drop=True, but that doesn't work.

import pandas as pd
import numpy as np
Energy = pd.read_excel('./assets/Energy Indicators.xls', header=None, footer=None, usecols=range(2,6))
Energy = Energy[18:245].reset_index()
Energy.rename(columns={2 : 'Country', 3 :'Energy Supply', 4 : 'Energy Supply per Capita', 5 :  '% Renewable'}, inplace=True)
Energy.replace('...', np.nan, inplace=True)
Energy.replace(["Republic of Korea", "United States of America", "United Kingdom of Great Britain and Northern Ireland", "China, Hong Kong Special Administrative Region"],["South Korea", "United States", "United Kingdom", "Hong Kong"], inplace = True)
Energy['Country'] = Energy['Country'].str.replace(r"\(.*\)", "", regex=True)
Energy['Country'] = Energy['Country'].str.replace(r'\d ', '', regex=True)
Energy['Energy Supply'] = Energy['Energy Supply'].apply(lambda x : x * 1000000)
Energy.set_index('Country', inplace=True)

print(Energy.index)
print(Energy.columns.values)

The output is:

Index(['Afghanistan', 'Albania', 'Algeria', 'American Samoa', 'Andorra',
       'Angola', 'Anguilla', 'Antigua and Barbuda', 'Argentina', 'Armenia',
       ...
       'United States Virgin Islands', 'Uruguay', 'Uzbekistan', 'Vanuatu',
       'Venezuela ', 'Viet Nam', 'Wallis and Futuna Islands', 'Yemen',
       'Zambia', 'Zimbabwe'],
      dtype='object', name='Country', length=227)
['index' 'Energy Supply' 'Energy Supply per Capita' '% Renewable']

How do you set the index?

CodePudding user response:

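The 'index' you see in your columns is not your index; it is a column left over from when you did Energy = Energy[18:245].reset_index(). By default, reset_index() keeps the old index as a new column named 'index'; pass drop=True to that call to discard it instead.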
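To make the difference concrete, here is a minimal sketch. The small `raw` frame is a hypothetical stand-in for the sliced spreadsheet (integer column labels, as you get with header=None); it is not the asker's actual data.

```python
import pandas as pd

# Hypothetical stand-in for Energy[18:245]: integer column labels and
# row labels that start at 18 because of the slice.
raw = pd.DataFrame(
    {2: ["Afghanistan", "Albania", "Algeria"], 3: [321, 102, 1959]},
    index=[18, 19, 20],
)

# Default behaviour: the old row labels are kept as a column named 'index'.
kept = raw.reset_index()

# With drop=True the old row labels are discarded.
dropped = raw.reset_index(drop=True)

print(kept.columns.tolist())     # ['index', 2, 3]
print(dropped.columns.tolist())  # [2, 3]
```

Note that drop=True belongs on reset_index(), not on set_index(); on set_index() it only controls whether the column used as the new index is removed from the columns.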

CodePudding user response:

You have done it right!

When you did Energy.set_index('Country', inplace=True), it did work! That's why printing Energy.index gave you the countries as the result. Index is a class within pandas.

The name='Country' in the output of print(Energy.index) also confirms that the index is set to the countries.

The next output, print(Energy.columns.values), shows a column named 'index' because you called reset_index() earlier, which moved the old row labels into that column. Hope this helps!
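If the frame has already been built, the leftover column can also be removed after the fact. The following is a sketch on a hypothetical two-row miniature of the Energy frame (the values are made up for illustration):

```python
import pandas as pd

# Hypothetical miniature of the Energy frame after reset_index() and rename().
energy = pd.DataFrame(
    {
        "index": [18, 19],
        "Country": ["Afghanistan", "Albania"],
        "Energy Supply": [321, 102],
    }
)

# set_index works as expected: 'Country' becomes the index.
energy = energy.set_index("Country")
print(energy.index.name)        # Country
print(energy.columns.tolist())  # ['index', 'Energy Supply']

# Remove the leftover column created by the earlier reset_index().
energy = energy.drop(columns="index")
print(energy.columns.tolist())  # ['Energy Supply']
```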
