I am working with a pandas DataFrame with multi-index columns (two levels). I need to drop a column from level 0 and later get a list of the remaining columns in level 0. Strangely, the dropping part works fine, but somehow the dropped column shows back up if you call df.columns.levels[0].
Here's an MRE. When I call df.columns, the result is this:
MultiIndex([('Week2', 'Hours'), ('Week2', 'Sales')], )
Which sure looks like Week1 is gone. But if I call df.columns.levels[0].tolist(), I get:
['Week1', 'Week2']
Here's the full code:
import pandas as pd
import numpy as np
n = ['Mickey', 'Minnie', 'Snow White', 'Donald',
     'Goofy', 'Elsa', 'Pluto', 'Daisy', 'Mad Hatter']
df1 = pd.DataFrame(data={'Hours': [32, 30, 34, 33, 22, 12, 19, 17, 9],
                         'Sales': [10, 15, 12, 15, 6, 11, 9, 7, 4]},
                   index=n)
df2 = pd.DataFrame(data={'Hours': [40, 33, 29, 31, 17, 22, 13, 16, 12],
                         'Sales': [12, 14, 8, 16, 3, 12, 5, 6, 4]},
                   index=n)
df1.columns = pd.MultiIndex.from_product([['Week1'], df1.columns])
df2.columns = pd.MultiIndex.from_product([['Week2'], df2.columns])
df = pd.concat([df1, df2], axis=1)
# I want to remove every column under 'Week1' in level 0
df = df.drop('Week1', axis=1, level=0)
print(df.columns)  # Seems successful after this print, only Week2 is left
print(df.columns.levels[0].tolist())  # This list still includes Week1!
CodePudding user response:
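Use remove_unused_levels. drop only removes the columns; it leaves the MultiIndex's levels untouched, so 'Week1' stays in levels[0] even though no column uses it anymore.
From the documentation: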
Unused level(s) means levels that are not expressed in the labels. The resulting MultiIndex will have the same outward appearance, meaning the same .values and ordering. It will also be .equals() to the original.
df.columns = df.columns.remove_unused_levels()
print(df.columns.levels)
# Output
[['Week2'], ['Hours', 'Sales']]
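If all you need is the list of level-0 labels still in use, a possible alternative is to read them with get_level_values instead of levels. The sketch below is only an illustration: it uses a small stand-in frame rather than the full data from the question, and the variable names (stale, in_use, cleaned) are made up for this example.

```python
import pandas as pd

# Small stand-in frame with two-level columns (values are illustrative only).
cols = pd.MultiIndex.from_product([['Week1', 'Week2'], ['Hours', 'Sales']])
df = pd.DataFrame([[32, 10, 40, 12], [30, 15, 33, 14]],
                  index=['Mickey', 'Minnie'], columns=cols)
df = df.drop('Week1', axis=1, level=0)

# levels is the stored set of possible labels; drop does not prune it.
stale = df.columns.levels[0].tolist()

# get_level_values returns the label of each remaining column,
# so only labels that are actually in use can appear here.
in_use = df.columns.get_level_values(0).unique().tolist()

# remove_unused_levels returns a new MultiIndex with pruned levels.
cleaned = df.columns.remove_unused_levels().levels[0].tolist()

print(stale)
print(in_use)   # ['Week2']
print(cleaned)  # ['Week2']
```

get_level_values leaves df.columns unchanged, which may be preferable when the index is shared with other objects. remove_unused_levels is the better fit when later code reads df.columns.levels directly.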