Convert pandas MultiIndex columns to uppercase

Time:09-10

I would like to convert the column names of a pandas MultiIndex to uppercase. With a normal (single-level) index, I would do something like

df.columns = [c.upper() for c in df.columns]

When this is done on a DataFrame with a pd.MultiIndex, I get the following error:

AttributeError: 'tuple' object has no attribute 'upper'

How would I apply the same logic to a pandas multi index? Example code is below.

import pandas as pd
import numpy as np

arrays = [
    ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
    ["one", "two", "one", "two", "one", "two", "one", "two"],
]

tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=["first", "second"])
df = pd.DataFrame(np.random.randn(3, 8), index=["A", "B", "C"], columns=index)

arrays_upper = [
    ["BAR", "BAR", "BAZ", "BAZ", "FOO", "FOO", "QUX", "QUX"],
    ["ONE", "TWO", "ONE", "TWO", "ONE", "TWO", "ONE", "TWO"],
]

tuples_upper = list(zip(*arrays_upper))
index_upper = pd.MultiIndex.from_tuples(tuples_upper, names=['first', 'second'])
df_upper = pd.DataFrame(np.random.randn(3, 8), index=["A", "B", "C"], columns=index_upper)

print(f'Have: {df.columns}')
print(f'Want: {df_upper.columns}')

CodePudding user response:

You can convert the MultiIndex to a DataFrame, uppercase its values, then convert it back to a MultiIndex. Note that DataFrame.applymap is deprecated since pandas 2.1 in favor of DataFrame.map.

df.columns = pd.MultiIndex.from_frame(df.columns.to_frame().applymap(str.upper))
print(df)

first        BAR                 BAZ                 FOO                 QUX
second       ONE       TWO       ONE       TWO       ONE       TWO       ONE       TWO
A      -0.374874  0.049597 -1.930723 -0.279234  0.235430  0.351351 -0.263074 -0.068096
B       0.040872  0.969948 -0.048848 -0.610735 -0.949685  0.336952 -0.012458 -0.258237
C       0.932494 -1.655863  0.900461  0.403524 -0.123720  0.207627 -0.372031 -0.049706
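The same round trip can be sketched without applymap, which may help on newer pandas versions. This is only a minimal sketch: the small four-column frame and the `labels` variable are made up for illustration and are not part of the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame with lowercase two-level columns.
cols = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.zeros((3, 4)), index=["A", "B", "C"], columns=cols)

# One DataFrame column per index level; index=False gives a plain RangeIndex.
labels = df.columns.to_frame(index=False)

# Uppercase each level column with the vectorized string accessor.
labels = labels.apply(lambda s: s.str.upper())

# The frame's column names ("first", "second") become the level names again.
df.columns = pd.MultiIndex.from_frame(labels)
print(df.columns)
```

Because from_frame takes its level names from the frame's column names, the names "first" and "second" should survive the round trip.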

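Or follow your loop idea and uppercase every label in each column tuple. Pass names= so the level names are not dropped:

df.columns = pd.MultiIndex.from_tuples([tuple(map(str.upper, c)) for c in df.columns], names=df.columns.names)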
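To see what happens to the level names with the loop approach, here is a small sketch. The frame and the variable names (`upper_tuples`, `without_names`, `with_names`) are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame with lowercase two-level columns.
cols = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.zeros((3, 4)), index=["A", "B", "C"], columns=cols)

# Iterating a MultiIndex yields one tuple per column, so uppercase each label.
upper_tuples = [tuple(label.upper() for label in col) for col in df.columns]

# from_tuples does not know the old level names unless they are passed in.
without_names = pd.MultiIndex.from_tuples(upper_tuples)
with_names = pd.MultiIndex.from_tuples(upper_tuples, names=df.columns.names)

print(list(without_names.names))  # [None, None]
print(list(with_names.names))     # ['first', 'second']

df.columns = with_names
```

If the level names matter later on (for example with stack or xs by level name), the names= variant is probably the safer choice.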

CodePudding user response:

Use set_levels, which only touches the unique labels of each level:

df.columns = df.columns.set_levels([level.str.upper() for level in df.columns.levels])
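A quick way to check the set_levels approach is a minimal sketch like the one below. The small frame is hypothetical; only the set_levels line is taken from the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame with lowercase two-level columns.
cols = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.arange(12).reshape(3, 4), index=["A", "B", "C"], columns=cols)
values_before = df.to_numpy().copy()

# Each level is a plain Index of unique labels, so .str.upper() works on it.
new_levels = [level.str.upper() for level in df.columns.levels]
df.columns = df.columns.set_levels(new_levels)

print(df.columns)
# Only the labels change; the data underneath stays the same.
print((df.to_numpy() == values_before).all())
```

Since set_levels works on the unique labels of each level rather than on every column tuple, it keeps the level names and does less string work on wide frames.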