I would like to convert the column labels of a pandas MultiIndex to uppercase. With a normal (single-level) index, I would do something like
df.columns = [c.upper() for c in df.columns]
When this is done on a DataFrame with a pd.MultiIndex, I get the following error:
AttributeError: 'tuple' object has no attribute 'upper'
How would I apply the same logic to a pandas multi index? Example code is below.
import pandas as pd
import numpy as np
arrays = [
["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
["one", "two", "one", "two", "one", "two", "one", "two"],
]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=["first", "second"])
df = pd.DataFrame(np.random.randn(3, 8), index=["A", "B", "C"], columns=index)
arrays_upper = [
["BAR", "BAR", "BAZ", "BAZ", "FOO", "FOO", "QUX", "QUX"],
["ONE", "TWO", "ONE", "TWO", "ONE", "TWO", "ONE", "TWO"],
]
tuples_upper = list(zip(*arrays_upper))
index_upper = pd.MultiIndex.from_tuples(tuples_upper, names=['first', 'second'])
df_upper = pd.DataFrame(np.random.randn(3, 8), index=["A", "B", "C"], columns=index_upper)
print(f'Have: {df.columns}')
print(f'Want: {df_upper.columns}')
CodePudding user response:
You can convert the MultiIndex to a DataFrame, uppercase its values, and then convert it back to a MultiIndex (note that applymap is deprecated since pandas 2.1 in favour of DataFrame.map):
df.columns = pd.MultiIndex.from_frame(df.columns.to_frame().applymap(str.upper))
print(df)
first         BAR                 BAZ                 FOO                 QUX
second       ONE       TWO       ONE       TWO       ONE       TWO       ONE       TWO
A      -0.374874  0.049597 -1.930723 -0.279234  0.235430  0.351351 -0.263074 -0.068096
B       0.040872  0.969948 -0.048848 -0.610735 -0.949685  0.336952 -0.012458 -0.258237
C       0.932494 -1.655863  0.900461  0.403524 -0.123720  0.207627 -0.372031 -0.049706
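Or follow your loop idea. Each column label is a tuple, so uppercase every element of it, and pass names so the level names ("first", "second") are not dropped:
df.columns = pd.MultiIndex.from_tuples([tuple(map(str.upper, c)) for c in df.columns], names=df.columns.names)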
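To make the cause of the AttributeError concrete, here is a minimal sketch of the loop approach on a smaller frame. The three-column frame and the variable name first_label are made up for illustration; the only assumption is that every label is a string.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (not the one from the question)
cols = pd.MultiIndex.from_tuples(
    [("bar", "one"), ("bar", "two"), ("baz", "one")],
    names=["first", "second"],
)
df = pd.DataFrame(np.zeros((2, 3)), columns=cols)

# Iterating a MultiIndex yields tuples, which is why c.upper() fails
first_label = list(df.columns)[0]
print(first_label)  # ('bar', 'one')

# Uppercase each element of each tuple; pass names explicitly,
# because from_tuples cannot infer the original level names
df.columns = pd.MultiIndex.from_tuples(
    [tuple(s.upper() for s in c) for c in df.columns],
    names=df.columns.names,
)
print(df.columns.tolist())
```

If some labels are not strings (numbers, None), s.upper() would fail in the same way, so a guard such as isinstance(s, str) would be needed in that case.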
CodePudding user response:
Use set_levels:
df.columns = df.columns.set_levels([level.str.upper() for level in df.columns.levels])
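A short sketch of why this works, assuming all labels are strings: set_levels only rewrites the unique values stored for each level and leaves the column-to-level mapping untouched. For comparison, the sketch also tries df.rename(columns=str.upper), which, as far as I can tell, applies the function to the labels on every level. The small frame and the names via_set_levels, via_rename and expected are made up for illustration.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (not the one from the question)
cols = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df = pd.DataFrame(np.zeros((2, 4)), columns=cols)

# set_levels replaces the unique values of each level in one go
upper_levels = [level.str.upper() for level in df.columns.levels]
via_set_levels = df.columns.set_levels(upper_levels)

# rename with a callable is applied to the labels of every level
via_rename = df.rename(columns=str.upper).columns

# Index built by hand to compare against
expected = pd.MultiIndex.from_product(
    [["BAR", "BAZ"], ["ONE", "TWO"]], names=["first", "second"]
)
print(via_set_levels.equals(expected), via_rename.equals(expected))
print(list(via_set_levels.names), list(via_rename.names))
```

One thing to watch with set_levels: if uppercasing makes two values of the same level collide (say "bar" and "Bar"), the new level is no longer unique and pandas will most likely reject it, so the tuple-based approach is the safer choice there.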