How to Add another level of column to an existing multi-level column

Time:11-30

I have a data frame that looks like this:

   x   
   A  B
0  0  1
1  2  3
2  4  5
3  6  7
4  8  9

When I try to add another level to the multi-level columns using the following code:

x.columns = pd.MultiIndex.from_product([['D'], x.columns])

it gives me the following error:

Traceback (most recent call last):
  File "C:\Users\adel.moustafa\DashBoard\main.py", line 262, in <module>
    calculate_yield()
  File "C:\Users\adel.moustafa\DashBoard\main.py", line 204, in calculate_yield
    Analyzer.yield_analyzer_by(yield_data, all_data_df, df_info['P/F Criteria'], 'batch')
  File "C:\Users\adel.moustafa\DashBoard\Modules\Analyzer.py", line 163, in yield_analyzer_by
    x.columns = pd.MultiIndex.from_product([['D'], x.columns])
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\multi.py", line 621, in from_product
    codes, levels = factorize_from_iterables(iterables)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\arrays\categorical.py", line 2881, in factorize_from_iterables
    codes, categories = zip(*(factorize_from_iterable(it) for it in iterables))
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\arrays\categorical.py", line 2881, in <genexpr>
    codes, categories = zip(*(factorize_from_iterable(it) for it in iterables))
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\arrays\categorical.py", line 2854, in factorize_from_iterable
    cat = Categorical(values, ordered=False)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\arrays\categorical.py", line 451, in __init__
    dtype = CategoricalDtype(categories, dtype.ordered)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\dtypes\dtypes.py", line 183, in __init__
    self._finalize(categories, ordered, fastpath=False)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\dtypes\dtypes.py", line 337, in _finalize
    categories = self.validate_categories(categories, fastpath=fastpath)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\dtypes\dtypes.py", line 530, in validate_categories
    if categories.hasnans:
  File "pandas\_libs\properties.pyx", line 37, in pandas._libs.properties.CachedProperty.__get__
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 2681, in hasnans
    return bool(self._isnan.any())
  File "pandas\_libs\properties.pyx", line 37, in pandas._libs.properties.CachedProperty.__get__
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 2666, in _isnan
    return isna(self)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\dtypes\missing.py", line 144, in isna
    return _isna(obj)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\dtypes\missing.py", line 169, in _isna
    raise NotImplementedError("isna is not defined for MultiIndex")
NotImplementedError: isna is not defined for MultiIndex

I have checked that there are no NaN values in my columns or in the data. I have also looked at this post, this post, and finally this one, but with no results.

Here is reproducible code:

import pandas as pd
import numpy as np
x = pd.DataFrame(np.arange(10).reshape(5, 2), columns=pd.MultiIndex.from_product([['x'], ['A', 'B']]))
x.columns = pd.MultiIndex.from_product([['D'], x.columns])

Can anyone point out what is wrong and how to fix it?

CodePudding user response:

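MultiIndex.from_product expects a list of flat iterables, one per level. Passing x.columns, which is itself a MultiIndex, makes pandas try to factorize the whole MultiIndex as a single level, and that is what raises the error in your traceback. You need to do this:

x.columns = pd.MultiIndex.from_product([['D'], *x.columns.levels])

where x.columns.levels gives you a FrozenList holding the values of each level of the MultiIndex.

The * then unpacks that list so that from_product receives one list per level rather than a nested list.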
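Note that rebuilding the columns from their levels only reproduces the original labels when the existing columns are the full Cartesian product of those levels, in level order. As a hedged alternative sketch (not part of the original answer), the new level can be prepended to each existing column tuple, or added with pd.concat and a dict key. The frame w and its columns below are hypothetical, made up only to illustrate columns that are not a full product:

```python
import numpy as np
import pandas as pd

# Same frame as in the question.
x = pd.DataFrame(
    np.arange(10).reshape(5, 2),
    columns=pd.MultiIndex.from_product([['x'], ['A', 'B']]),
)

# Option 1: prepend 'D' to every existing column tuple.
y = x.copy()
y.columns = pd.MultiIndex.from_tuples([('D', *col) for col in x.columns])

# Option 2: concatenate along the columns with a dict key,
# which becomes the new outermost level.
z = pd.concat({'D': x}, axis=1)

# Hypothetical frame whose columns are NOT a full product of their levels;
# prepending to the tuples keeps exactly the two existing columns.
w = pd.DataFrame(
    np.zeros((2, 2)),
    columns=pd.MultiIndex.from_tuples([('x', 'B'), ('y', 'A')]),
)
w.columns = pd.MultiIndex.from_tuples([('D', *col) for col in w.columns])
```

Both options leave the data untouched and only change the column labels; which one reads better is mostly a matter of taste.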

CodePudding user response:

You can denote a multi-level column with a tuple. For instance:

x[('y', 'C')] = x[('x', 'A')] + 1
x[('x', 'D')] = 0

>>> x
   x     y  x
   A  B  C  D
0  0  1  1  0
1  2  3  3  0
2  4  5  5  0
3  6  7  7  0
4  8  9  9  0

And, of course, you can sort the columns:

x = x.sort_index(axis=1)

>>> x
   x        y
   A  B  D  C
0  0  1  0  1
1  2  3  0  3
2  4  5  0  5
3  6  7  0  7
4  8  9  0  9