"Key length (3) exceeds index depth (2)" with multi-index drop using from_product()

Time:08-17

                                close  volume
tickerId  timestamp                          
TSLA      2022-08-16 16:06:36  931.50   300.0
          2022-08-16 16:06:37  931.69  1000.0
          2022-08-16 16:06:38  931.69   200.0
          2022-08-16 16:06:39  931.69   200.0
AAPL      2022-08-16 16:06:37  173.17   100.0

Let's say I want to remove the 16:06:36, 16:06:37 and 16:06:38 rows for TSLA, but not for AAPL.

This is the output of print(timestamp_list):

    [Timestamp('2022-08-16 16:06:36'), Timestamp('2022-08-16 16:06:37'),
     Timestamp('2022-08-16 16:06:38')]

When I do this:

multiindex = pd.MultiIndex.from_product([['TSLA'], timestamp_list])

df.drop(pd.MultiIndex.from_product(multiindex),axis=0, inplace=True)

I get the error:

  File "/usr/local/lib/python3.9/site-packages/pandas/core/frame.py", line 4901, in drop
    return super().drop(
  File "/usr/local/lib/python3.9/site-packages/pandas/core/generic.py", line 4150, in drop
    obj = obj._drop_axis(labels, axis, level=level, errors=errors)
  File "/usr/local/lib/python3.9/site-packages/pandas/core/generic.py", line 4185, in _drop_axis
    new_axis = axis.drop(labels, errors=errors)
  File "/usr/local/lib/python3.9/site-packages/pandas/core/indexes/multi.py", line 2201, in drop
    loc = self.get_loc(level_codes)
  File "/usr/local/lib/python3.9/site-packages/pandas/core/indexes/multi.py", line 2927, in get_loc
    raise KeyError(
KeyError: 'Key length (3) exceeds index depth (2)'

But when I modify the dataframe itself, by replacing the timestamp indexes with a non-datetime type, such as integers, the issue disappears.

It seems there is a conflict between the length of the MultiIndex keys (3) and the number of index levels (2).

What did I do wrong?

CodePudding user response:

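This works just fine if you pass the index to drop directly. multiindex is already a MultiIndex, so you don't need to call from_product on it a second time. The second call treats each of its three (ticker, timestamp) tuples as a separate level, so the resulting keys have three parts and cannot be looked up in a two-level index.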

multiindex = pd.MultiIndex.from_product([['TSLA'], timestamp_list])

df = df.drop(multiindex)
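
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the frame from the question and applies the drop. The way the frame is constructed, the index names and the variable names to_drop, nested and result are my own assumptions for illustration; only the values come from the question.

```python
import pandas as pd

# Rebuild the example frame from the question (construction is assumed).
index = pd.MultiIndex.from_tuples(
    [
        ("TSLA", pd.Timestamp("2022-08-16 16:06:36")),
        ("TSLA", pd.Timestamp("2022-08-16 16:06:37")),
        ("TSLA", pd.Timestamp("2022-08-16 16:06:38")),
        ("TSLA", pd.Timestamp("2022-08-16 16:06:39")),
        ("AAPL", pd.Timestamp("2022-08-16 16:06:37")),
    ],
    names=["tickerId", "timestamp"],
)
df = pd.DataFrame(
    {
        "close": [931.50, 931.69, 931.69, 931.69, 173.17],
        "volume": [300.0, 1000.0, 200.0, 200.0, 100.0],
    },
    index=index,
)

timestamp_list = [
    pd.Timestamp("2022-08-16 16:06:36"),
    pd.Timestamp("2022-08-16 16:06:37"),
    pd.Timestamp("2022-08-16 16:06:38"),
]

# One call to from_product gives two-level keys: (ticker, timestamp).
to_drop = pd.MultiIndex.from_product([["TSLA"], timestamp_list])
print(to_drop.nlevels)

# A second call uses each of the three tuples as its own level,
# which is where the three-part keys in the error come from.
nested = pd.MultiIndex.from_product(to_drop)
print(nested.nlevels)

# Dropping with the two-level index removes only the TSLA rows listed.
result = df.drop(to_drop)
print(result)
```

The AAPL row at 16:06:37 is kept because the keys being dropped pair each timestamp with TSLA only.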