Home > OS >  How to drop multiple rows with certain 1st level and 2nd level index?
How to drop multiple rows with certain 1st level and 2nd level index?

Time:08-16

I have a dataframe where:

                 columnA  columnB
name timestamp      x        x

To drop one row in a multiindex dataframe, I have this:

df.drop(my_timestamp, level=1, axis=0, inplace=True)
  1. how can I drop one row with a certain 'name' and 'timestamp' index?

  2. how can I drop multiple rows for one name and a list of timestamps?

CodePudding user response:

While it is typically recommended that each StackOverflow question should be limited to an single issue, these are close enough to being the same, that I will provide my solution for doing what you are looking for:

Given a df like:

                A   B
Name    Date        
AA  2018-01-31  -1  52
BB  2018-02-28  0   94
CC  2018-03-31  6   86
DD  2018-04-30  3   50
EE  2018-05-31  11  60
FF  2018-06-30  9   117
GG  2018-07-31  0   45
HH  2018-08-31  -3  62  

# Drop a single row
df.drop('AA', level=0, axis=0, inplace=True)  

Which removes the Name 'AA' from the dataframe and will in fact remove all 'AA' indexed items

To remove multiple rows you can use:

# Drop several timestamps

    df.drop([pd.to_datetime('2018 03 31').date(), pd.to_datetime('2018 07 31').date()], level=1, axis=0, inplace=True)  

In the case where you have multiple items indexed at level 0 but you want to remove one or more items from level 2 index you can use the following:

df.drop(('CC', pd.to_datetime('2018 03 31').date()), axis=0, inplace=True)

CodePudding user response:

I m going to provide an answer based on the following dataframe example (you had to provide one actually):

                          columnA columnB
NameA 2016-01-01 12:00:00       p       a
      2017-01-01 12:00:00       q       b
NameB 2018-01-01 12:00:00       r       c
NameC 2019-01-01 12:00:00       s       d

how can I drop one row with a certain 'name' and 'timestamp' index?

Lets say you want to drop name with 'NameA' and timestamp with '2017-01-01 12:00:00' then you could use:

df.drop(('NameA', pd.Timestamp(2017, 1, 1, 12)), axis=0)

output:

                          columnA columnB
NameA 2016-01-01 12:00:00       p       a
NameB 2018-01-01 12:00:00       r       c
NameC 2019-01-01 12:00:00       s       d

how can I drop multiple rows for one name and a list of timestamps?

You can use pd.MultiIndex.from_product to create a multiindex that you want to drop.

Example: you want to drop the two timestamps that are for 'NameA':

df.drop(
    pd.MultiIndex.from_product([
        ['NameA'], 
        [pd.Timestamp(2016, 1, 1, 12), pd.Timestamp(2017, 1, 1, 12)]]),
    axis=0
)

output:

                          columnA columnB
NameB 2018-01-01 12:00:00       r       c
NameC 2019-01-01 12:00:00       s       d
  • Related