I have a dataframe where:
columnA columnB
name timestamp x x
To drop one row in a multiindex dataframe, I have this:
df.drop(my_timestamp, level=1, axis=0, inplace=True)
how can I drop one row with a certain 'name' and 'timestamp' index?
how can I drop multiple rows for one name and a list of timestamps?
CodePudding user response:
While it is typically recommended that each StackOverflow question should be limited to an single issue, these are close enough to being the same, that I will provide my solution for doing what you are looking for:
Given a df like:
A B
Name Date
AA 2018-01-31 -1 52
BB 2018-02-28 0 94
CC 2018-03-31 6 86
DD 2018-04-30 3 50
EE 2018-05-31 11 60
FF 2018-06-30 9 117
GG 2018-07-31 0 45
HH 2018-08-31 -3 62
# Drop a single row
df.drop('AA', level=0, axis=0, inplace=True)
Which removes the Name 'AA' from the dataframe and will in fact remove all 'AA' indexed items
To remove multiple rows you can use:
# Drop several timestamps
df.drop([pd.to_datetime('2018 03 31').date(), pd.to_datetime('2018 07 31').date()], level=1, axis=0, inplace=True)
In the case where you have multiple items indexed at level 0 but you want to remove one or more items from level 2 index you can use the following:
df.drop(('CC', pd.to_datetime('2018 03 31').date()), axis=0, inplace=True)
CodePudding user response:
I m going to provide an answer based on the following dataframe example (you had to provide one actually):
columnA columnB
NameA 2016-01-01 12:00:00 p a
2017-01-01 12:00:00 q b
NameB 2018-01-01 12:00:00 r c
NameC 2019-01-01 12:00:00 s d
how can I drop one row with a certain 'name' and 'timestamp' index?
Lets say you want to drop name with 'NameA' and timestamp with '2017-01-01 12:00:00' then you could use:
df.drop(('NameA', pd.Timestamp(2017, 1, 1, 12)), axis=0)
output:
columnA columnB
NameA 2016-01-01 12:00:00 p a
NameB 2018-01-01 12:00:00 r c
NameC 2019-01-01 12:00:00 s d
how can I drop multiple rows for one name and a list of timestamps?
You can use pd.MultiIndex.from_product
to create a multiindex that you want to drop.
Example: you want to drop the two timestamps that are for 'NameA':
df.drop(
pd.MultiIndex.from_product([
['NameA'],
[pd.Timestamp(2016, 1, 1, 12), pd.Timestamp(2017, 1, 1, 12)]]),
axis=0
)
output:
columnA columnB
NameB 2018-01-01 12:00:00 r c
NameC 2019-01-01 12:00:00 s d