I have a DataFrame with a four-level MultiIndex (time, lev, lat, and lon), like this (this is just the head; it's a huge DataFrame):
                                           O         N
time       lev          lat   lon
2021-01-01 4.055141e-10 -90.0 0.0   0.954735  0.046307
                              2.5   0.954735  0.046307
                              5.0   0.954735  0.046307
                              7.5   0.954735  0.046307
                              10.0  0.954735  0.046307
                              12.5  0.954735  0.046307
                              15.0  0.954735  0.046307
                              17.5  0.954735  0.046307
                              20.0  0.954735  0.046307
                              22.5  0.954735  0.046307
I would like to omit all data where lev < 1. If lev were a column, I could do this just by:
df = df[df['lev'] > 1]
but lev is an index level rather than a column. In theory, I could use
df.reset_index(level=['lev'])
to turn the index level into a column, but my DataFrame is too large for that and it always crashes. So how can I filter on an index level directly?
Answer:
You can use Index.get_level_values:
df = df[df.index.get_level_values('lev') > 1]
Or with query (provided there is no column with the same name):
df = df.query('lev > 1')
Example with a different condition to get a non-empty output:
df[df.index.get_level_values('lon') > 17]
output:
                                           O         N
time       lev          lat   lon
2021-01-01 4.055141e-10 -90.0 17.5  0.954735  0.046307
                              20.0  0.954735  0.046307
                              22.5  0.954735  0.046307
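If you want to try both approaches on something small first, here is a minimal sketch. The toy frame below is made up for illustration (the lev values and the O/N data are not from the question); only the index level names match. It just checks that the mask built from get_level_values and the query call select the same rows. Building the mask only materializes the one level, so it should be lighter than reset_index, though I have not measured that on a frame of your size.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: a small stand-in for the large frame in the question.
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2021-01-01"]), [0.5, 2.0, 10.0], [-90.0], [0.0, 2.5]],
    names=["time", "lev", "lat", "lon"],
)
df = pd.DataFrame(
    {"O": np.arange(6, dtype=float), "N": np.arange(6, dtype=float) * 10},
    index=index,
)

# Boolean mask built from the 'lev' index level; no reset_index needed.
mask = df.index.get_level_values("lev") > 1
by_mask = df[mask]

# query resolves 'lev' to the index level because no column has that name.
by_query = df.query("lev > 1")

print(by_mask)
print(by_mask.equals(by_query))
```

Both results keep the full MultiIndex, so later steps that rely on time, lat, or lon still work unchanged.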