Remove entries of pandas multiindex if function returns false

Time:01-24

I have a function that receives a whole entry of a MultiIndex and returns True or False for that entire entry. I feed several columns of the entry to it as key-value pairs, e.g.:

temp = cells.loc[0]
x = temp.set_index(['eta','phi'])['e'].to_dict()
filter_frame(x,20000) # drop event if this function returns false

So far I have only found examples where people want to remove single rows, but I am talking about an entire entry with several hundred subentries, since all subentries are used to produce the boolean. How can I drop the entries that don't fulfill this condition?

Edit: Data sample [image not included]

The filter_frame() function produces a single True or False for this entry 0, which contains 780 rows. The function itself works fine; I just don't know how to apply it without slow for loops. What I am looking for is something like this

cells = cells[apply the filter function somehow for all entries] 

and end up with a significantly smaller DataFrame.

Edit2: print(mask) of jezrael's solution: [image not included]

CodePudding user response:

First, call the function once per first level of the MultiIndex with GroupBy.apply to get one boolean per group. To filter the original DataFrame, remove the second level with MultiIndex.droplevel, map the per-group result onto the remaining level values with Index.map, and use the resulting mask for boolean indexing:

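def f(temp):
    # Build the {(eta, phi): e} dict for one entry and evaluate it
    x = temp.set_index(['eta','phi'])['e'].to_dict()
    return filter_frame(x,20000)

# One boolean per entry, broadcast back to every row of that entry
mask = cells.index.droplevel(1).map(cells.groupby(level=0).apply(f))

# Keep only the rows whose entry passed the filter
out = cells[mask]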
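
To try the approach above end to end, here is a minimal sketch on toy data. The body of filter_frame is a hypothetical stand-in (it passes an entry when the summed energy exceeds the threshold), since the real function is not shown in the question. The index level names 'entry' and 'subentry' and the sample values are also assumptions. The last line shows GroupBy.filter as a possible shorter alternative, which keeps only the groups for which the function returns True.

```python
import pandas as pd

def filter_frame(x, threshold):
    # Hypothetical stand-in for the asker's function:
    # pass the entry if its total energy exceeds the threshold.
    return sum(x.values()) > threshold

def f(temp):
    # Build the {(eta, phi): e} dict for one entry and evaluate it
    x = temp.set_index(['eta', 'phi'])['e'].to_dict()
    return filter_frame(x, 20000)

# Toy data: 3 entries with 2 subentries each (names and values are assumed)
index = pd.MultiIndex.from_product([[0, 1, 2], [0, 1]],
                                   names=['entry', 'subentry'])
cells = pd.DataFrame({
    'eta': [0.1, 0.2, 0.1, 0.2, 0.1, 0.2],
    'phi': [1.0, 2.0, 1.0, 2.0, 1.0, 2.0],
    'e':   [15000, 9000, 100, 200, 30000, 50],
}, index=index)

# One boolean per entry, then broadcast to every row of that entry
keep = cells.groupby(level=0).apply(f)
mask = cells.index.droplevel(1).map(keep)
out = cells[mask]

# Alternative: let GroupBy.filter drop the failing entries directly
out_filter = cells.groupby(level=0).filter(f)
```

With this toy data, entries 0 and 2 pass the stand-in condition and entry 1 is dropped, and both variants should yield the same rows.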