Home > OS >  Python Pandas Multi Index - Count the number of occurrences greater than or equal to in tail
Python Pandas Multi Index - Count the number of occurrences greater than or equal to in tail

Time:01-03

I have a multi index dataframe that looks like this:

enter image description here

For each item, I would like to count the number of times in the last two periods the qty has been greater than or equal to 18. The answer is written to a new df that has the item number and number of occurrences such as:

enter image description here

This is what I tried to do:

import pandas as pd
df = pd.DataFrame({'item':[1,1,1,2,2,3],'date':['2020-03-31','2020-06-30','2020-09-30','2021-04-15','2021-06-30','2021-06-30',],'warehouse':[10,11,12,13,14,15], 'qty':[16,17,18,19,20,21]}).set_index(['item','date'])
df

new_df = []
new_df['occurs'] = df['qty'].tail(2).ge(2).sum()
new_df

This produces an error:

TypeError: list indices must be integers or slices, not str

Can someone point me in the right direction?

CodePudding user response:

not sure why 3 gets NaN in your example, but this can be done by groupby and apply functions:

new_df = df.groupby(level=0).apply(lambda x:(x.tail(2).qty>=18).sum())
new_df
1    1
2    2
3    1
  • Related