Python DataFrame: number of consecutive preceding rows less than the current row

Time:01-19

For each row, I need the number of consecutive preceding rows whose value is less than the current row's value.

Below is a sample input and the result.

import pandas as pd
df = pd.DataFrame([10, 9, 8, 11, 10, 13], columns=['value'])


df_result = pd.DataFrame([[10, 0], [9, 0], [8, 0], [11, 3], [10, 0], [13, 5]], columns=['value', 'number of last consequence rows less than current'])

Is it possible to achieve this without a loop?

If not, a solution with a loop would be fine too.

CodePudding user response:

Assuming this input:

   value
0     10
1      9
2      8
3     11
4     10
5     13

You can combine cummax with a custom function applied over an expanding window:

df['out'] = (df['value'].cummax().expanding()
             .apply(lambda s: s.lt(df.loc[s.index[-1], 'value']).sum())
            )
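To make it clearer what the expanding call computes, here is a minimal sketch that spells out the same count with an explicit loop. The helper name count_below_running_max is hypothetical and only serves the illustration. For row i, it counts how many running maxima in rows 0..i are below the value in row i.

```python
import pandas as pd

df = pd.DataFrame([10, 9, 8, 11, 10, 13], columns=['value'])

# Running maximum up to and including each row: 10, 10, 10, 11, 11, 13
running_max = df['value'].cummax()

def count_below_running_max(values, maxima):
    """Hypothetical helper: plain-Python version of the expanding/apply call."""
    counts = []
    for i, current in enumerate(values):
        # Compare the current value with every running maximum seen so far.
        counts.append(sum(m < current for m in maxima[:i + 1]))
    return counts

explicit = count_below_running_max(df['value'].tolist(), running_max.tolist())

# The expanding version from the answer, for comparison (returns floats).
expanding = (running_max.expanding()
             .apply(lambda s: s.lt(df.loc[s.index[-1], 'value']).sum()))

print(explicit)            # [0, 0, 0, 3, 0, 5]
print(expanding.tolist())  # [0.0, 0.0, 0.0, 3.0, 0.0, 5.0]
```

The explicit loop is quadratic, like the expanding/apply call, so it is meant for understanding rather than speed.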

For the particular case of the < comparison, there is a much faster trick with NumPy. If a value is greater than all previous values, then it is greater than n values, where n is its 0-based row position:

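import numpy as np
m = df['value'].lt(df['value'].cummax())  # True where the value is below the running maximum
df['out'] = np.where(m, 0, np.arange(len(df)))  # otherwise the count is the row position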

Output (the expanding version returns floats; the NumPy version returns integers):

   value  out
0     10  0.0
1      9  0.0
2      8  0.0
3     11  3.0
4     10  0.0
5     13  5.0
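Both snippets above compare against the running maximum, so they reproduce the sample output but can differ from a literal reading of the question when a row is not a new maximum. For example, in [5, 1, 2, 3] the last row follows two smaller rows, yet both snippets return 0 for it because 3 is below the running maximum 5. The following is only a sketch with a loop, assuming the question means "how many rows immediately before the current one, counted backwards without a gap, are strictly smaller". It uses a stack of row positions so that each row is pushed and popped at most once. The function name consecutive_smaller_before is hypothetical.

```python
import pandas as pd

def consecutive_smaller_before(values):
    """Hypothetical helper: for each position, count the unbroken run of
    immediately preceding values that are strictly smaller."""
    counts = []
    stack = []  # positions whose values are non-increasing from bottom to top
    for i, current in enumerate(values):
        # Discard earlier positions holding smaller values.
        while stack and values[stack[-1]] < current:
            stack.pop()
        # The top of the stack is now the nearest earlier value >= current.
        previous_not_smaller = stack[-1] if stack else -1
        counts.append(i - previous_not_smaller - 1)
        stack.append(i)
    return counts

df = pd.DataFrame([10, 9, 8, 11, 10, 13], columns=['value'])
df['out'] = consecutive_smaller_before(df['value'].tolist())

print(df['out'].tolist())                        # [0, 0, 0, 3, 0, 5]
print(consecutive_smaller_before([5, 1, 2, 3]))  # [0, 0, 1, 2]
```

Equal values end the run in this sketch, since the comparison is strict. If the running-maximum behaviour of the answer is what you actually want, the NumPy version remains the faster choice.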