Pandas: count how many rows between two values in a column-CodePudding

Let say I have the following dataframe

import pandas as pd
df = pd.DataFrame({
'Est': [1.18,1.83,2.08,2.30,2.45,3.21,3.26,3.54,3.87,4.58,4.59,4.98],
'Buy': [0,1,1,1,0,1,1,0,1,0,0,1]
})


    Est     Buy
0   1.18    0
1   1.83    1
2   2.08    1
3   2.30    1
4   2.45    0
5   3.21    1
6   3.26    1
7   3.54    0
8   3.87    1
9   4.58    0
10  4.59    0
11  4.98    1

I will like to create a new dataframe with two columns and 4 rows with the following format: the first row contains how many 'Est' values are between 1 and 2, and how many 1's in the column 'Buy'; the second row the same for the 'Est' values between 2 and 3; third row between 3 and 4, and so on. So my output should be

I tried to use the where clause in pandas (or np.where) to create new columns with restrictions like df['Est'] >= 1 & df['Est'] <= 2 and then count. But, is there an easier and cleaner way to do this? Thanks

CodePudding user response：

Sounds like you want to group by the floor of the first column:

g = df.groupby(df['Est'] // 1)

You count the Est column:

count = g['Est'].count()

And sum the Buy column:

buys = g['Buy'].sum()