Home > Back-end >  Iterate through column in a dataframe and count the number of any given result between two values
Iterate through column in a dataframe and count the number of any given result between two values

Time:08-29

I'm looking to iterate through a dataframe column and count the number of results between any two values. For example - if I want to count the results between 1-5 for each result in column A, I may start with the dataframe:

Column A Column B
A 2
A 3
B 2
B 16
A 4
B 22

And I'm looking for an output that counts the number of results between two amounts - for instance:

Column A Results Between 1-5
A 3
B 1

I've tried a number of different variations of both trying to break it down into individual steps and using a for or a for x, result in Column A variations but I just can't get it to work.

Any guidance or thoughts would be much appreciated!

CodePudding user response:

You could achieve this as follows. Select rows with values between 1 and 5 by combining df.lt and df.gt. Next, perform a df.groupby and benefit from pd.NamedAgg to have control over the output column name (i.e. "Results Between 1-5") as we apply count.

result = df.loc[df['Column B'].gt(1) & df['Column B'].lt(5)]\
    .groupby('Column A', as_index=False).agg(
        **{"Results Between 1-5": pd.NamedAgg(column='Column B', 
                                              aggfunc='count')
           })

print(result)

  Column A  Results Between 1-5
0        A                    3
1        B                    1

N.B. This one liner achieves the same as:

result = df.loc[df['Column B'].gt(1) & df['Column B'].lt(5)].groupby('Column A', as_index=False)['Column B'].count()
result.rename(columns={'Column B':'Results Between 1-5'}, inplace=True)

CodePudding user response:

You can first filter the dataframe and apply count:

filtered_df = df[(df['Column B'] < 5) & (df['Column B'] > 1)]
filtered_df.groupby('Column A')['Column B'].count().reset_index().rename(columns={'Column B': 'Results Between 1-5'})
  • Related