I am trying to count how many rows meet a given condition, but I would like this condition to change after each recalculation. I have sample data:
data={'ID':['1','2','3','4'],'X':[23,22,21,24],'Y':[44,45,41,46],'X_MIN':[22,21,20,23],'Y_MIN':[43,44,40,45]}
ID | X | Y | X_MIN | Y_MIN |
---|---|---|---|---|
1 | 23 | 44 | 22 | 43 |
2 | 22 | 45 | 21 | 44 |
3 | 21 | 41 | 20 | 40 |
4 | 24 | 46 | 23 | 45 |
This is the id of the store and its coordinates. I would like to check how many stores are below a single store. So I have to assign the minimum X and Y of each store to the entire database and check where X>=X_MIN and Y>=Y_MIN. I've tried to make a loop which would create a new data frame and replace all X_MIN and Y_MIN with constant values from the first line of my data. Then I wanted to filter how many cases meet the condition, count them and add them to the list. And in this way repeat the action by iterating over the rows. What I've done is below:
test=df
calc=[]
for i, row in df.iterrows():
test['X_MIN']=row['X_MIN']
test['Y_MIN']=row['Y_MIN']
count=test[['X','Y','X_MIN','Y_MIN']].query('X>=X_MIN and Y>=Y_MIN')
a=count['X'].count()
calc.append(a)
I would like to come to an example table:
final={'ID':['1','2','3','4'],'count':[3,3,4,1]}
ID | count |
---|---|
1 | 3 |
2 | 3 |
3 | 4 |
4 | 1 |
However, the command counts endlessly. I am asking for help in solving this problem.
CodePudding user response:
Maybe this code is what you're looking for. It generate a column count
in which you have, for each ID
, the number of other IDs that have both X
and Y
larger or equal than itself
import pandas as pd
data= pd.DataFrame({'ID':['1','2','3','4'],'X':[23,22,21,24],'Y':[44,45,41,46]})
data['count'] = data.apply(lambda x: len(data[(data['X'] >= x['X'])&(data['Y'] >= x['Y'])]),axis=1)