Small sample of data:
df = pd.DataFrame({'Customer_ID': [1,1,1,1, 2,2,2,2], 'Customer Age' : [15,15,15,15, 50,50,50,50], 'Similar Customer ID': [5, 8, 11, 15, 22, 28, 31, 11], 'Age': [40, 43, 37, 11, 29, 42, 14, 33],
})
Question
How do I add a column that rank the customer with related customers, i.e., Customer 1 is 15 years old, I want the desired output to rank that customer as second youngest compared to 'Similar Customer':
df = pd.DataFrame({'Customer_ID': [1,1,1,1, 2,2,2,2], 'Customer Age' : [15,15,15,15, 50,50,50,50], 'Similar Customer ID': [5, 8, 11, 15, 22, 28, 31, 11], 'Age': [40, 43, 37, 11, 29, 42, 14, 33], 'Age Rank': [2, 2, 2, 2, 5,5,5,5]
})
CodePudding user response:
You can just compare the Customer Age
to Age
then take the sum (over each customer):
df['Age Rank'] = (df['Customer Age'].ge(df['Age'])
.groupby(df['Customer_ID'])
.transform('sum') 1
)
print(df)
Output:
Customer_ID Customer Age Similar Customer ID Age Age Rank
0 1 15 5 40 2
1 1 15 8 43 2
2 1 15 11 37 2
3 1 15 15 11 2
4 2 50 22 29 5
5 2 50 28 42 5
6 2 50 31 14 5
7 2 50 11 33 5