I would like to add another column to a dataframe in which there would be order (lowest to highest) of prices of different fruits per garden category. In this case, it would be:
[2,1,3, 3,2,1, 1,3,2]
Alternatively, I'd want to show e.g. the best 2 fruits (lowest price) per garden category, eg:
garden fruit price
0 A peach 4.359949
1 B peach 0.259262
4 B apple 4.203678
5 C apple 3.303348
6 A plum 2.046486
8 C plum 2.996547
Example of such a dataframe:
import pandas as pd
import numpy as np
np.random.seed(2)
df_fruits = pd.DataFrame({
"fruit": np.repeat(np.asarray(["peach", "apple", "plum"]),3),
"garden": 3*["A", "B", "C"],
"price": 10*np.random.rand(9),
})
df_fruits
fruit garden price
0 peach A 4.359949
1 peach B 0.259262
2 peach C 5.496625
3 apple A 4.353224
4 apple B 4.203678
5 apple C 3.303348
6 plum A 2.046486
7 plum B 6.192710
8 plum C 2.996547
What I tried is for the alternatives to do:
df_fruits.sort_values(by=["fruit", "price"]).groupby('fruit').head(2)
which is fine and what I want to see but I don't know how to go from this to the extra column I want to add with all the orders. The only way how I figured out how to do that is by saving the index, sorting, adding manually column of order and then sorting back. That is fine if the number of items is the same but fails if in some category some fruit was missing. How do I do it to account for that?
The solution is:
tmp = df_fruits.sort_values(by=["fruit", "price"]).reset_index()
tmp["order"] = 3*[1,2,3]
tmp.sort_values(by='index').drop(columns=['index'])
which yields the desired
fruit garden price order
4 peach A 4.359949 2
3 peach B 0.259262 1
5 peach C 5.496625 3
2 apple A 4.353224 3
1 apple B 4.203678 2
0 apple C 3.303348 1
6 plum A 2.046486 1
8 plum B 6.192710 3
7 plum C 2.996547 2
CodePudding user response:
IIUC, you can use groupby.rank
.
df_fruits['order'] = df_fruits.groupby('fruit')['price'].rank()
print(df_fruits)
# fruit garden price order
# 0 peach A 4.359949 2.0
# 1 peach B 0.259262 1.0
# 2 peach C 5.496625 3.0
# 3 apple A 4.353224 3.0
# 4 apple B 4.203678 2.0
# 5 apple C 3.303348 1.0
# 6 plum A 2.046486 1.0
# 7 plum B 6.192710 3.0
# 8 plum C 2.996547 2.0