Home > Mobile >  Quartiles calculations and classifications filtering by product
Quartiles calculations and classifications filtering by product

Time:09-30

I am having a hard time to get this done:

What I have: pandas dataframe:

product   seller   price
   A        Yo      10
   A        Ka      5
   A        Poy     7.5
   A        Nyu     2.5
   A        Poh     1.25
   B        Poh     11.25

What I want:

given a df like the one above product, seller, price I wan to calculate 4 quartiles based on price's column for that particulary product and classify the price of each seller of that product into these quartiles.

When all prices are the same, the 4 quartiles has the same value and the price is classified as 1st quartile

Expected Outuput:

product   seller   price   Quartile   1Q     2Q      3Q      4Q
   A        Yo      10        4       2.5    5       7.5     10
   A        Ka      5         2       2.5    5       7.5     10
   A        Poy     7.5       3       2.5    5       7.5     10
   A        Nyu     2.5       1       2.5    5       7.5     10
   A        Poh     1.25      1       2.5    5       7.5     10
   B        Poh     11.25     1       11.25  11.25  11.25    11.25

What I did so far:

if I use: df['Price'].quantile([0.25,0.5,0.75,1]) it will claculate 4 quartiles of all prices without filter by product, so its wrong.

I am lost because I dont know how to do this in python.
Can anyone give me some light here? Thanks

CodePudding user response:

Try:

dfQuantile = df.groupby("product")['Price'].quantile([0.25,0.5,0.75,1]).unstack().reset_index().rename(columns={0.25:"1Q",0.5:"2Q",0.75:"3Q",1:"4Q"})
out = pd.merge(df,dfQuantile,on="product",how="left")
out["Quantile"] = df.groupby(['product'])['Price'].transform(
                     lambda x: pd.qcut(x, 4, labels=False, duplicates="drop")).fillna(0).add(1)

print(out)


  product   seller  Price   Quantile    1Q       2Q      3Q      4Q
0   A        Yo     10.00   4           2.50    5.00    7.50    10.00
1   A        Ka     5.00    2           2.50    5.00    7.50    10.00
2   A        Poy    7.50    3           2.50    5.00    7.50    10.00
3   A        Nyu    2.50    1           2.50    5.00    7.50    10.00
4   A        Poh    1.25    1           2.50    5.00    7.50    10.00
5   B        Poh    11.25   1           11.25   11.25   11.25   11.25
  • Related