Calculate first decile in pandas dataframe

Time:08-03

I am new to Python and I have the following Pandas dataframe with a column for stock name and another column for stock returns, such as:

import pandas as pd

data = {'name': ["a","b","c","d","e","f","g","h"], 'value': [1,2,3,4,5,6,7,8]}
df = pd.DataFrame.from_dict(data)

What I want to do is add a column that has the value 1 if the stock's value falls in the first decile of the whole sample, and 0 otherwise.

I am aware there is the function qcut, but I am not sure how to use it correctly. I tried the following, but I did not get what I was after:

q = data.apply(lambda y: pd.qcut(y, 10,labels=False, duplicates="drop"), axis=0) 

Many thanks in advance.

CodePudding user response:

You can use quantile to compute the decile threshold and compare each value against it:

df['≥90%'] = df['value'].ge(df['value'].quantile(0.9)).astype(int)

NB. If you want the lowest decile instead, use: df['value'].le(df['value'].quantile(0.1)).astype(int)

output:

  name  value  ≥90%
0    a      1     0
1    b      2     0
2    c      3     0
3    d      4     0
4    e      5     0
5    f      6     0
6    g      7     0
7    h      8     1
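
Since the question mentions qcut, here is a minimal sketch of the same flag built with pd.qcut, assuming the df from the question. The column names bottom_decile and top_decile are made up for illustration. With labels=False, qcut returns the bin number of each row (0 for the lowest decile), so the flag is a comparison against that number:

```python
import pandas as pd

df = pd.DataFrame({"name": list("abcdefgh"),
                   "value": [1, 2, 3, 4, 5, 6, 7, 8]})

# Bin number per row: 0 is the lowest decile, higher numbers are higher deciles.
# Assumption: the values are distinct enough that all bin edges are unique;
# with many ties you may need duplicates="drop", which can yield fewer bins.
decile = pd.qcut(df["value"], 10, labels=False)

# Hypothetical column names, chosen only for this example.
df["bottom_decile"] = (decile == 0).astype(int)
df["top_decile"] = (decile == decile.max()).astype(int)

print(df)
```

On this sample only a is flagged in bottom_decile and only h in top_decile, which matches the quantile approach above. Comparing against decile.max() rather than a hard-coded 9 keeps the top flag working if some bins get dropped.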