Define ranges in a pandas column based on input from a list

Time:11-11

I have a data frame in which I need to assign a range to each value in one column, based on a provided list. I can get results using fixed values, but the input values will be dynamic, supplied as a list, and the ranges must be derived from that input.

My data frame looks like this:

import pandas as pd
rangelist=[90,70,50]
data = {'Result': [75,85,95,45,76,8,10,44,22,65,35,67]}  
sampledf=pd.DataFrame(data)

rangelist is my list; from it I need to create ranges such as 100-90, 90-70 and 70-50. These ranges may change from time to time. So far I have been getting results with the function below.

def cat(value):
    cat=''
    if (value>90):
        cat='90-100'
    if (value<90 and value>70 ):
        cat='90-70'

    else: 
        cat='< 50'
    return cat

sampledf['category']=sampledf['Result'].apply(cat)

How can I pass dynamic values to the function "cat" based on rangelist? I would be grateful if someone could help me achieve the result below.

Result  category
0   75  90-70
1   85  90-70
2   95  < 50
3   45  < 50
4   76  90-70
5   8   < 50
6   10  < 50
7   44  < 50
8   22  < 50
9   65  < 50
10  35  < 50
11  67  < 50

CodePudding user response:

I would recommend pd.cut for this:

import numpy as np

sampledf['Category'] = pd.cut(sampledf['Result'],
                              [-np.inf] + sorted(rangelist) + [np.inf])

Output:

    Result      Category
0       75  (70.0, 90.0]
1       85  (70.0, 90.0]
2       95   (90.0, inf]
3       45  (-inf, 50.0]
4       76  (70.0, 90.0]
5        8  (-inf, 50.0]
6       10  (-inf, 50.0]
7       44  (-inf, 50.0]
8       22  (-inf, 50.0]
9       65  (50.0, 70.0]
10      35  (-inf, 50.0]
11      67  (50.0, 70.0]
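
If string labels in the style of the question are preferred over interval objects, pd.cut also accepts a labels argument. The following is only a sketch: the upper bound of 100 (the name upper is made up here) and the label wording are assumptions, not something the question specifies. Because pd.cut closes bins on the right by default, the lowest label is written as "<= 50" rather than "< 50".

```python
import numpy as np
import pandas as pd

rangelist = [90, 70, 50]
sampledf = pd.DataFrame({'Result': [75, 85, 95, 45, 76, 8, 10, 44, 22, 65, 35, 67]})

upper = 100  # assumed maximum score; values above it would become NaN

# Bin edges built from the dynamic list: [-inf, 50, 70, 90, 100]
inner = sorted(rangelist)
edges = [-np.inf] + inner + [upper]

# One label per bin: '<= 50', '50-70', '70-90', '90-100'
labels = [f"<= {inner[0]}"] + [
    f"{lo}-{hi}" for lo, hi in zip(inner, inner[1:] + [upper])
]

# Default right=True means each bin is (lo, hi], e.g. 70 falls in '50-70'
sampledf['category'] = pd.cut(sampledf['Result'], bins=edges, labels=labels)
print(sampledf)
```

Changing rangelist changes both the edges and the labels, so nothing else needs to be edited when the input list changes.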

CodePudding user response:

import numpy as np

breaks = pd.Series([100, 90, 75, 50, 45, 20, 0])
sampledf["ind"] = sampledf.Result.apply(lambda x: np.where(x >= breaks)[0][0])
sampledf["category"] = sampledf.ind.apply(lambda i: (breaks[i], breaks[i-1]))
sampledf
#     Result  ind   category
# 0       75    2   (75, 90)
# 1       85    2   (75, 90)
# 2       95    1  (90, 100)
# 3       45    4   (45, 50)
# 4       76    2   (75, 90)
# 5        8    6    (0, 20)
# 6       10    6    (0, 20)
# 7       44    5   (20, 45)
# 8       22    5   (20, 45)
# 9       65    3   (50, 75)
# 10      35    5   (20, 45)
# 11      67    3   (50, 75)
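
The same lookup can also be done without a per-row apply by using np.searchsorted on ascending breaks. This is a sketch of an alternative, not the method from the answer above; it assumes all values lie between the lowest and highest break, and it clamps anything outside that span into the nearest bin.

```python
import numpy as np
import pandas as pd

sampledf = pd.DataFrame({'Result': [75, 85, 95, 45, 76, 8, 10, 44, 22, 65, 35, 67]})

# Same break points as above, sorted ascending for searchsorted
breaks = np.array(sorted([100, 90, 75, 50, 45, 20, 0]))

# side='right' returns, for each value, the number of breaks <= value,
# which is the index of the upper edge of the bin [lower, upper)
idx = np.searchsorted(breaks, sampledf['Result'].to_numpy(), side='right')

# Assumption: out-of-range values are clamped into the first or last bin,
# so a value equal to the top break lands in the last bin
idx = np.clip(idx, 1, len(breaks) - 1)

lower = breaks[idx - 1]
upper = breaks[idx]
sampledf['category'] = list(zip(lower.tolist(), upper.tolist()))
print(sampledf)
```

For the sample data this gives the same (lower, upper) pairs as the table above.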