Python: scaling and scoring a DataFrame column with upper and lower limits

Time:09-16

I have a df['values'] column that I would like to score between 0 and 1. It should be scored twice, using two separate pairs of lower and upper limits:

  • lower/upper limit of 20/0
  • lower/upper limit of 0/30

Is there a function in Python for this operation? MinMaxScaler does not allow me to set the upper and lower bounds.

Input is df['values']
Desired output is df['score']

values   score(20/0)  score(0/30)
-5.1     1.00         0.00
3.6      0.82         0.12
6.6      0.67         0.22
9.0      0.55         0.30
21.0     0.00         0.70

CodePudding user response:

You can rescale linearly between the lower and upper limits first, then clip the result to the range 0 to 1:

import pandas as pd

df = pd.DataFrame({'values': [-5.1, 3.6, 6.6, 9, 21]})
MIN = 20  # value that maps to a score of 0
MAX = 0   # value that maps to a score of 1
df['values'].sub(MIN).div(MAX-MIN).clip(0, 1)

Output:

0    1.00
1    0.82
2    0.67
3    0.55
4    0.00
Name: values, dtype: float64
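
To show why the same expression handles both a descending (20/0) and an ascending (0/30) pair of limits, here is a minimal NumPy sketch of the formula (x - MIN) / (MAX - MIN) followed by clipping. The function name linear_score and the parameter names zero_at and one_at are hypothetical, chosen only to make explicit which limit maps to which score.

```python
import numpy as np

def linear_score(x, zero_at, one_at):
    """Map zero_at -> 0 and one_at -> 1 linearly, then clip to [0, 1].

    zero_at and one_at are illustrative names for the answer's MIN and MAX.
    If zero_at > one_at, the score decreases as x increases.
    """
    x = np.asarray(x, dtype=float)
    return np.clip((x - zero_at) / (one_at - zero_at), 0.0, 1.0)

values = [-5.1, 3.6, 6.6, 9.0, 21.0]
descending = linear_score(values, 20, 0)   # 20 scores 0, 0 scores 1
ascending = linear_score(values, 0, 30)    # 0 scores 0, 30 scores 1
print(descending.round(2))
print(ascending.round(2))
```

When the first limit is larger than the second, the denominator is negative, which reverses the direction of the score without any special handling.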

As a function:

def score(df, MIN, MAX):
    return (df['values']
              .sub(MIN)
              .div(MAX-MIN)
              .clip(0, 1)
              .rename(f'score({MIN},{MAX})')
            )

pd.concat([df,
           score(df, 20, 0),
           score(df, 0, 30)],
          axis=1)

Output:

   values  score(20,0)  score(0,30)
0    -5.1         1.00         0.00
1     3.6         0.82         0.12
2     6.6         0.67         0.22
3     9.0         0.55         0.30
4    21.0         0.00         0.70
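
One possible variation, sketched under the assumption that you want to reuse this on other columns: take the column name as an argument and reject equal limits, which would otherwise cause a division by zero. The name score_column and its parameters are hypothetical and not part of pandas.

```python
import pandas as pd

def score_column(df, column, zero_at, one_at):
    """Score df[column] linearly so zero_at -> 0 and one_at -> 1, clipped to [0, 1].

    Hypothetical helper: zero_at and one_at play the role of MIN and MAX above.
    """
    if zero_at == one_at:
        raise ValueError("the two limits must differ")
    return (df[column]
              .sub(zero_at)
              .div(one_at - zero_at)
              .clip(0, 1)
              .rename(f"score({zero_at},{one_at})"))

df = pd.DataFrame({"values": [-5.1, 3.6, 6.6, 9, 21]})
scored = pd.concat([df,
                    score_column(df, "values", 20, 0),
                    score_column(df, "values", 0, 30)],
                   axis=1)
print(scored.round(2))
```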