Normalize each row of a dataframe based on the min and max of that row

Time:02-10

I have a data frame and I want to normalize each number using the minimum and the maximum of its row, according to this formula:

  x_normalized = (x_unnormalized-x_min)/(x_max-x_min). 

I've checked the scikit-learn package and could not find a function for that. Could you help me with this? Below is a sample data frame and the output I want.

import pandas as pd
import numpy as np


df = pd.DataFrame()
df['id'] = ['a', 'b', 'c']
df['c1'] = [2, 5, 3]
df['c2'] = [0, 5, 6]
df['c3'] = [8, 7, 9]

print(df)
# here is the dataframe which I want
expected = pd.DataFrame()
expected['id'] = ['a', 'b', 'c']
expected['c1'] = [1/4, 0, 0]
expected['c2'] = [0, 0, 0.5]
expected['c3'] = [1, 1, 1]
expected

CodePudding user response:

It looks like there is a typo in your output.

You can use simple vectorized operations:

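def norm(df):
    # row-wise min and max (axis=1 reduces across the columns)
    MIN = df.min(axis=1)
    MAX = df.max(axis=1)
    # axis=0 aligns the per-row Series with the DataFrame's index
    return df.sub(MIN, axis=0).div(MAX-MIN, axis=0)

# pass only the numeric columns; 'id' holds labels
df2 = norm(df.drop(columns='id'))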

output:

     c1   c2   c3
0  0.25  0.0  1.0
1  0.00  0.0  1.0
2  0.00  0.5  1.0
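
One case the function above does not cover: if every value in a row is the same, MAX-MIN is 0 and the division produces NaN for that row. Below is a minimal sketch of one way to guard against that, assuming such rows should simply become 0. The name `norm_safe`, the `fill` parameter, and the `demo` frame are hypothetical and not part of the answer above.

```python
import pandas as pd

def norm_safe(df, fill=0.0):
    """Row-wise min-max scaling; constant rows get `fill` instead of NaN."""
    row_min = df.min(axis=1)
    row_range = df.max(axis=1) - row_min
    # Mask zero ranges as NaN so the division yields NaN for constant rows
    safe_range = row_range.where(row_range != 0)
    scaled = df.sub(row_min, axis=0).div(safe_range, axis=0)
    # Note: this also fills NaN that came from missing values in the input
    return scaled.fillna(fill)

# Third row is constant (4, 4, 4), so its range is 0
demo = pd.DataFrame({'c1': [2, 5, 4], 'c2': [0, 5, 4], 'c3': [8, 7, 4]})
result = norm_safe(demo)
print(result)
```

Filling with 0 is only one possible choice; depending on the use case, leaving NaN in place or using 0.5 may be more appropriate.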

axis-aware version:

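def norm(df, axis=1):
    MIN = df.min(axis=axis)
    MAX = df.max(axis=axis)
    # a reduction along `axis` has to be aligned on the other axis
    return df.sub(MIN, axis=1-axis).div(MAX-MIN, axis=1-axis)

# column-wise normalization of the numeric columns
norm(df.drop(columns='id'), axis=0)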

output:

         c1        c2   c3
0  0.000000  0.000000  0.5
1  1.000000  0.833333  0.0
2  0.333333  1.000000  1.0
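
The same axis-aware idea can also be written with NumPy broadcasting, which avoids the `1-axis` bookkeeping. This is only a sketch under the assumption that all columns passed in are numeric; `norm_np` and `demo` are hypothetical names, not from the answer above.

```python
import numpy as np
import pandas as pd

def norm_np(df, axis=1):
    """Min-max scale along `axis` using NumPy broadcasting."""
    values = df.to_numpy(dtype=float)
    # keepdims=True keeps lo/hi two-dimensional, so they broadcast
    # against `values` correctly for either axis
    lo = values.min(axis=axis, keepdims=True)
    hi = values.max(axis=axis, keepdims=True)
    return pd.DataFrame((values - lo) / (hi - lo),
                        index=df.index, columns=df.columns)

demo = pd.DataFrame({'c1': [2, 5, 3], 'c2': [0, 5, 6], 'c3': [8, 7, 9]})
by_row = norm_np(demo, axis=1)  # each row scaled by its own min and max
by_col = norm_np(demo, axis=0)  # each column scaled by its own min and max
print(by_row)
print(by_col)
```

With `keepdims=True` the min and max arrays have shape (n, 1) for `axis=1` and (1, m) for `axis=0`, so the subtraction and division line up without any explicit axis argument.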