standardise data using list comprehension

Time:12-21

I'm trying to standardise the observations in a group of columns in my dataframe without using any built-in functions.

I have a list of the columns I want to standardise held in an object called continuous and I'm trying to use list comprehension to apply the standardisation.

I'm having trouble coming up with an approach that allows me to iterate over the rows in my dataframe.

What I've got so far:

continuous = [1, 2, 3, 5, 6, 7, 8, 9, 10]

data_z = [(data[col][i] for i in data.index)-data.mean(col)/data.std(col) for col in continuous]

This is spitting out a type error - it won't let me iterate over a generator object, so I'm wondering if anyone knows the correct approach to iterate over the rows and columns I want to standardise?

Thanks in advance!

CodePudding user response:

Couldn't you just as easily do this with vectorised operations?

mean = data[continuous].mean(axis='rows')
std = data[continuous].std(axis='rows')

data_z = (data[continuous] - mean ) / std
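
If the explicit comprehension form is still wanted, the original attempt could be reworked roughly as below. The attempt fails because `(data[col][i] for i in data.index)` is a generator object, which cannot have a number subtracted from it, and because the division binds tighter than the subtraction. This is only a minimal sketch: the helper name `standardise` and the small example frame are made up for illustration, and it assumes the sample standard deviation (dividing by n - 1), which is the default for pandas' `std()`.

```python
import pandas as pd

def standardise(values):
    # Hypothetical helper: z-scores for a plain list, without pandas statistics.
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (n - 1 in the denominator), as in pandas' default.
    std = (sum((v - mean) ** 2 for v in values) / (n - 1)) ** 0.5
    return [(v - mean) / std for v in values]

# Small example frame, assumed for illustration only.
data = pd.DataFrame({1: [1.0, 2.0, 3.0], 2: [2.0, 4.0, 6.0], 3: [5.0, 5.0, 8.0]})
continuous = [1, 2]

# One list of z-scores per selected column, reassembled into a DataFrame.
data_z = pd.DataFrame(
    {col: standardise(data[col].tolist()) for col in continuous},
    index=data.index,
)
print(data_z)
```

A column whose values are all identical has a standard deviation of zero, so this sketch would raise a ZeroDivisionError for it; the vectorised version above would give NaN instead.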

CodePudding user response:

Select the columns into df1, then use DataFrame.sub and DataFrame.div to subtract the column means and divide by the column standard deviations:

import numpy as np
import pandas as pd

np.random.seed(2021)
data = pd.DataFrame(np.random.randint(10, size=(13, 13)))

continuous = [1, 2, 3, 5, 6, 7, 8, 9, 10]

df1 = data[continuous]
data_z = df1.sub(df1.mean()).div(df1.std())

print(data_z)
          1         2         3         5         6         7         8   \
0   0.361158  1.309808 -1.365801  0.092504  0.739880  0.553102  0.549700   
1  -1.083473 -1.038813  0.309238 -1.410680 -0.977698 -0.944883 -1.314501   
2  -1.083473  0.429075  0.979254  0.393140  1.083396  0.253505  0.549700   
3  -0.722315  1.016230  0.309238 -1.110043 -1.664730  1.451893 -0.693101   
4   1.805788 -0.158080  1.649269  1.295050  0.396364  0.852699 -0.071700   
5  -0.722315 -1.332391 -0.025770 -0.508770 -1.664730  1.451893 -0.071700   
6  -1.083473  0.135497  0.979254  1.295050 -0.290667 -1.244480 -1.314501   
7   1.083473 -1.332391 -0.360778  0.393140  1.083396 -0.345689  1.481801   
8   1.444630  1.309808 -1.365801 -1.410680  0.396364 -1.244480 -1.003801   
9   0.000000 -1.038813  0.979254 -0.208133  1.083396  0.553102  0.549700   
10 -0.361158 -0.745236 -1.365801  0.693777 -0.634183  0.553102 -1.003801   
11 -0.361158  0.722653 -0.695785  1.295050 -0.290667 -0.645286  0.860401   
12  0.722315  0.722653 -0.025770 -0.809406  0.739880 -1.244480  1.481801   

          9         10  
0   0.369274  0.301124  
1  -1.107823  0.301124  
2   1.477098 -1.264720  
3   0.738549 -0.090337  
4  -0.738549 -1.264720  
5  -1.107823 -0.873259  
6  -0.369274  1.475507  
7  -1.477098 -0.873259  
8  -0.738549  0.301124  
9   0.738549 -0.481798  
10  0.738549 -0.481798  
11  1.477098  1.475507  
12  0.000000  1.475507  
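
A quick way to check a result like this is to confirm that every standardised column has a mean of about 0 and a standard deviation of about 1. The sketch below rebuilds the same example data and assumes the default `ddof=1` of `DataFrame.std` for both the standardisation and the check.

```python
import numpy as np
import pandas as pd

# Same example data as in the answer above.
np.random.seed(2021)
data = pd.DataFrame(np.random.randint(10, size=(13, 13)))
continuous = [1, 2, 3, 5, 6, 7, 8, 9, 10]

df1 = data[continuous]
data_z = df1.sub(df1.mean()).div(df1.std())

# Each column should now have mean ~0 and sample standard deviation ~1.
means_ok = np.allclose(data_z.mean(), 0)
stds_ok = np.allclose(data_z.std(), 1)
print(means_ok, stds_ok)
```

The comparison uses `np.allclose` rather than `==` because floating-point rounding leaves the means very slightly off zero.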