divide columns vertically by max value in a dataframe

Time:12-12

I have the following table:

id  summary  summary_len  apple  book  computer
1   ....     210          2      1     0
2   ...      120          3      0     1
3   ...      50           2      2     1

summary is a free-text description and summary_len is its length. The remaining columns (apple, book, computer) are keywords, and their values are the number of occurrences of each keyword in the description.

I need to normalize this table: find the max value PER COLUMN (vertically) and divide every value in that column by it. The output should look like this (written as fractions such as 2/3 only to emphasize the max value per column):

id  summary  summary_len  apple  book  computer
1   ....     210          2/3    1/2   0/1
2   ...      120          3/3    0/2   1/1
3   ...      50           2/3    2/2   1/1

My problem is that I don't need the max of every column, only of the keyword columns whose occurrences I am counting. I stored the keywords in a list and got the max value per column:

max_per_col = df_freq[keywords].max()
max_per_col

This is how it looks with the original data (image not preserved).

Could you help me apply it "back" to the original dataframe and divide each column by its max value?

CodePudding user response:

You can divide only the selected columns by their maximum values:

keywords = ['apple','book','computer']

df_freq[keywords] /= df_freq[keywords].max()

# equivalent to:
# df_freq[keywords] = df_freq[keywords] / df_freq[keywords].max()
print(df_freq)
   id  summary_len     apple  book  computer
0   1          210  0.666667   0.5       0.0
1   2          120  1.000000   0.0       1.0
2   3           50  0.666667   1.0       1.0
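
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The sample values are rebuilt from the table in the question (the summary text is left out), and the name `normalized` is only an illustrative choice, not something from the original post. It uses `DataFrame.div` with an explicit axis, which should give the same result as the `/=` form above while leaving `df_freq` untouched:

```python
import pandas as pd

# Sample data rebuilt from the question's table (summary text omitted).
df_freq = pd.DataFrame({
    "id": [1, 2, 3],
    "summary_len": [210, 120, 50],
    "apple": [2, 3, 2],
    "book": [1, 0, 2],
    "computer": [0, 1, 1],
})
keywords = ["apple", "book", "computer"]

# .max() returns a Series indexed by column name, so dividing by it
# aligns each maximum with the column of the same name.
max_per_col = df_freq[keywords].max()

# Work on a copy (hypothetical name) so the original counts are kept.
normalized = df_freq.copy()
normalized[keywords] = df_freq[keywords].div(max_per_col, axis="columns")

print(normalized)
```

One thing to watch for: if a keyword never occurs in any description, its column maximum is 0 and the division produces NaN for that column, so you may want to handle that case separately.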