Replace the values of a big data frame with other values

Time:04-12

I have a very large data frame and I want to replace its values with one of three numbers: 0, 1, or 2. If value < 0.7, the new value is 0; if 0.7 < value < 0.9, it is 1; otherwise it is 2. I used an if loop for this, but it takes a long time to run. Is there a faster way to do it? Thanks.

import pandas as pd
df = pd.DataFrame()
df['id'] = [1, 2, 3]
df['a'] = [0, 0, 0]
df['b'] = [0.12, 0.12, 0.12]
df['c'] = [0.25, 0.25, 0.25]
df['d'] = [0.375, 0.375, 0.375]
df['e'] = [0.5,0.5,0.5 ]
df['f'] = [0.625, 0.625, 0.625]
df['g'] = [0.75,0.75,0.75]
df['x'] = [0.87, 0.87, 0.87]
df['x0'] = [0.98,0.98, 0.98]
df['a1'] = [0, 0, 0]
df['b1'] = [0.12, 0.12, 0.12]
df['c1'] = [0.25, 0.25, 0.25]
df['d2'] = [0.375, 0.375, 0.375]
df['e2'] = [0.5,0.5,0.5 ]
df['f2'] = [0.625, 0.625, 0.625]
df['g2'] = [0.75,0.75,0.75]
df['x2'] = [0.87, 0.87, 0.87]
df['x00'] = [0.98, 0.98, 0.98]

Here is my desired output (originally posted as an image, which is not reproduced here).

CodePudding user response:

You could stack it, cut it to get the codes, then unstack it:

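# Pass labels by keyword: the third positional argument of pd.cut is `right`, not `labels`.
out = (pd.cut(df.set_index('id').stack(),
              bins=[0, 0.7, 0.9, float('inf')], labels=[0, 1, 2],
              include_lowest=True).cat.codes
       .unstack().reset_index())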

Output:

   id  a  b  c  d  e  f  g  x  x0  a1  b1  c1  d2  e2  f2  g2  x2  x00
0   1  0  0  0  0  0  0  1  1   2   0   0   0   0   0   0   1   1    2
1   2  0  0  0  0  0  0  1  1   2   0   0   0   0   0   0   1   1    2
2   3  0  0  0  0  0  0  1  1   2   0   0   0   0   0   0   1   1    2
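
If the stack/unstack round trip turns out to be costly on a very wide frame, one alternative is to bin the underlying NumPy array directly with `np.digitize`. This is only a sketch, not part of the answer above: the small frame and its values are made up for illustration, and it assumes every column except `id` is numeric. Note that the boundary handling differs from the `pd.cut` call: with the default `right=False`, a value of exactly 0.7 maps to 1 and exactly 0.9 maps to 2, whereas `pd.cut` as used above puts them in the lower bin. Passing `right=True` to `np.digitize` should match the `pd.cut` boundaries instead.

```python
import numpy as np
import pandas as pd

# Small stand-in for the large frame in the question (illustrative values only).
df = pd.DataFrame({
    "id": [1, 2, 3],
    "a": [0.0, 0.12, 0.5],
    "g": [0.75, 0.7, 0.87],
    "x0": [0.98, 0.9, 0.95],
})

# Every column except the key column holds values to be recoded.
value_cols = df.columns.drop("id")

# np.digitize counts how many thresholds each value has reached:
#   value < 0.7        -> 0
#   0.7 <= value < 0.9 -> 1
#   value >= 0.9       -> 2
codes = np.digitize(df[value_cols].to_numpy(), [0.7, 0.9])

# Rebuild a frame with the same index and column labels, then put `id` back in front.
out = pd.DataFrame(codes, index=df.index, columns=value_cols)
out.insert(0, "id", df["id"])

print(out)
```

Whether this is actually faster than the `pd.cut` version depends on the shape and dtypes of the real data, so it is worth timing both on a sample.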