I have a very large DataFrame, and I want to replace its values with one of three numbers: 0, 1, or 2. If value < 0.7, the value becomes 0; if 0.7 < value < 0.9, it becomes 1; otherwise it becomes 2. I used an if loop for this, but it takes a long time to run. Is there a faster way to do it? Thanks.
```python
import pandas as pd
df = pd.DataFrame()
df['id'] = [1, 2, 3]
df['a'] = [0, 0, 0]
df['b'] = [0.12, 0.12, 0.12]
df['c'] = [0.25, 0.25, 0.25]
df['d'] = [0.375, 0.375, 0.375]
df['e'] = [0.5, 0.5, 0.5]
df['f'] = [0.625, 0.625, 0.625]
df['g'] = [0.75, 0.75, 0.75]
df['x'] = [0.87, 0.87, 0.87]
df['x0'] = [0.98, 0.98, 0.98]
df['a1'] = [0, 0, 0]
df['b1'] = [0.12, 0.12, 0.12]
df['c1'] = [0.25, 0.25, 0.25]
df['d2'] = [0.375, 0.375, 0.375]
df['e2'] = [0.5, 0.5, 0.5]
df['f2'] = [0.625, 0.625, 0.625]
df['g2'] = [0.75, 0.75, 0.75]
df['x2'] = [0.87, 0.87, 0.87]
df['x00'] = [0.98, 0.98, 0.98]
```
Here is my desired output:
CodePudding user response:
You could `stack` it, `cut` it to get the codes, then `unstack` it:
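```python
# Reshape to long format, bin every value in one call, then reshape back to wide.
out = (pd.cut(df.set_index('id').stack(),
              bins=[0, 0.7, 0.9, float('inf')],
              labels=[0, 1, 2],  # by keyword: the third positional argument of pd.cut is `right`
              include_lowest=True).cat.codes
         .unstack().reset_index())
```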
Output:
```
   id  a  b  c  d  e  f  g  x  x0  a1  b1  c1  d2  e2  f2  g2  x2  x00
0   1  0  0  0  0  0  0  1  1   2   0   0   0   0   0   0   1   1    2
1   2  0  0  0  0  0  0  1  1   2   0   0   0   0   0   0   1   1    2
2   3  0  0  0  0  0  0  1  1   2   0   0   0   0   0   0   1   1    2
```
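If reshaping a very large frame is itself a concern, one possible alternative is to bin the underlying NumPy array directly with `np.digitize`. This is only a sketch and not benchmarked on your data. It assumes every column except `id` is numeric, and it uses a shortened stand-in frame for illustration. With `right=True`, a value exactly equal to 0.7 lands in bin 0 and a value exactly equal to 0.9 lands in bin 1, which is the same edge handling as right-closed `pd.cut` bins.

```python
import numpy as np
import pandas as pd

# Small hypothetical frame standing in for the large one in the question.
df = pd.DataFrame({
    'id': [1, 2, 3],
    'a': [0, 0, 0],
    'e': [0.5, 0.5, 0.5],
    'g': [0.75, 0.75, 0.75],
    'x': [0.87, 0.87, 0.87],
    'x0': [0.98, 0.98, 0.98],
})

# Every column except the identifier (assumed to be numeric).
value_cols = df.columns.drop('id')

# Bin the whole 2-D array in one call:
# x <= 0.7 -> 0, 0.7 < x <= 0.9 -> 1, x > 0.9 -> 2.
codes = np.digitize(df[value_cols].to_numpy(), bins=[0.7, 0.9], right=True)

# Reattach the id column and the original column names.
out = df[['id']].join(pd.DataFrame(codes, index=df.index, columns=value_cols))
print(out)
```

If you need the strict comparisons from the question (a value of exactly 0.7 or 0.9 falling into the higher bin), passing `right=False` instead should give that behaviour.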