I'm trying to write a function X(df) that replaces the values in the FIRST column of the dataframe according to the following criteria:
- If the value is between 0 and 0.5 (0 <= value <= 0.5), replace it with the sum of the values of all columns in that row.
- If the value is between 1.0 and 2.0 (1.0 <= value <= 2.0), replace it with -99. This also applies to values produced by the first rule: if the original value is 0.1 and the sum of all columns in that row is 1.5, the value is then replaced by -99.
Original df:
     A      B
idx
0    0.4    1.0
1    0.0    0.5
2    10.0   0.0
3    1.5    -100.0
4    0.1    0.1
5    0.5    -10.0
I have this so far:
def X(df):
    for i in df.iloc[:, 0]:
        if (i >= 0) and (i <= 0.5):
            df.iloc[:,0] = df.sum(axis=1)
        elif (i>=1) and (i<=2):
            df.iloc[:,0] = int(-99)
        else:
            continue
    return df
I got:
A B
idx
0 3.4 1.0
1 1.5 0.5
2 10.0 0.0
3 -298.5 -100.0
4 0.4 0.1
5 -29.5 -10.0
I was expecting:
A B
idx
0 0.5 1.0
1 0.5 0.5
2 10.0 0.0
3 -99 -100.0
4 0.2 0.1
5 -9.5 -10.0
CodePudding user response:
Example
import pandas as pd
data = {'A': {0: 0.4, 1: 0.0, 2: 10.0, 3: 1.5, 4: 0.1, 5: 0.5},
        'B': {0: 1.0, 1: 0.5, 2: 0.0, 3: -100.0, 4: 0.1, 5: -10.0}}
df = pd.DataFrame(data)
Output (df):
A B
0 0.4 1.0
1 0.0 0.5
2 10.0 0.0
3 1.5 -100.0
4 0.1 0.1
5 0.5 -10.0
Code
Use np.select:
import numpy as np
cond1 = (df['A'] >= 0) & (df['A'] <= 0.5)
cond2 = (df['A'] >= 1) & (df['A'] <= 2)
np.select([cond1, cond2], [df.sum(axis=1), -99], df['A'])
result:
array([ 1.4, 0.5, 10. , -99. , 0.2, -9.5])
Final
Assign the result to column A (df.assign returns a new DataFrame):
df.assign(A=np.select([cond1, cond2], [df.sum(axis=1), -99], df['A']))
Output:
A B
0 1.4 1.0
1 0.5 0.5
2 10.0 0.0
3 -99.0 -100.0
4 0.2 0.1
5 -9.5 -10.0
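Note that the question also asks for a row sum that lands in [1.0, 2.0] to be replaced by -99 afterwards (row 0 here: 0.4 + 1.0 = 1.4), which a single np.select call on the original values does not do. A minimal sketch of one way to get that, assuming the two rules are simply applied one after the other. The name X comes from the question; the variable names out, first and row_sum are my own:

```python
import pandas as pd

def X(df):
    out = df.copy()                      # leave the caller's frame untouched
    first = out.iloc[:, 0]
    row_sum = out.sum(axis=1)            # sums computed from the original values
    # Rule 1: values in [0, 0.5] become the row sum (between() is inclusive)
    first = first.mask(first.between(0, 0.5), row_sum)
    # Rule 2: values in [1.0, 2.0], including fresh row sums, become -99
    first = first.mask(first.between(1.0, 2.0), -99)
    out.iloc[:, 0] = first.to_numpy()
    return out

df = pd.DataFrame({'A': [0.4, 0.0, 10.0, 1.5, 0.1, 0.5],
                   'B': [1.0, 0.5, 0.0, -100.0, 0.1, -10.0]})
result = X(df)
print(result)
```

With the sample data, column A comes out as -99, 0.5, 10, -99, 0.2, -9.5, so row 0 differs from the np.select result above.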
CodePudding user response:
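def X(df):
    # iterrows() yields copies of the rows, so write back through df.iloc
    for pos, (idx, row) in enumerate(df.iterrows()):
        value = row.iloc[0]
        if 1.0 <= value <= 2.0:
            df.iloc[pos, 0] = -99
        elif 0 <= value <= 0.5:
            total = row.sum()
            # a row sum that lands in [1.0, 2.0] is replaced by -99 as well
            df.iloc[pos, 0] = -99 if 1.0 <= total <= 2.0 else total
    return df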
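As for why the loop in the question produced values like 3.4 and -298.5: df.iloc[:, 0] = df.sum(axis=1) assigns to the whole first column, not only to the row currently being looked at, so every row gets its sum written in each time a matching value is seen. A small sketch of a single such assignment on the sample data (my own reproduction, not taken from the question):

```python
import pandas as pd

df = pd.DataFrame({'A': [0.4, 0.0, 10.0, 1.5, 0.1, 0.5],
                   'B': [1.0, 0.5, 0.0, -100.0, 0.1, -10.0]})

# One execution of the line from the question's loop body:
df.iloc[:, 0] = df.sum(axis=1)

# Every row of A now holds A + B, including row 3,
# whose 1.5 was supposed to become -99 rather than 1.5 + (-100).
print(df)
```

Indexing a single cell instead, for example df.iloc[pos, 0] with a row position pos, limits the write to one row.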