I have a dataframe as follows:
mea1 | mea2 | mea3 |
---|---|---|
0.38 | 0.11 | 0.02 |
0.32 | 0.12 | 0.03 |
I would like to check each row against the conditions below and store the result (0 or 1) in a new column called FC.
If mea1 is between 0.3 and 0.4 then FC = 1, else 0
If mea2 is between 0.10 and 0.11 and FC != 0, then FC = 1, else 0
If mea3 is between 0.01 and 0.05 and FC != 0, then FC = 1, else 0
Expected result, where (1) means the condition is met and (0) means it failed:
mea1 | mea2 | mea3 | FC |
---|---|---|---|
0.38 (1) | 0.11 (1) | 0.02 (1) | 1 |
0.32 (1) | 0.12 (0) | 0.03 (1) | 0 |
I was able to get this result with nested for-loops, but the program took forever to run on a table with 10,000 entries.
Here is my latest attempt using a lambda function, without the FC check yet (col is mea1, etc.):
df['FC'] = df.apply(lambda x: 1 if ((x[x[col] >= lower_limit]) & (x[x[col] <= upper_limit])) else 0 , axis = 1 )
Thanks!
CodePudding user response:
I think you can do it like this:
df['FC'] = (
    (df['mea1'] >= 0.3) & (df['mea1'] <= 0.4)
    & (df['mea2'] >= 0.1) & (df['mea2'] <= 0.11)
    & (df['mea3'] >= 0.01) & (df['mea3'] <= 0.05)
) * 1
CodePudding user response:
"Iteration" and pandas don't really mix. If you think iteration is the only way to do something in pandas, you're probably wrong. Most of the time there is a fast, vectorized solution.
df['FC'] = (df.mea1.between(0.3, 0.4, inclusive='both')
& df.mea2.between(0.1, 0.11, inclusive='both')
& df.mea3.between(0.01, 0.05, inclusive='both')).astype(int)
Output:
mea1 mea2 mea3 FC
0 0.38 0.11 0.02 1
1 0.32 0.12 0.03 0
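If the number of columns grows, the same idea can be driven from a table of limits instead of being written out by hand. The following is only a sketch under assumptions: the `limits` dictionary and the `mask` variable are hypothetical names, the sample frame is rebuilt from the two rows in the question, and `between` is called with its default bounds, which include both ends.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "mea1": [0.38, 0.32],
    "mea2": [0.11, 0.12],
    "mea3": [0.02, 0.03],
})

# Hypothetical mapping of column name -> (lower, upper) limits from the question.
limits = {
    "mea1": (0.3, 0.4),
    "mea2": (0.10, 0.11),
    "mea3": (0.01, 0.05),
}

# Start with an all-True mask and AND in each column's range check.
mask = pd.Series(True, index=df.index)
for col, (lower, upper) in limits.items():
    mask &= df[col].between(lower, upper)  # both bounds are included by default

# A row gets FC = 1 only if every column is inside its range.
df["FC"] = mask.astype(int)
print(df)
```

Because each condition in the question only passes when the previous ones passed, combining all range checks with AND gives the same FC as checking them one after another.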
CodePudding user response:
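import pandas as pd

def assignFC(row: pd.Series) -> int:
    # FC starts at 0 and becomes 1 only if mea1 is in range.
    fc = 0
    if row["mea1"] >= 0.3 and row["mea1"] <= 0.4:
        fc = 1
    # Each later check can only keep FC at 1 if it is still 1.
    if row["mea2"] >= 0.1 and row["mea2"] <= 0.11 and fc == 1:
        fc = 1
    else:
        fc = 0
    if row["mea3"] >= 0.01 and row["mea3"] <= 0.05 and fc == 1:
        fc = 1
    else:
        fc = 0
    return fc

# Apply the function to every row (axis=1).
df["FC"] = df.apply(assignFC, axis=1)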
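A row-wise `apply` like this still calls a Python function once per row, so on a table with 10,000 entries it is worth checking it against a vectorized version. The sketch below does that on made-up data: the random frame, the seed, and the names `assign_fc`, `row_wise`, and `vectorized` are assumptions for illustration, not part of the question.

```python
import numpy as np
import pandas as pd

# Made-up data: 10,000 rows drawn around the limits from the question.
rng = np.random.default_rng(0)
n = 10_000
df = pd.DataFrame({
    "mea1": rng.uniform(0.25, 0.45, n),
    "mea2": rng.uniform(0.09, 0.12, n),
    "mea3": rng.uniform(0.00, 0.06, n),
})

def assign_fc(row: pd.Series) -> int:
    # Row-wise version: 1 only if all three values are inside their ranges.
    ok = (
        0.3 <= row["mea1"] <= 0.4
        and 0.1 <= row["mea2"] <= 0.11
        and 0.01 <= row["mea3"] <= 0.05
    )
    return int(ok)

# One Python call per row.
row_wise = df.apply(assign_fc, axis=1)

# Same conditions evaluated on whole columns at once.
vectorized = (
    df["mea1"].between(0.3, 0.4)
    & df["mea2"].between(0.1, 0.11)
    & df["mea3"].between(0.01, 0.05)
).astype(int)

# Both approaches should give the same FC for every row.
print((row_wise == vectorized).all())
```

If the two results agree, the vectorized expression can replace the `apply` call; wrapping each computation in `time.perf_counter()` calls is one way to compare their run times on your own machine.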