Here's some made-up data:
import pandas as pd
import numpy as np
import operator
rows,cols = 8760,3
data = np.random.rand(rows,cols)
tidx = pd.date_range('2019-01-01', periods=rows, freq='1min')
df = pd.DataFrame(data, columns=['Mix_Temp','Outside_Temp','Return_Temp'], index=tidx)
How can I create another pandas column that is Boolean True or False based on a fault condition that I incorporated into a function:
def fault_condition_two_(dataframe):
    return operator.truth(dataframe.Mix_Temp + dataframe.mat_err < min((dataframe.Return_Temp - dataframe.rat_err), (dataframe.Outside_Temp - dataframe.oat_err)))
There are some additional params below. If I try this:
# params
rat_err = 2.
mat_err = 5.
oat_err = 5.
# make columns out of params
df['rat_err'] = rat_err
df['mat_err'] = mat_err
df['oat_err'] = oat_err
# run data thru function
df['bool_flag'] = fault_condition_two_(df)
I'll get the famous ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I can't seem to find a good solution for it. Any tips greatly appreciated.
Note that I did have luck using Python's operator module with the function below, on a different fault condition equation. What's best practice? This function works fine, no errors:
def fault_condition_one(dataframe):
    return operator.and_(dataframe.duct_static < dataframe.duct_static_setpoint,
                         dataframe.vfd_speed >= dataframe.vfd_speed_percent_max - dataframe.vfd_speed_percent_err_thres)
CodePudding user response:
IIUC, use the NumPy version of min, numpy.minimum, which compares element by element. operator.truth is not necessary here:
def fault_condition_two_(dataframe):
    return (dataframe.Mix_Temp + dataframe.mat_err < np.minimum((dataframe.Return_Temp - dataframe.rat_err), (dataframe.Outside_Temp - dataframe.oat_err)))
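The reason the built-in min fails is that it needs a single True/False to decide which argument is smaller, and comparing two Series gives a Series instead. Here is a minimal sketch of the numpy.minimum approach on a tiny hand-made frame so the result can be checked by eye. The frame `demo`, its values, and the function name `fault_condition_two` with the error terms passed as keyword arguments (rather than stored as columns) are assumptions for illustration, not part of the original question.

```python
import numpy as np
import pandas as pd

# Hypothetical values, chosen only so each row is easy to verify by hand
demo = pd.DataFrame({
    'Mix_Temp':     [50., 60., 70.],
    'Outside_Temp': [70., 60., 90.],
    'Return_Temp':  [72., 75., 74.],
})

def fault_condition_two(dataframe, mat_err=5., rat_err=2., oat_err=5.):
    # np.minimum works element by element and returns a Series,
    # unlike the built-in min, which needs one scalar truth value
    limit = np.minimum(dataframe.Return_Temp - rat_err,
                       dataframe.Outside_Temp - oat_err)
    # The comparison is also element-wise, so this is a boolean Series
    return dataframe.Mix_Temp + mat_err < limit

demo['bool_flag'] = fault_condition_two(demo)
print(demo)
```

Passing the error terms as scalars relies on pandas broadcasting, so the three extra constant columns are optional; keeping them as columns works just as well.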
CodePudding user response:
An alternative using only direct dataframe manipulation (being below the minimum of two values is the same as being below both of them):
((df.Mix_Temp + df.mat_err) < (df.Return_Temp - df.rat_err))\
& ((df.Mix_Temp + df.mat_err) < (df.Outside_Temp - df.oat_err))
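As a quick sanity check, the sketch below compares the `&` form against the numpy.minimum form on random data. The seed, the scaling by 100, and the variable names `lhs`, `via_and`, and `via_minimum` are assumptions made for this illustration; the error terms are kept as plain scalars.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the check is repeatable
df = pd.DataFrame(rng.random((1000, 3)) * 100,
                  columns=['Mix_Temp', 'Outside_Temp', 'Return_Temp'])
mat_err, rat_err, oat_err = 5., 2., 5.

lhs = df.Mix_Temp + mat_err

# Two element-wise comparisons combined with the element-wise AND operator
via_and = (lhs < df.Return_Temp - rat_err) & (lhs < df.Outside_Temp - oat_err)

# One comparison against the element-wise minimum of the two limits
via_minimum = lhs < np.minimum(df.Return_Temp - rat_err,
                               df.Outside_Temp - oat_err)

same = via_and.equals(via_minimum)
print(same)
```

The `&` operator here does the same job as operator.and_ in the question's working function: both act element by element on boolean Series, which is why that function never hit the ambiguous-truth-value error.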