I'm trying to do some data validation with numpy and I don't really understand this error. See code below:
conditions = [
(np.where(df['DD'].eq('No DD Required'))),
(np.where(df['DD'].eq('DD Required'))) & (np.where(df['On Direct Debit'].eq('No'))),
(np.where(df['DD'].eq('DD Required'))) & (np.where(df['On Direct Debit'].eq('Yes')))
]
values = ['Pass', 'Fail', 'Pass']
df['DDValidation'] = np.select(conditions, values)
CodePudding user response:
Your & operator is not the one you wanted. & in Python is a bitwise AND, not the logical AND, which is and.
Change your code to something like:
conditions = [
(np.where(df['DD'].eq('No DD Required'))),
(np.where(df['DD'].eq('DD Required'))) and (np.where(df['On Direct Debit'].eq('No'))),
(np.where(df['DD'].eq('DD Required'))) and (np.where(df['On Direct Debit'].eq('Yes')))
]
For more info, see a discussion of the distinction between these different ANDs.
CodePudding user response:
This has more of a chance of working. You haven't provided a sample df, so I can't test it.
conditions = [
df['DD'].eq('No DD Required'),
(df['DD'].eq('DD Required')) & (df['On Direct Debit'].eq('No')),
(df['DD'].eq('DD Required')) & (df['On Direct Debit'].eq('Yes'))
]
Look at what np.select expects as its first argument:
condlist : list of bool ndarrays
Each of those lines needs to be a bool numpy array.
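As a rough, self-contained sketch of the point about condlist: the DataFrame below is made up for illustration (the question did not include one), and the default value 'Unknown' is also an assumption. Each entry in conditions is a boolean Series, which np.select accepts as a bool array.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the original question did not provide a DataFrame.
df = pd.DataFrame({
    'DD': ['No DD Required', 'DD Required', 'DD Required'],
    'On Direct Debit': ['No', 'No', 'Yes'],
})

# Each condition is a boolean Series with one entry per row.
conditions = [
    df['DD'].eq('No DD Required'),
    df['DD'].eq('DD Required') & df['On Direct Debit'].eq('No'),
    df['DD'].eq('DD Required') & df['On Direct Debit'].eq('Yes'),
]
values = ['Pass', 'Fail', 'Pass']

# 'Unknown' is an assumed fallback for rows matching no condition;
# a string default keeps it consistent with the string choices.
df['DDValidation'] = np.select(conditions, values, default='Unknown')
print(df)
```

Here & combines the two boolean Series element by element, so each condition stays the same length as the DataFrame.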
When developing and debugging code, test each step. Don't guess or assume. Look at what np.where(...) produces. Does that look at all like a bool array? And read the docs for np.where and np.select, for a start.
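One way to carry out that check, sketched on a small made-up column (the sample values are assumptions, not the asker's data): compare the boolean mask itself with what np.where returns when it is given only a condition.

```python
import numpy as np
import pandas as pd

# Hypothetical one-column sample for inspection.
df = pd.DataFrame({'DD': ['No DD Required', 'DD Required', 'DD Required']})

mask = df['DD'].eq('No DD Required')
where_result = np.where(mask)

# The mask is a boolean Series, one value per row.
print(type(mask), mask.dtype)

# With a single argument, np.where returns a tuple of index arrays
# (the positions where the condition is True), not a bool array.
print(type(where_result))
print(where_result)
```

The tuple of indices has a different shape and meaning from a per-row boolean mask, which is why it does not fit as an entry in condlist.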