I am trying to implement my own function with the data set below:
import pandas as pd
import numpy as np
data = {
    'sales': ['0','1','2','2','6','5','6'],
}
df = pd.DataFrame(data, columns = ['sales'])
df
Now I want to apply a function that returns 1 only for the value '2', 2 only for the value '6', and 3 for all other values. To do this, I tried this function:
def function_test(data):
    sales = df['sales']
    conditions = [
        (sales == '6'),
        (sales == '2'),
        (sales <> '6'&'2')  #<----This row
    ]
    values = [
        1,
        2,
        3
    ]
    dummy = np.select(conditions, values)
    return (dummy)
But this function has a problem with the third condition. Can anybody help me solve it?
CodePudding user response:
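One way to fix it is to use != instead of <> (the <> operator no longer exists in Python 3) and to use two separate comparisons combined with &, since '6'&'2' is not a valid way to test against two values. (I also changed the order of the conditions to match what you described in the text of your question):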
((sales != '6') & (sales != '2'))
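As an aside, the same "neither '6' nor '2'" mask can also be written as a negated membership test. This is only a minimal sketch, assuming the sales column holds strings as in the question; cond_a and cond_b are illustrative names, not part of the original code:

```python
import pandas as pd

# Same values as the 'sales' column in the question
sales = pd.Series(['0', '1', '2', '2', '6', '5', '6'], name='sales')

# Two explicit comparisons, combined element-wise with &
cond_a = (sales != '6') & (sales != '2')

# Equivalent form: test membership with isin, then negate with ~
cond_b = ~sales.isin(['6', '2'])

print(cond_a.tolist())
print(cond_a.equals(cond_b))
```

The isin form tends to scale better if more values need to be excluded later, since you only extend the list.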
Full test code:
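def function_test(df):
    # Use the DataFrame passed in rather than relying on a global
    sales = df['sales']
    conditions = [
        (sales == '2'),
        (sales == '6'),
        # Everything that is neither '6' nor '2'
        ((sales != '6') & (sales != '2'))
    ]
    values = [1, 2, 3]
    # For each element, np.select takes the value of the first condition that is True
    dummy = np.select(conditions, values)
    return dummy

print(function_test(df))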
Results:
[3 3 1 1 2 3 2]
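Another option, if you would rather not spell out the "everything else" condition at all, is the default argument of np.select, which supplies the value used where no condition matches. A minimal sketch under the same assumptions as the question (string values in a 'sales' column); label_sales is a hypothetical name chosen for this example:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'sales': ['0', '1', '2', '2', '6', '5', '6']})

def label_sales(frame):
    # Hypothetical helper: 1 for '2', 2 for '6', 3 for anything else
    sales = frame['sales']
    conditions = [sales == '2', sales == '6']
    values = [1, 2]
    # default is used for rows where none of the conditions is True
    return np.select(conditions, values, default=3)

result = label_sales(df)
print(result)
```

With the sample data this should give the same labels as the three-condition version, and there is no third condition to keep in sync with the first two.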