As a simple example, I have a dataframe that looks like this (shown here as a dict, but it is really a dataframe):
import pandas as pd

data = {'value': [1, 2, 3, 4, 5, 1, 3, 4, 6],
        'limit': [4, 4, 4, 4, 3, 3, 3, 1, 1],
}
data = pd.DataFrame(data)
I need to add an extra column to the dataframe so that it looks like this:
data = {'value': [1, 2, 3, 4, 5, 1, 3, 4, 6],
'limit': [4, 4, 4, 4, 3, 3, 3, 1, 1],
'criteria': [0, 0, 0, 0, 1, 0, 0, 1, 1]
}
''' The criteria is determined as follows (example):
Looking at row 0:
value is 1.
limit is 4.
diff is -3.
Given that -3 < 0 (less than 0),
criteria is 0.
Looking at row 4:
value is 5.
limit is 3.
diff is 2 (higher than 0).
Criteria is 1.'''
I tried the following:
data['criteria'] = data[((data['value'] - data['limit']) > 0) == 1]
However, it does not work, because I am not actually assigning any 0/1 values to the criteria column.
CodePudding user response:
A possible solution, which uses numpy.where:
import numpy as np
data['criteria'] = np.where((data['value'] - data['limit']) > 0, 1, 0)
Output:
   value  limit  criteria
0      1      4         0
1      2      4         0
2      3      4         0
3      4      4         0
4      5      3         1
5      1      3         0
6      3      3         0
7      4      1         1
8      6      1         1
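As a side note, testing whether the difference is positive is the same as comparing the two columns directly. Below is a minimal, self-contained sketch using the sample data from the question; the names via_diff and via_compare are only illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data from the question
data = pd.DataFrame({
    'value': [1, 2, 3, 4, 5, 1, 3, 4, 6],
    'limit': [4, 4, 4, 4, 3, 3, 3, 1, 1],
})

# Condition written as in the answer: the difference is greater than 0
via_diff = np.where((data['value'] - data['limit']) > 0, 1, 0)

# Same condition written as a direct comparison (illustrative alternative)
via_compare = np.where(data['value'] > data['limit'], 1, 0)

# Either array can be assigned as the new column
data['criteria'] = via_compare
```

For this integer data both forms give the same column, so which one to use is mostly a matter of readability.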
CodePudding user response:
You can get your desired result by simply computing the test as a boolean and converting to int:
data['criteria'] = ((data['value'] - data['limit']) > 0).astype(int)
Output:
   value  limit  criteria
0      1      4         0
1      2      4         0
2      3      4         0
3      4      4         0
4      5      3         1
5      1      3         0
6      3      3         0
7      4      1         1
8      6      1         1
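To illustrate why the attempt in the question fails while this approach works, here is a small sketch, assuming the same sample data. Indexing a dataframe with a boolean mask selects rows, whereas converting the mask itself yields one 0/1 value per row. The names mask and filtered are hypothetical and chosen only for this example.

```python
import pandas as pd

# Sample data from the question
data = pd.DataFrame({
    'value': [1, 2, 3, 4, 5, 1, 3, 4, 6],
    'limit': [4, 4, 4, 4, 3, 3, 3, 1, 1],
})

# Boolean Series: True where value exceeds limit
mask = (data['value'] - data['limit']) > 0

# data[mask] keeps only the rows where the mask is True,
# so it is a smaller dataframe, not a column of flags
filtered = data[mask]

# Converting the mask itself gives one integer per original row
data['criteria'] = mask.astype(int)
```

Here filtered contains only the rows at index 4, 7 and 8, which is why it cannot serve directly as the new column.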
If you just want to know how many times the test was true, simply sum the test results:
count = ((data['value'] - data['limit']) > 0).sum()
Output:
3
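The counting step works because True counts as 1 and False as 0 when summed. A minimal sketch, assuming the same sample data; the variable share is a hypothetical extra, included only to show that the mean of the mask gives the fraction of rows that pass.

```python
import pandas as pd

# Sample data from the question
data = pd.DataFrame({
    'value': [1, 2, 3, 4, 5, 1, 3, 4, 6],
    'limit': [4, 4, 4, 4, 3, 3, 3, 1, 1],
})

# Boolean Series: True where value exceeds limit
mask = data['value'] > data['limit']

# Summing booleans counts the True entries
count = int(mask.sum())

# The mean of booleans is the fraction of True entries
share = mask.mean()
```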