How to create a column with the numbers 0 or 1 when subtracting two columns in the same dataframe is

Time:09-27

As a simple example, I have a dataframe that looks like this (I am writing it as a dict here, but it is really a dataframe):

import pandas as pd

data = {'value': [1, 2, 3, 4, 5, 1, 3, 4, 6],
        'limit': [4, 4, 4, 4, 3, 3, 3, 1, 1],
        }
data = pd.DataFrame(data)

I need to add an extra column to the dataframe so that it looks like this:

data = {'value': [1, 2, 3, 4, 5, 1, 3, 4, 6],
        'limit': [4, 4, 4, 4, 3, 3, 3, 1, 1],
        'criteria': [0, 0, 0, 0, 1, 0, 0, 1, 1]
        }

''' The criteria is determined as follows (example):
    Looking at row 0:
    value is 1.
    limit is 4.
    diff is -3.
    Given that -3 < 0 (less than 0),
    criteria is 0.

    Looking at row 4:
    value is 5.
    limit is 3.
    diff is 2 (greater than 0).
    Criteria is 1.'''


data['criteria'] = data[((data['value'] - data['limit']) > 0) == 1]

However, this does not work, because I am not really assigning any 0 or 1 values to the criteria column.

CodePudding user response:

A possible solution, which uses numpy.where:

import numpy as np

data['criteria'] = np.where((data['value'] - data['limit']) > 0, 1, 0)

Output:

   value  limit  criteria
0      1      4         0
1      2      4         0
2      3      4         0
3      4      4         0
4      5      3         1
5      1      3         0
6      3      3         0
7      4      1         1
8      6      1         1
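
For context, here is a minimal self-contained sketch of why the attempt in the question does not give a 0/1 column, and how np.where differs. The variable names mask and selected are only illustrative and do not appear in the question; the data is the example from the question.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'value': [1, 2, 3, 4, 5, 1, 3, 4, 6],
                     'limit': [4, 4, 4, 4, 3, 3, 3, 1, 1]})

# Boolean Series: True where value - limit is strictly positive
mask = (data['value'] - data['limit']) > 0

# Indexing the dataframe with the mask selects the matching rows;
# it does not produce a column of 0s and 1s
selected = data[mask]
print(selected.shape)  # (3, 2): only the three rows where the test is True

# np.where maps the mask element-wise: 1 where True, 0 where False
data['criteria'] = np.where(mask, 1, 0)
print(data['criteria'].tolist())  # [0, 0, 0, 0, 1, 0, 0, 1, 1]
```

Rows where value equals limit get 0 here, because the test is a strict greater-than comparison.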

CodePudding user response:

You can get your desired result by simply computing the test as a boolean and converting to int:

data['criteria'] = ((data['value'] - data['limit']) > 0).astype(int)

Output:

   value  limit  criteria
0      1      4         0
1      2      4         0
2      3      4         0
3      4      4         0
4      5      3         1
5      1      3         0
6      3      3         0
7      4      1         1
8      6      1         1

If you just want to know how many times the test was true, simply sum the test results:

count = ((data['value'] - data['limit']) > 0).sum()

Output:

3
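
As a side note, for plain numeric columns like the ones in this example, the subtraction is not strictly needed: comparing the two columns directly should give the same result. The sketch below assumes the example data from the question; the name exceeds is only illustrative.

```python
import pandas as pd

data = pd.DataFrame({'value': [1, 2, 3, 4, 5, 1, 3, 4, 6],
                     'limit': [4, 4, 4, 4, 3, 3, 3, 1, 1]})

# Direct element-wise comparison; data['value'].gt(data['limit']) is equivalent
exceeds = data['value'] > data['limit']

# Convert the boolean result to 0/1 integers
data['criteria'] = exceeds.astype(int)

# True counts as 1, so summing the boolean Series counts the True rows
count = int(exceeds.sum())
print(count)  # 3
```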