I'm trying and researching a lot how to do this, but I'm having trouble mixing pandas with if, else and/or get values by index and compare it with if, else and assign a column with codes/values. Explanation: I have this table below, I want to compare the cells in the ID column, and if the value of the posterior cell is equal to the previous one AND if in the COD column the posterior cell is equal to the previous one, THEN Result column = "no", otherwise "pass" and if neither then "unknown"
This is the formula in excel that I made: =IF(B3=B2,IF(C3=C2,"NO","PASS"),"UNKNOWN").
Below I have also posted some code attempts. I can even create two columns with the first test (from the ID column cells) and the second test (from the COD column cells), and return with Boolean results, but I can't get the If, Else to join it all together and generate the values I want in another column. Would I appreciate it if someone could help me?
df = df.sort_values(by=['ID'])
df['matchesID'] = df['ID'].shift(1) == df['ID']
df['matchesCod']= df['Cod'].shift(1) == df['Cod']
or
df = df.sort_values(by=['ID'])
test = (df['SWENo'].shift(1) == df['SWENo']) & (df['Cod'].shift(1) == df['Cod'])
I was trying something like this below
if df['ID'].shift(1) == df['ID'] and df['Cod'].shift(1) == df['Cod']:
listProg.append('not')
elif df['ID'].shift(1) == df['ID'] and df['Cod'].shift(1) != df['Cod']:
listProg.append('pass')
else:
listProg.append('Unknown')
But the result is: "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()".
If you can help me I appreciate it, it can be with pandas or not or mixing. I just need it to work. Thank you guys.
CodePudding user response:
Similar approach in pandas will be to use numpy.where
function.
With this code:
import numpy as np
df['Result'] = np.where(df['ID'] == df['ID'].shift(), np.where(df['Cod'] == df['Cod'].shift(), 'NO', 'PASS'), 'UNKNOWN')
I get below results:
ID Cod Result
0 1 1 UNKNOWN
1 2 1 UNKNOWN
2 2 1 NO
3 3 1 UNKNOWN
4 4 1 UNKNOWN
5 4 2 PASS
6 4 2 NO
7 5 1 UNKNOWN
8 6 1 UNKNOWN
which seems more inline with your description of how Result value is derived.