I'd like to create an extra column in Python, using either pandas or NumPy, based on a condition checked per group (I think iterating is the way to do it). **If any value is False among the rows with the same IDxx, the extra column is IN; otherwise it is OUT.**
| IDxx | Tru/Fal |
| ------ | -------- |
| 164 | True |
| 164 | False |
| 164 | False |
| 165 | True |
| 165 | True |
| 165 | True |
| 166 | False |
| 166 | True |
| 166 | True |
| 167 | True |
| 167 | True |
| 167 | False |
I tried a few options but I'm running out of ideas. Because each group has a different IDxx, I can't get the loop working. There are only 4 IDxx values in this example, but in my real case there are hundreds. I'd like the output to look like this:
| IDxx | Tru/Fal | Answer |
| ------ | -------- | ------ |
| 164 | True | IN |
| 164 | False | IN |
| 164 | False | IN |
| 165 | True | OUT |
| 165 | True | OUT |
| 165 | True | OUT |
| 166 | False | IN |
| 166 | True | IN |
| 166 | True | IN |
| 167 | True | IN |
| 167 | True | IN |
| 167 | False | IN |
CodePudding user response:
Use `groupby` and `replace` as follows.
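# The group minimum of a boolean column is False exactly when the group contains a False
has_false = df.groupby('IDxx')['Tru/Fal'].min() == False
idx_gb = has_false.map({True: 'IN', False: 'OUT'}).to_dict()  # key: IDxx, value: 'IN' if that IDxx includes a False, else 'OUT'
df['Answer'] = df['IDxx'].replace(idx_gb)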
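As an alternative, here is a minimal sketch that skips the intermediate dictionary. It assumes the `Tru/Fal` column holds real booleans (not the strings "True"/"False") and rebuilds the sample data from the question so that it runs on its own. The name `has_false` is only illustrative. It uses `groupby(...).transform('any')` to broadcast a per-group flag back to every row, then `np.where` to turn that flag into the labels.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "IDxx": [164, 164, 164, 165, 165, 165, 166, 166, 166, 167, 167, 167],
    "Tru/Fal": [True, False, False, True, True, True,
                False, True, True, True, True, False],
})

# Invert the column so that False becomes True, then ask, per IDxx group,
# whether any row is True. transform() returns one value per original row.
has_false = (~df["Tru/Fal"]).groupby(df["IDxx"]).transform("any")

# Label every row of a group that contains at least one False as IN
df["Answer"] = np.where(has_false, "IN", "OUT")
print(df)
```

Because `transform` keeps the original index, no loop over the IDxx values is needed, which should matter once there are hundreds of groups.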