I am a beginner in Python and have the following df:
print(df)
>>>
NAME KIDS TEENS
Steve 1 0
John 0 1
Peter 0 2
Frank 0 0
Jessica 1 0
Donny 0 0
My goal is to create a new column called "Children", which indicates - based on the values from the columns KIDS and TEENS - if the person has Children or not. My idea would be to use an IF-Statement.
IF "KIDS" + "TEENS" > 0 THEN CHILDREN = "YES" ELSE CHILDREN = "NO".
Is this a good approach and how can I achieve it?
The final result should look something like this:
print(df)
>>>
NAME KIDS TEENS CHILDREN
Steve 1 0 YES
John 0 1 YES
Peter 0 2 YES
Frank 0 0 NO
Jessica 1 0 YES
Donny 0 0 NO
CodePudding user response:
You can use the following
df['CHILDREN'] = (df['KIDS'] | df['TEENS']).astype(bool)
Because 0 is falsy and any nonzero number is truthy, this will be False if both are 0 and True if either KIDS or TEENS is > 0.
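I would recommend using boolean values (True/False) to represent this, but if you want the strings you could follow this with df['CHILDREN'] = df['CHILDREN'].replace({True: 'YES', False: 'NO'}). Note that replace returns a new Series, so the result has to be assigned back to the column.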
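Putting the two steps together, here is a minimal sketch that assumes the column names and sample data from the question. It uses map instead of replace for the string conversion, which is my own substitution and not a requirement; the intermediate name has_children is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'NAME': ['Steve', 'John', 'Peter', 'Frank', 'Jessica', 'Donny'],
    'KIDS': [1, 0, 0, 0, 1, 0],
    'TEENS': [0, 1, 2, 0, 0, 0],
})

# Bitwise OR of two integer columns is nonzero whenever either count is nonzero,
# so casting the result to bool gives True for anyone with kids or teens.
has_children = (df['KIDS'] | df['TEENS']).astype(bool)

# Optional step: turn the booleans into the strings asked for in the question.
df['CHILDREN'] = has_children.map({True: 'YES', False: 'NO'})

print(df)
```

Keeping the boolean Series around (rather than only the strings) makes later filtering easier, for example df[has_children].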
CodePudding user response:
You can use numpy.where:
df['CHILDREN'] = np.where(df.KIDS + df.TEENS > 0, 'YES', 'NO')
Output:
>>> df
NAME KIDS TEENS CHILDREN
0 Steve 1 0 YES
1 John 0 1 YES
2 Peter 0 2 YES
3 Frank 0 0 NO
4 Jessica 1 0 YES
5 Donny 0 0 NO
Setup used:
import pandas as pd
import numpy as np
df = pd.DataFrame({
'NAME': ['Steve', 'John', 'Peter', 'Frank', 'Jessica', 'Donny'],
'KIDS': [1, 0, 0, 0, 1, 0],
'TEENS': [0, 1, 2, 0, 0, 0]
})
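The numpy.where call above relies on the two counts adding up to something positive. As a hedged variation (not part of the original answer), the condition could instead check each column on its own and combine the results with any, which reads closer to "has at least one kid or teen". This sketch assumes the same sample data; has_children is just an illustrative name.

```python
import numpy as np
import pandas as pd

# Same sample data as in the setup.
df = pd.DataFrame({
    'NAME': ['Steve', 'John', 'Peter', 'Frank', 'Jessica', 'Donny'],
    'KIDS': [1, 0, 0, 0, 1, 0],
    'TEENS': [0, 1, 2, 0, 0, 0],
})

# Compare each count column with 0, then check per row whether any is True.
has_children = (df[['KIDS', 'TEENS']] > 0).any(axis=1)

# np.where picks 'YES' where the condition holds and 'NO' elsewhere.
df['CHILDREN'] = np.where(has_children, 'YES', 'NO')

print(df)
```

For non-negative counts like these, both conditions give the same result; the per-column check only behaves differently if a column could contain negative values.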
CodePudding user response:
Here is one way to do it:
df['Children'] = (df['KIDS'] + df['TEENS']).astype(bool)
OR
df['Children'] = df[['KIDS', 'TEENS']].sum(axis=1).astype(bool)
NAME KIDS TEENS Children
0 Steve 1 0 True
1 John 0 1 True
2 Peter 0 2 True
3 Frank 0 0 False
4 Jessica 1 0 True
5 Donny 0 0 False
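A row-wise sum only makes sense over the numeric columns, since NAME holds strings. If there were more count columns than just KIDS and TEENS, one option might be to pick the numeric columns with select_dtypes rather than listing them by name. This is only a sketch under the assumption that every numeric column in the frame is a count of children:

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'NAME': ['Steve', 'John', 'Peter', 'Frank', 'Jessica', 'Donny'],
    'KIDS': [1, 0, 0, 0, 1, 0],
    'TEENS': [0, 1, 2, 0, 0, 0],
})

# Keep only the numeric columns (KIDS and TEENS here), sum them per row,
# and treat any nonzero total as True.
df['Children'] = df.select_dtypes(include='number').sum(axis=1).astype(bool)

print(df)
```

If the frame also holds unrelated numeric columns (an age, an ID), naming the columns explicitly is the safer choice.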