I want to create a new column and assign its values using the logic below:
if In > Out, then give the value 1
else give the value 0
The code below works fine, but I would like something more "readable", like in other languages, say SAS.
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3, 4, 5, 6, 7, 8, 9],
                   'In': [111, 100, 31, 1100, 12, 33, 21, 32, 33],
                   'Out': [24, 52, 34, 95, 98, 54, 32, 20, 16]})
print(df)

conditions = [
    df['In'] >= df['Out'],
    df['In'] < df['Out'],
]
choices = [df['In'].shift(1), 0]
df['check'] = np.select(conditions, choices, default=np.nan)
print(df)
Answer:
Since you only have two conditions, just use np.where:
df['check'] = np.where(df['In'] >= df['Out'], df['In'].shift(), 0)
>>> df
   id    In  Out  check
0   1   111   24    NaN
1   2   100   52  111.0
2   3    31   34    0.0
3   4  1100   95   31.0
4   5    12   98    0.0
5   6    33   54    0.0
6   7    21   32    0.0
7   8    32   20   21.0
8   9    33   16   32.0
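If what you actually need is the plain 1/0 flag described in the question text (rather than the previous row's In value that the posted code produces), the same np.where call covers it. This is only a sketch under that reading of the question; the column names 'flag' and 'flag_alt' are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3, 4, 5, 6, 7, 8, 9],
                   'In': [111, 100, 31, 1100, 12, 33, 21, 32, 33],
                   'Out': [24, 52, 34, 95, 98, 54, 32, 20, 16]})

# 1 where In > Out, otherwise 0 ('flag' is a hypothetical column name)
df['flag'] = np.where(df['In'] > df['Out'], 1, 0)

# Same result without NumPy: cast the boolean comparison to int
df['flag_alt'] = (df['In'] > df['Out']).astype(int)

print(df)
```

Both columns should come out identical; the astype(int) form reads closest to "flag = (In > Out)" in other languages.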
Or, if you have more conditions, write a function:
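def func(x):
    if x['In'] >= x['Out']:
        # x.name is the row label; this lookup assumes the default 0..n-1 index
        if x.name:
            return df.loc[x.name - 1, 'In']
        else:
            # the first row has no previous row
            return np.nan
    elif x['In'] < x['Out']:
        return 0
    return np.nan

df['check'] = df.apply(func, axis=1)
print(df)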
Output:
   id    In  Out  check
0   1   111   24    NaN
1   2   100   52  111.0
2   3    31   34    0.0
3   4  1100   95   31.0
4   5    12   98    0.0
5   6    33   54    0.0
6   7    21   32    0.0
7   8    32   20   21.0
8   9    33   16   32.0
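One caveat with the function above: it reads the global df and relies on the index being 0, 1, 2, ... so that x.name - 1 points at the previous row. A possible variation, sketched here as an assumption rather than the only way to do it, is to precompute the shifted column first so the function only looks at its own row. The helper column name 'prev_in' is hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3, 4, 5, 6, 7, 8, 9],
                   'In': [111, 100, 31, 1100, 12, 33, 21, 32, 33],
                   'Out': [24, 52, 34, 95, 98, 54, 32, 20, 16]})

# Previous row's In; NaN for the first row ('prev_in' is a hypothetical helper)
df['prev_in'] = df['In'].shift()

def func(row):
    # Uses only values on the current row, so any index works
    if row['In'] >= row['Out']:
        return row['prev_in']
    return 0

df['check'] = df.apply(func, axis=1)

# Drop the helper once the result column exists
df = df.drop(columns='prev_in')
print(df)
```

This should give the same check column as above, and extra branches can be added to func without touching the index logic.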