I have a dataframe which looks like this:
date symbol numerator denominator
4522 2021-10-06 PAG.SG 1.0 18
1016 2020-11-23 IPA.V 1.0 5
412 2020-04-17 LRK.AX 1.0 30
1884 2021-06-03 BOUVETO.ST 1.0 1
2504 2021-04-28 VKGYO.IS 1.0 100
3523 2021-07-08 603355.SS 1.0 1
3195 2021-08-23 IDAI 1.0 1
3238 2021-08-19 6690.TWO 1.0 1000
3430 2021-07-19 CAXPD 1.0 10
2642 2021-04-15 035720.KS 1.0 1
dtypes: date: object symbol: object numerator: float64 denominator: int64
I am trying to use pd.assign
to assign a classifier to this df in the form of
df = df.assign(category = ['forward' if numerator > denominator else 'reverse' for numerator in df[['numerator', 'denominator']]])
But I'm receiving a TypeError stating:
TypeError: Invalid comparison between dtype=int64 and str
I have tried casting them explicitly, with:
df = df.assign(category = ['forward' if df['numerator'] > df['denominator'] else 'reverse' for df['numerator'] in df])
But receive another TypeError stating:
TypeError: '>' not supported between instances of 'str' and 'int'
Which is confusing because I'm not comparing strings, I'm comparing int and float.
Any help would be greatly appreciated.
CodePudding user response:
You still can do that with np.where
import numpy as np
df = df.assign(category = np.where(df['numerator']>df['denominator'],
'forward',
'reverse')