How do I use .assign for values in a column?

Time:02-28

I have a dataframe which looks like this:

        date        symbol      numerator   denominator
4522    2021-10-06  PAG.SG      1.0         18
1016    2020-11-23  IPA.V       1.0         5
412     2020-04-17  LRK.AX      1.0         30
1884    2021-06-03  BOUVETO.ST  1.0         1
2504    2021-04-28  VKGYO.IS    1.0         100
3523    2021-07-08  603355.SS   1.0         1
3195    2021-08-23  IDAI        1.0         1
3238    2021-08-19  6690.TWO    1.0         1000
3430    2021-07-19  CAXPD       1.0         10
2642    2021-04-15  035720.KS   1.0         1

dtypes: date: object, symbol: object, numerator: float64, denominator: int64

I am trying to use df.assign to add a classifier column to this df, like this:

df = df.assign(category = ['forward' if numerator > denominator else 'reverse' for numerator in df[['numerator', 'denominator']]])

But I'm receiving a TypeError stating: TypeError: Invalid comparison between dtype=int64 and str

I have also tried referencing the columns explicitly, with:

df = df.assign(category = ['forward' if df['numerator'] > df['denominator'] else 'reverse' for df['numerator'] in df])

But receive another TypeError stating: TypeError: '>' not supported between instances of 'str' and 'int'

This is confusing, because I'm not comparing strings; I'm comparing an int and a float.

Any help would be greatly appreciated.

CodePudding user response:

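The list comprehensions fail because iterating over a DataFrame yields its column labels (strings), not its row values, so a str ends up in the comparison. You can still use .assign if you compare the two columns directly and let np.where pick the label: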

import numpy as np

# Compare the columns element-wise; np.where maps True to 'forward' and False to 'reverse'.
df = df.assign(category=np.where(df['numerator'] > df['denominator'],
                                 'forward',
                                 'reverse'))
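
For anyone who wants to try this end to end, here is a minimal, self-contained sketch. It assumes a small frame built from three of the rows in the question plus one made-up row (the symbol EXAMPLE is hypothetical, added only so that both labels appear). The names labels and is_forward are illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Three rows copied from the question, plus one HYPOTHETICAL row ("EXAMPLE")
# whose numerator exceeds its denominator, so both labels show up.
df = pd.DataFrame(
    {
        "date": ["2021-10-06", "2020-11-23", "2021-06-03", "2021-01-01"],
        "symbol": ["PAG.SG", "IPA.V", "BOUVETO.ST", "EXAMPLE"],
        "numerator": [1.0, 1.0, 1.0, 3.0],
        "denominator": [18, 5, 1, 1],
    }
)

# Iterating over a DataFrame yields its column labels (strings), not row
# values, which is why the list comprehensions in the question fail.
labels = [col for col in df[["numerator", "denominator"]]]

# Vectorised, element-wise comparison: one boolean per row.
is_forward = df["numerator"] > df["denominator"]

# np.where picks 'forward' where the mask is True and 'reverse' elsewhere.
df = df.assign(category=np.where(is_forward, "forward", "reverse"))

print(labels)
print(df)
```

Note that a row where numerator equals denominator (such as BOUVETO.ST) is labelled 'reverse', because the comparison is a strict >. If ties should be 'forward', use >= instead.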