Is there a way of using isin() as calculator function for another column in pandas dataframe?

Time:10-01

I have a column 'PRODUCT_ID' in my pandas DataFrame. I want to create a calculated column based on it: rows whose PRODUCT_ID is in [3, 5, 8] should get the value 'old', and all others 'new'.

Right now I'm using a for loop to check every single row of the DataFrame.

portfoy['PRODUCT_TYPE'] = np.nan

for ind in portfoy.index:
    if portfoy.loc[ind, 'PRODUCT_CODE'] in [3, 5, 8]:
        portfoy.loc[ind, 'PRODUCT_TYPE'] = 'old'
    else:
        portfoy.loc[ind, 'PRODUCT_TYPE'] = 'new'

This code seems to take a lot of time. Is there a better way to do this?

My data looks like:

CUSTOMER PRODUCT_ID other columns
2345 3 -------------
3456 5 -------------
2786 5 -------------

CodePudding user response:

Use numpy.where with Series.isin for a fast, vectorized solution:

portfoy['PRODUCT_TYPE'] = np.where(portfoy['PRODUCT_CODE'].isin([3, 5, 8]), 'old', 'new')
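To try this end to end, here is a minimal sketch. The sample DataFrame is hypothetical: it mirrors the three rows shown in the question and adds one extra row (customer 9999, code 7) so that both branches are exercised. The name old_codes is also just an illustrative choice.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data modelled on the question; the last row is
# added only so that the 'new' branch is exercised as well.
portfoy = pd.DataFrame({
    "CUSTOMER": [2345, 3456, 2786, 9999],
    "PRODUCT_CODE": [3, 5, 5, 7],
})

old_codes = [3, 5, 8]  # codes that should be labelled 'old'

# isin() builds a boolean Series in one vectorized pass;
# np.where() picks 'old' where it is True and 'new' elsewhere.
portfoy["PRODUCT_TYPE"] = np.where(
    portfoy["PRODUCT_CODE"].isin(old_codes), "old", "new"
)

print(portfoy)
```

Because both isin() and np.where() operate on the whole column at once, this avoids the per-row .loc lookups that make the loop slow.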

CodePudding user response:

You can use boolean masks to conditionally update the DataFrame:

portfoy.loc[portfoy.PRODUCT_CODE.isin([3,5,8]),'PRODUCT_TYPE'] = 'old'

portfoy.loc[~portfoy.PRODUCT_CODE.isin([3,5,8]),'PRODUCT_TYPE'] = 'new'

portfoy.PRODUCT_CODE.isin([3,5,8]) is the mask
~ is the negation of the mask
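As a rough sketch of the same idea, the mask can be computed once and reused, which avoids calling isin() twice. The sample data and the variable name is_old are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data modelled on the question; the last row is
# added only so that the negated mask selects something.
portfoy = pd.DataFrame({
    "CUSTOMER": [2345, 3456, 2786, 9999],
    "PRODUCT_CODE": [3, 5, 5, 7],
})

# Boolean Series: True where PRODUCT_CODE is one of 3, 5, 8.
is_old = portfoy["PRODUCT_CODE"].isin([3, 5, 8])

# Assign through .loc using the mask and its negation.
portfoy.loc[is_old, "PRODUCT_TYPE"] = "old"
portfoy.loc[~is_old, "PRODUCT_TYPE"] = "new"

print(portfoy)
```

Bracket access (portfoy["PRODUCT_CODE"]) is used here instead of attribute access; both select the same column, but brackets also work for column names that are not valid Python identifiers.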
