Is there a way to use isin() to compute another column in a pandas DataFrame?

I have a column 'PRODUCT_ID' in my pandas DataFrame. I want to create a calculated column based on it: rows whose PRODUCT_ID is in [3, 5, 8] should get the value 'old', and all others 'new'.

Right now I'm using a for loop to check every single index of the dataframe.

import numpy as np

portfoy['PRODUCT_TYPE'] = np.nan

for ind in portfoy.index:
    if portfoy.loc[ind, 'PRODUCT_CODE'] in [3, 5, 8]:
        portfoy.loc[ind, 'PRODUCT_TYPE'] = 'old'
    else:
        portfoy.loc[ind, 'PRODUCT_TYPE'] = 'new'

This code seems to take a lot of time. Is there a better way to do this?

My data looks like:

CUSTOMER	PRODUCT_ID	other columns
2345	3	-------------
3456	5	-------------
2786	5	-------------

CodePudding user response:

Use numpy.where with Series.isin for a fast, vectorized solution:

portfoy['PRODUCT_TYPE'] = np.where(portfoy['PRODUCT_CODE'].isin([3, 5, 8]), 'old', 'new')
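To try this end to end, here is a minimal sketch on a small made-up frame. The sample rows and the old_codes name are assumptions for illustration, and the column is called PRODUCT_CODE to match the code in the question (the sample data there shows PRODUCT_ID, so adjust the name to whatever your frame actually uses).

```python
import numpy as np
import pandas as pd

# Illustrative data only; the last row is added so that one 'new' value shows up.
portfoy = pd.DataFrame({
    "CUSTOMER": [2345, 3456, 2786, 9999],
    "PRODUCT_CODE": [3, 5, 5, 7],
})

old_codes = [3, 5, 8]  # hypothetical name for the list of "old" product codes

# isin() returns a boolean Series for the whole column at once;
# np.where then picks 'old' where it is True and 'new' elsewhere.
portfoy["PRODUCT_TYPE"] = np.where(
    portfoy["PRODUCT_CODE"].isin(old_codes), "old", "new"
)

print(portfoy)
```

Because both calls work on the whole column, there is no Python-level loop over the index, which is usually where the row-by-row version loses its time.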

CodePudding user response:

You can use boolean masks to conditionally update the DataFrame:

portfoy.loc[portfoy.PRODUCT_CODE.isin([3,5,8]),'PRODUCT_TYPE'] = 'old'

portfoy.loc[~portfoy.PRODUCT_CODE.isin([3,5,8]),'PRODUCT_TYPE'] = 'new'

portfoy.PRODUCT_CODE.isin([3,5,8]) is the mask
~ is the negation of the mask
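A possible variation on the same idea, sketched on made-up data: compute the mask once, fill the column with the default value, then overwrite only the matching rows. The sample rows and the mask variable name are assumptions, not part of the original post.

```python
import pandas as pd

# Illustrative data only; the column name follows the code in the question.
portfoy = pd.DataFrame({
    "CUSTOMER": [2345, 3456, 2786, 9999],
    "PRODUCT_CODE": [3, 5, 5, 7],
})

# Build the boolean mask a single time and reuse it.
mask = portfoy["PRODUCT_CODE"].isin([3, 5, 8])

# Start every row at the default, then overwrite the rows the mask selects.
portfoy["PRODUCT_TYPE"] = "new"
portfoy.loc[mask, "PRODUCT_TYPE"] = "old"

print(portfoy)
```

This avoids evaluating isin() twice and makes the negated mask unnecessary, at the cost of writing the default to every row first.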