Python: Can I create a dummy based on search conditions in one column with text series?

Time:12-18

I was wondering how I could create a dummy variable for the following condition: the column 'lemmatised' contains at least two words from 'innovation_words', a list I defined myself:

innovation_words = ['community', 'local', 'charity', 'event', 'partner',
                'volunteering', 'plastic', 'surplusfood']

The lemmatised column looks like this (I'm fine changing the type or formatting if needed):

[Image: data to use for condition]

So, if an observation includes, for example, both 'local' and 'plastic', I would like the dummy variable 'innovation' to equal 1. I hope someone can help me with this. Here is some code I already tried:

conditions = [df_posts['lemmatised'].isin(innovation_words), 
          df_posts['lemmatised'].isin(innovation_words)]

dummy = [1,0]

df_posts['innovation'] = np.select(conditions, dummy)

CodePudding user response:

Maybe you can try this:

df_posts['innovation'] = 0 
df_posts.loc[df_posts.lemmatised.isin(innovation_words), 'innovation'] = 1
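One caveat: isin compares each whole cell against the list, so it only flags rows whose entire text is exactly one innovation word. Below is a minimal sketch of counting distinct matches per row instead. The sample rows are made up (the real column was only posted as an image), and it assumes 'lemmatised' holds whitespace-separated strings with no missing values.

```python
import pandas as pd

innovation_words = ['community', 'local', 'charity', 'event', 'partner',
                    'volunteering', 'plastic', 'surplusfood']

# Hypothetical sample data; the real 'lemmatised' column is not shown in the thread.
df_posts = pd.DataFrame({'lemmatised': ['local plastic cleanup',
                                        'local',
                                        'new menu today']})

# isin tests the whole cell value, so only the row that is exactly 'local' matches.
whole_cell_match = df_posts['lemmatised'].isin(innovation_words)

# Split each row into tokens, count the distinct innovation words it contains,
# and set the dummy to 1 when at least two are present.
vocab = set(innovation_words)
n_hits = df_posts['lemmatised'].str.split().apply(lambda tokens: len(vocab & set(tokens)))
df_posts['innovation'] = (n_hits >= 2).astype(int)
```

With these made-up rows, only the first one contains two innovation words ('local' and 'plastic'), so it is the only one that gets innovation = 1.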


CodePudding user response:

Use this code:


df['new']=df.lemmatised.map(lambda w: len([i for i in innovation_words if i in w])>1)

Just rename the variables to match your own (for example, df to df_posts).
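Two things to watch with the one-liner above: if the cells are strings, `i in w` is a substring test (so 'event' also matches inside 'prevent'), and the result is True/False rather than 1/0. Here is a rough sketch of a whole-word variant. The helper name innovation_dummy, its min_hits parameter, and the sample rows are all made up for illustration, and it assumes each cell is either a whitespace-separated string or a list of tokens.

```python
import pandas as pd

innovation_words = ['community', 'local', 'charity', 'event', 'partner',
                    'volunteering', 'plastic', 'surplusfood']

def innovation_dummy(cell, words=innovation_words, min_hits=2):
    """Return 1 if `cell` contains at least `min_hits` distinct words from `words`, else 0."""
    # Accept either a whitespace-separated string or an already tokenised list.
    tokens = cell.split() if isinstance(cell, str) else list(cell)
    return int(len(set(words) & set(tokens)) >= min_hits)

# Hypothetical rows; the first one contains 'event' only as part of 'prevent'.
df = pd.DataFrame({'lemmatised': ['prevent plastic waste',
                                  ['charity', 'event', 'volunteer'],
                                  'community partner local']})

# The substring-based one-liner counts 'event' inside 'prevent' for the first row.
substring_version = df.lemmatised.map(
    lambda w: len([i for i in innovation_words if i in w]) > 1)

# The whole-word version only counts exact token matches and returns 1/0.
df['innovation'] = df.lemmatised.map(innovation_dummy)
```

If the column really holds lists of lemmas, the original one-liner already does whole-word matching, and appending .astype(int) is enough to get a 1/0 dummy.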
