If dataframe column has specific words alter value

Time:12-25

I have a dataframe, example:

df = [{'id': 1, 'text': 'text contains ok words'}, {'id': 2, 'text': 'text contains word apple'}, {'id': 3, 'text': 'text contains words ok'}]

And a list of keywords:

keywords = ['apple', 'orange', 'lime']

I want to check the 'text' column of every row and, if it contains any word from my keywords, change that text to: 'disconsider this case'

I tried tokenizing the column, but then I can't use the function I wrote to do the check. Here is what I have:

df = pd.DataFrame(df)

def remove_keywords(inpt):
    keywords = ['apple', 'orange', 'lime']    
    if any(x in word for x in keyword):
        return 'disconsider this case'
    else:
        return inpt

        
df['text'] = df['text'].apply(remove_keywords)
df
df['text'] = df.apply(lambda row: nltk.word_tokenize(row['text']), axis=1)
for word in df['text']:
    if 'apple' in df['text']:
        return 'disconsider this case'

Any help appreciated. Thanks!!

CodePudding user response:

This worked for me, using pandas and a loop:

import pandas as pd

keywords = ['apple', 'orange', 'lime']
df = pd.DataFrame([{'id': 1, 'text': 'text contains ok words'}, {'id': 2, 'text': 'text contains word apple'}, {'id': 3, 'text': 'text contains words ok'}])
print(df)

# Column index 1 is 'text'; overwrite it when any keyword occurs in it as a substring
for i in range(len(df)):
    if any(word in df.iat[i, 1] for word in keywords):
        df.iat[i, 1] = 'disconsider this case'
print(df)
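
If the dataframe is large, the row-by-row loop can probably be replaced with a vectorized version. The following is only a sketch, not part of the original answer: it assumes the 'text' column holds strings, and it uses a regular expression with word boundaries, so it matches whole words only (the loop above matches substrings, so it would also flag 'pineapple'). The names pattern and mask are just illustrative.

```python
import re
import pandas as pd

keywords = ['apple', 'orange', 'lime']
df = pd.DataFrame([
    {'id': 1, 'text': 'text contains ok words'},
    {'id': 2, 'text': 'text contains word apple'},
    {'id': 3, 'text': 'text contains words ok'},
])

# One pattern that matches any keyword as a whole word.
# re.escape guards against keywords containing regex metacharacters.
pattern = r'\b(?:' + '|'.join(re.escape(k) for k in keywords) + r')\b'

# Boolean mask: True for rows whose text contains a keyword.
# na=False treats missing values as "no match".
mask = df['text'].str.contains(pattern, case=False, regex=True, na=False)

# Overwrite only the matching rows.
df.loc[mask, 'text'] = 'disconsider this case'
print(df)
```

If substring matching is actually what you want, dropping the two \b markers from the pattern should give the same behaviour as the loop.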