Error tokenizing remove pattern re.findall

Time:06-07


I get an error like this while cleaning text. I just tried the following code from the web:

def remove_pattern(text, pattern):
    r = re.findall(pattern, text)
    for i in r:
        text = re.sub(i, '', text)
    return text

df['remove_user'] = np.vectorize(remove_pattern)(df['Comment'], "@[\w]*")

And I got this error:

Error

CodePudding user response:

Use str.replace here:

df["remove_user"] = df["Comment"].str.replace(r'@\w*', '', regex=True)
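
If you would rather keep a plain function, here is a minimal sketch of one. The actual error message isn't shown in the question, so this is a guess at a safer version, not a confirmed fix. It assumes the goal is to strip @mentions using the same pattern as in the question. It replaces the findall loop with a single substitution, and it passes non-string cells (such as missing values) through unchanged, which is an assumption about how you want those handled. The name MENTION is made up for this example.

```python
import re

# Same pattern as in the question: "@" followed by zero or more word characters
MENTION = re.compile(r"@\w*")

def remove_pattern(text, pattern=MENTION):
    # Non-string cells (e.g. missing values) are returned unchanged
    # instead of being passed to the regex engine
    if not isinstance(text, str):
        return text
    # A single sub() call removes every match, so no findall loop is needed
    return pattern.sub("", text)

print(remove_pattern("@alice thanks, cc @bob_99!"))  # prints " thanks, cc !"
```

You could then apply it per row with df["Comment"].apply(remove_pattern). The original loop fed each matched string back into re.sub as a pattern, which can misbehave if a match ever contains regex metacharacters. Substituting with the original pattern directly avoids that.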