Delete a word from a string which contains hashtags

Time:05-14

I have already done a lot of "filtering" with regular expressions to remove unwanted characters from a string. This is what I am using:

var regexpHashtag = new RegExp(/(?:^|\s)(?:#)([a-zA-Z\d]+)/g)
var regexpUrl = new RegExp(/(?:https?|ftp):\/\/[\n\S]+/g)
var regexpEmoji = new RegExp(/([\u2700-\u27BF]|[\uE000-\uF8FF]|\uD83C[\uDC00-\uDFFF]|\uD83D[\uDC00-\uDFFF]|[\u2011-\u26FF]|\uD83E[\uDD10-\uDDFF])/g)
var regexpQuotes = new RegExp(/['"]+/g)

tweetText = tweetText.replace(regexpHashtag, '')
tweetText = tweetText.replace(regexpUrl, '')
tweetText = tweetText.replace(regexpEmoji, '')
tweetText = tweetText.replace(regexpQuotes, '')

But there are still cases where a hashtag persists. For example, before filtering:

Pogledajte prizore koje je naš fotograf danas zabilježio na Ilidži (FOTO)             
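
To make the symptom easier to reproduce, here is a minimal sketch of how the hashtag pattern behaves on its own. The helper name `stripHashtags` and both input strings are made up for illustration and are not taken from my real data. It assumes the quantifier after the character class is `+`. One thing it shows is that the class `[a-zA-Z\d]` only covers ASCII letters and digits, so a hashtag containing a letter such as `ž` is only partly removed. Whether that is the actual cause in my case is an assumption.

```javascript
// Hashtag pattern: start of string or whitespace, then '#',
// then one or more ASCII letters or digits.
const regexpHashtag = /(?:^|\s)(?:#)([a-zA-Z\d]+)/g;

// Hypothetical helper, named here only for illustration.
function stripHashtags(text) {
  return text.replace(regexpHashtag, '');
}

// Made-up inputs: one pure-ASCII hashtag, one with a non-ASCII letter.
const asciiOnly = stripHashtags('Sunny day #sarajevo');
const nonAscii = stripHashtags('Sunny day #ilidža');

console.log(JSON.stringify(asciiOnly)); // "Sunny day"
console.log(JSON.stringify(nonAscii));  // "Sunny dayža"
```

In the second case the match stops at `ž`, so the tail of the hashtag stays in the text. A Unicode-aware character class (for example Unicode property escapes with the `u` flag) might be one way around this, but I have not confirmed that it fits every case here.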