Regex to Match Non Hashtag words in Unicode text

Time:12-24

I am struggling to write a Ruby regex that detects non-hashtag words in a Unicode string. I am aware of this answer here, but it fails to detect Unicode characters (live demo). A solution in Java's regular expression syntax would also be appreciated.

Example :
Input : #bulls gonna overtake the #bears soon #ATH coming #ALTSEASON #BSCGem #eth #btc #memecoin #100xgems #satyasanata
@Prakhar #सुप्रभात आपके लिए हार्दिक शुभकामनाएं आपका दिन मंगलमय हो #GoodMorning Luv from India

Output : gonna overtake the soon coming @Prakhar आपके लिए हार्दिक शुभकामनाएं आपका दिन मंगलमय हो Luv from India

CodePudding user response:

You can assert a whitespace boundary to the left with (?<!\S), then match one or more characters other than a whitespace character or # using a negated character class.

(?<!\S)[^\s#]+

See a Rubular demo.
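
As a rough illustration of how this pattern could be applied in Ruby, here is a minimal sketch using String#scan. It uses the pattern with the + quantifier that "1 or more characters" calls for. The constant NON_HASHTAG_WORD and the helper non_hashtag_words are names made up for this example, and the input is the sample from the question.

```ruby
# Hypothetical constant name: the pattern from the answer, with a + quantifier
# so that each match covers a whole word rather than a single character.
# (?<!\S)  - the match must not be preceded by a non-whitespace character
# [^\s#]+  - one or more characters that are neither whitespace nor '#'
NON_HASHTAG_WORD = /(?<!\S)[^\s#]+/

# Hypothetical helper: returns every non-hashtag word found in the text.
def non_hashtag_words(text)
  text.scan(NON_HASHTAG_WORD)
end

input = "#bulls gonna overtake the #bears soon #ATH coming #ALTSEASON #BSCGem " \
        "#eth #btc #memecoin #100xgems #satyasanata\n" \
        "@Prakhar #सुप्रभात आपके लिए हार्दिक शुभकामनाएं आपका दिन मंगलमय हो " \
        "#GoodMorning Luv from India"

# Join the matches with single spaces to build the cleaned-up string.
puts non_hashtag_words(input).join(" ")
```

Because the negated character class excludes only whitespace and #, it accepts Devanagari letters and combining marks just as it does ASCII letters, so no Unicode-specific flags should be needed here. Note that a token such as foo#bar would yield only foo under this pattern; whether that is desirable depends on how hashtags are defined for your data.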
