I am struggling to write a Ruby regex that can detect non-hashtag words in a Unicode string. I am aware of this answer here, but it fails to detect Unicode characters (see the live demo). A solution in Java's regular expression syntax would also be appreciated.
Example:
Input: #bulls gonna overtake the #bears soon #ATH coming #ALTSEASON #BSCGem #eth #btc #memecoin #100xgems #satyasanata
@Prakhar #सुप्रभात आपके लिए हार्दिक शुभकामनाएं आपका दिन मंगलमय हो #GoodMorning Luv from India
Output: gonna overtake the soon coming @Prakhar आपके लिए हार्दिक शुभकामनाएं आपका दिन मंगलमय हो Luv from India
CodePudding user response:
You can assert a whitespace boundary to the left with (?<!\S)
and then match 1 or more characters other than a whitespace character or a #
character using a negated character class.
(?<!\S)[^\s#]+
See a Rubular demo.
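As a rough sketch of how this could be used in Ruby, the snippet below applies the pattern with String#scan and joins the matches with spaces. It writes the pattern with a + quantifier, to match the "1 or more characters" in the description above. The constant NON_HASHTAG_WORD and the method non_hashtag_words are hypothetical names chosen for illustration, and the input is a shortened version of the example from the question.

```ruby
# Whitespace boundary on the left, then one or more characters that are
# neither whitespace nor '#'. The negated class also accepts non-ASCII
# characters such as Devanagari letters.
NON_HASHTAG_WORD = /(?<!\S)[^\s#]+/

# Hypothetical helper: returns every non-hashtag word in the text.
def non_hashtag_words(text)
  text.scan(NON_HASHTAG_WORD)
end

input = "#bulls gonna overtake the #bears soon #सुप्रभात आपके लिए #GoodMorning Luv from India"
puts non_hashtag_words(input).join(" ")
```

Note that a token such as foo#bar would still yield foo with this pattern. If such tokens should be skipped entirely, appending a right-hand boundary (?!\S) to the pattern is one option.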