I have the following string:
"By signing in, I agree to the {{#a}}[Terms of Use](https://www.example.com/termsofuse){{/a}} and {{#a}}[Privacy Policy](https://www.example.com/privacy){{/a}}."
I am using the following regex to split the string into words while treating {{#a}}[Terms of Use](https://www.example.com/termsofuse){{/a}} and {{#a}}[Privacy Policy](https://www.example.com/privacy){{/a}} as whole words:
\s+(?![^\[]*\])
My problem is that my current regex does not remove the full stop at the end of {{#a}}[Privacy Policy](https://www.example.com/privacy){{/a}}. Ideally, I would like my regex to split on full stops, exclamation marks and question marks. That being said, I'm not sure how I would differentiate between a full stop at the end of a word and a full stop that is part of the URL.
CodePudding user response:
You can try a variation of the following regular expression:
\s+(?![^\[]*\])|(?=[\.?!](?![a-zA-Z0-9_%-]))
The new part is the alternative (?=[\.?!](?![a-zA-Z0-9_%-])) at the end. It is a positive lookahead for a period, question mark, or exclamation mark, with a nested negative lookahead to make sure the punctuation is not followed by a URL-ish character. Because the lookahead is zero-width, the split happens just before the punctuation mark, so the mark ends up as a token of its own rather than being removed. You may need to adjust the character class in brackets to contain the characters you want to consider part of the URL.
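As a rough illustration of how that pattern behaves with .split, here is a minimal sketch run against the string from the question. It assumes the whitespace part of the pattern is `\s+`; the variable names `text`, `splitter`, `tokens` and `words` are made up for this example, and the final filter is only one possible way to discard the punctuation tokens.

```javascript
// Sample string from the question.
const text = "By signing in, I agree to the {{#a}}[Terms of Use](https://www.example.com/termsofuse){{/a}} and {{#a}}[Privacy Policy](https://www.example.com/privacy){{/a}}.";

// Alternative 1: whitespace that is not inside [...].
// Alternative 2: the empty position just before '.', '?' or '!' when that
// character is not followed by a URL-ish character.
const splitter = /\s+(?![^\[]*\])|(?=[.?!](?![a-zA-Z0-9_%-]))/;

const tokens = text.split(splitter);
console.log(tokens);

// The trailing "." is now a separate token; drop punctuation-only tokens
// if they are not wanted (an assumption about the desired output).
const words = tokens.filter((t) => !/^[.?!]$/.test(t));
console.log(words);
```

The dots inside www.example.com are left alone because each is followed by a letter, while the sentence-final full stop is followed by nothing and so gets split off.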
CodePudding user response:
Instead of .split, you will be better off using .match here with this regex:
/\{\{#a}}.*?\{\{\/a}}/g
This matches {{#a}} followed by zero or more of any character (as few as possible) followed by {{/a}}.
Alternatively, you may use this stricter regex:
\{\{#a}}\[[^\]]*]\([^)]*\)\{\{\/a}}
Here:
\[[^\]]*] : matches the [...] substring
\([^)]*\) : matches the (...) substring
var string = "By signing in, I agree to the {{#a}}[Terms of Use](https://www.example.com/termsofuse){{/a}} and {{#a}}[Privacy Policy](https://www.example.com/privacy){{/a}}.";
console.log( string.match(/\{\{#a}}.*?\{\{\/a}}/g) );
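The snippet above only returns the two placeholders. If the surrounding words are wanted as well, one possible extension is to add a second alternative to the stricter pattern. This is a sketch rather than part of the answer itself: it assumes a "word" is any run of characters other than whitespace, '.', '?' and '!', and the names `link`, `word` and `tokenizer` are invented for the example.

```javascript
// Sample string from the question.
const text = "By signing in, I agree to the {{#a}}[Terms of Use](https://www.example.com/termsofuse){{/a}} and {{#a}}[Privacy Policy](https://www.example.com/privacy){{/a}}.";

// Stricter placeholder pattern: {{#a}}[label](url){{/a}}
const link = /\{\{#a}}\[[^\]]*]\([^)]*\)\{\{\/a}}/;

// Assumption: a word is a run of characters that are not whitespace or . ? !
const word = /[^\s.?!]+/;

// The link alternative comes first so a whole placeholder is matched
// before the word alternative can break it into pieces.
const tokenizer = new RegExp(link.source + "|" + word.source, "g");

const links = text.match(new RegExp(link.source, "g"));
const tokens = text.match(tokenizer);

console.log(links);  // the two placeholders only
console.log(tokens); // words and placeholders, without the final full stop
```

Since the sentence-ending punctuation is never matched, it simply drops out of the result. Note that a bare URL outside a {{#a}}...{{/a}} placeholder would be broken up at its dots by the word alternative.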