Find dash but not in certain words

Time:10-11

My request is pretty simple: I want to find every "-" except those inside certain words, like start-up or new-york.

I don't know if it's feasible with REGEX.

My list of excluded words is 4 or 5 words.

My regex knowledge stops at about [-].

EDIT:

Long story: I'm using the regex block in parabola.io, which finds matches for a regex and replaces them with whatever floats your boat.

I have raw blocks of text and figured out a simple way to "mimic" bullet point lists by adding a <br> before each dash found in those blocks. However, this breaks whenever there is a dash inside a word like start-up or new-york.

So now I'm trying to match all dashes except those found in specific words.

CodePudding user response:

If you don't mind running a few steps, this might work:

  • Replace the dash in each special word with a unique character sequence. The point is to "escape" the dashes in these words: replace (?<=start)-(?=up)|(?<=new)-(?=york) with MYUNIQUETEXT (first confirm that your text does not already contain MYUNIQUETEXT).
  • Then replace every remaining - with <br>.
  • Then replace every MYUNIQUETEXT with -. That is the "unescaping".

That might be quicker than designing a regular expression for the sole purpose of doing it in a single pass.
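The three steps above can be sketched in Python to see them working end to end. This is only an illustration: parabola.io's regex block may use a different regex flavor, and the function name `add_breaks` and its `replacement` parameter are made up for this example. The pattern is case-sensitive, so "New-York" would not be protected as written.

```python
import re

# Placeholder taken from the steps above; it must not occur in the input text.
PLACEHOLDER = "MYUNIQUETEXT"

# Dashes that sit between "start" and "up", or between "new" and "york".
PROTECTED = re.compile(r"(?<=start)-(?=up)|(?<=new)-(?=york)")


def add_breaks(text, replacement="<br>"):
    """Hypothetical helper: escape protected dashes, replace the rest, unescape."""
    if PLACEHOLDER in text:
        raise ValueError("placeholder already present in the text")
    escaped = PROTECTED.sub(PLACEHOLDER, text)      # step 1: escape
    replaced = escaped.replace("-", replacement)    # step 2: replace remaining dashes
    return replaced.replace(PLACEHOLDER, "-")       # step 3: unescape


print(add_breaks("- my start-up - in new-york"))
# prints: <br> my start-up <br> in new-york
```

Passing `replacement="<br>-"` keeps the dash and puts the <br> in front of it, which is closer to what the question describes.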
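For comparison, a single-pass pattern is possible in regex flavors that support lookbehind nested inside a negative lookahead. The sketch below uses Python's `re` module; whether parabola.io's regex block accepts the same syntax is an assumption I have not checked, and the names `SINGLE_PASS` and `add_breaks_single_pass` are made up for this example.

```python
import re

# Match a dash unless it is the dash of "start-up" or "new-york".
# The negative lookahead runs at the dash position; inside it, the lookbehind
# checks the text before the dash and the literal checks the dash plus what follows.
SINGLE_PASS = re.compile(r"(?!(?<=start)-up|(?<=new)-york)-")


def add_breaks_single_pass(text, replacement="<br>"):
    """Hypothetical helper: replace every dash not inside an excluded word."""
    return SINGLE_PASS.sub(replacement, text)


print(add_breaks_single_pass("- my start-up - in new-york"))
# prints: <br> my start-up <br> in new-york
```

Each extra excluded word adds one more alternative inside the lookahead, so with 4 or 5 words the pattern stays manageable, though it is harder to read than separate steps.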
