Split at hyphens if NOT preceded by a whitespace or IS followed by a whitespace

Time:10-28

I want to split my string at each hyphen, but only if the hyphen is not preceded by whitespace. However, if it is followed by whitespace, I want to split there and have the hyphen removed. The same should happen if the hyphen is the last character of the string.

Example:

myString = '- foo - -bar --baz -'

Let's say I call

myString.split(regExImLookingFor).join(' ').split(' ').filter(word => word !== '')

I want it to return the following array:

['foo', '-bar', '--baz']

Notice how all the orphaned hyphens disappeared. There were 3 of them in myString: one at the start, one in the middle, and one at the end.
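To show why this needs more than a plain hyphen split, here is a minimal sketch. The helper name applyTemplate and the variables sample and wanted are only illustrative; the helper wraps the exact call chain from above. Splitting at every hyphen also strips the leading hyphens that should be kept:

```javascript
// Hypothetical helper: wraps the call chain from the question unchanged.
function applyTemplate(str, regex) {
  return str.split(regex).join(' ').split(' ').filter(word => word !== '');
}

const sample = '- foo - -bar --baz -';
const wanted = ['foo', '-bar', '--baz'];

// Naive attempt: split at every single hyphen.
const naive = applyTemplate(sample, /-/);
console.log(naive); // [ 'foo', 'bar', 'baz' ]

// The leading hyphens of -bar and --baz are lost, so this is not the wanted result.
const sameAsWanted = JSON.stringify(naive) === JSON.stringify(wanted);
console.log(sameAsWanted); // false
```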

CodePudding user response:

You can use match with this regex:

-*[\w+#]+(?:-[\w+#]+)*

Breakdown:

  • -*[\w+#]+: matches 0 or more hyphens followed by 1 or more word, + or # characters.
  • (?:-[\w+#]+)*: matches 0 or more repeats of a hyphen followed by another such word, i.e. the remaining parts of a hyphenated word.

Code:

var s = 'hyphenated-word - foo - -bar --baz - C++ C# '

var arr = s.match(/-*[\w+#]+(?:-[\w+#]+)*/g);

console.log(arr);

//=> ["hyphenated-word", "foo", "-bar", "--baz", "C++", "C#"]

CodePudding user response:

As indicated, you are better off using match than split for this. If, however, your requirement really is to make it work with the template code you provided, then regExImLookingFor can be /(?:\s|-+(?!\S))+/. The snippet below uses a longer pattern that also splits hyphenated words such as aaa-bbb:

let myString = '- foo - -bar --baz aaa-bbb -';
let result = myString.split(/(?<![-\s])-+(?![-\s])|(?:[-\s]+(?:\s|$))/)
                     .join(' ').split(' ').filter(word => word !== '');
console.log(result);

With match it becomes easier though:

let myString = '- foo - -bar --baz aaa-bbb -';
let result = myString.match(/(?:(?<!\S)-+)?[^-\s]+/g);
console.log(result);
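As a quick sanity check, the sketch below runs the split variant and the match variant on the same input and compares the results. The patterns are written out in full, the variable names are only illustrative, and the lookbehind (?<!...) needs an engine with ES2018 regex support:

```javascript
const input = '- foo - -bar --baz aaa-bbb -';

// Split variant: intra-word hyphens OR hyphen/whitespace runs followed by whitespace or end of string.
const viaSplit = input
  .split(/(?<![-\s])-+(?![-\s])|(?:[-\s]+(?:\s|$))/)
  .join(' ').split(' ').filter(word => word !== '');

// Match variant: optional leading hyphens (only at the start of a token), then non-hyphen, non-space chars.
const viaMatch = input.match(/(?:(?<!\S)-+)?[^-\s]+/g);

console.log(viaSplit); // [ 'foo', '-bar', '--baz', 'aaa', 'bbb' ]
console.log(viaMatch); // [ 'foo', '-bar', '--baz', 'aaa', 'bbb' ]
```

Both give the same array here, but note that match returns null rather than an empty array when nothing matches, so a fallback such as `|| []` may be needed for arbitrary input.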

CodePudding user response:

You can use

myString.match(/-*[^\s-]+/g)

Details:

  • -* - zero or more hyphens
  • [^\s-]+ - one or more chars other than whitespace and hyphens.

See the JavaScript demo:

myString = '- foo - -bar --baz -'
console.log(myString.match(/-*[^\s-]+/g))
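One behaviour worth checking before relying on this: the pattern has no condition on what precedes the hyphens, so on a hyphenated word the hyphen stays attached to the second part. A minimal sketch (variable names are illustrative; the guarded variant adds a (?<!\S) lookbehind, which assumes ES2018 regex support):

```javascript
const simple = /-*[^\s-]+/g;
// Variant that only keeps hyphens when they start a token (not preceded by a non-space char).
const guarded = /(?:(?<!\S)-+)?[^\s-]+/g;

const fromQuestion = '- foo - -bar --baz -'.match(simple);
const hyphenatedSimple = 'aaa-bbb'.match(simple);
const hyphenatedGuarded = 'aaa-bbb'.match(guarded);

console.log(fromQuestion);      // [ 'foo', '-bar', '--baz' ]
console.log(hyphenatedSimple);  // [ 'aaa', '-bbb' ]
console.log(hyphenatedGuarded); // [ 'aaa', 'bbb' ]
```

For the string in the question both patterns give the same result; they only differ when a hyphen sits between two non-space characters.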

CodePudding user response:

You could use split with a pattern that spells out the different scenarios:

 *-(?: +|$)| +(?=-)|\b-\b

Explanation

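  •  *-(?: +|$) (the pattern starts with a space) Match optional spaces, a hyphen, and then 1 or more spaces or the end of the string
  • | Or
  •  +(?=-) (again starting with a space) Match 1 or more spaces, asserting a hyphen directly to the right
  • | Or
  • \b-\b A hyphen between word boundaries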

const regex = / *-(?: +|$)| +(?=-)|\b-\b/;
[
  "- foo - -bar --baz -",
  "hyphenated-word"
].forEach(s => console.log(s.split(regex).filter(Boolean)));
