Regular expression to ignore each first and second word of a selected sentence

Time:12-09

I want to create a regular expression that ignores the first and second words of a selected sentence.

For example, I have the phrase "October 27 New Store Products / October 2022". I want a regex that selects only the part "New Store Products / October 2022" and ignores the leading date part, "October 27".

CodePudding user response:

Without knowing your exact requirements, all we can do is offer a best guess, so here is mine.

You could use something like the following:

/^\S+\s+\S+\s+(.*)$/

This does the following:
From the beginning of the string (^), match one or more non-whitespace characters (\S+), then one or more whitespace characters (\s+). Repeat that pair once more, then use a capture group ((.*)) to grab everything else up to the end of the string ($).

If you are using JavaScript, you could use it like this:

let sentence = "October 27 New Store Products / October 2022";
let regex = /^\S+\s+\S+\s+(.*)$/;
let match = regex.exec(sentence);

if (match) {
  // Ignores the first and second words of the sentence
  console.log(match[1]); // Output: "New Store Products / October 2022" ignoring "October 27"
}
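
If you only need the remainder and not a match object, a replace-based variant may be simpler. This is just a sketch: stripFirstTwoWords is a hypothetical helper name, and it assumes the input starts directly with the first word (no leading whitespace).

```javascript
// Hypothetical helper: remove the first two whitespace-separated words.
function stripFirstTwoWords(text) {
  // Same prefix as the pattern above, minus the capture group.
  const leadingTwoWords = /^\S+\s+\S+\s+/;
  // If the prefix is not found, replace() returns the input unchanged.
  return text.replace(leadingTwoWords, "");
}

console.log(stripFirstTwoWords("October 27 New Store Products / October 2022"));
```

Note the difference in failure behaviour: exec gives you null when the sentence has fewer than three words, whereas this version silently hands back the original string.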


Further explanation of this regex, taken from regex101 when the pattern is entered into the regex bar:

/^\S+\s+\S+\s+(.*)$/

^ asserts position at start of the string
\S matches any non-whitespace character (equivalent to [^\r\n\t\f\v ])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
\S matches any non-whitespace character (equivalent to [^\r\n\t\f\v ])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
+ matches the previous token between one and unlimited times, as many times as possible, giving back as needed (greedy)
1st Capturing Group (.*)
. matches any character (except for line terminators)
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
$ asserts position at the end of the string, or before the line terminator right at the end of the string (if any)
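
To see how those tokens behave on less tidy input, here is a small sketch in JavaScript. restAfterTwoWords is a hypothetical wrapper of my own, and the sample strings are made up for illustration.

```javascript
// Same pattern as above: two words, each followed by whitespace, then the rest.
const pattern = /^\S+\s+\S+\s+(.*)$/;

// Hypothetical wrapper: returns the captured remainder, or null if there is no match.
function restAfterTwoWords(text) {
  const match = pattern.exec(text);
  return match ? match[1] : null;
}

// \s+ is greedy, so runs of several spaces are consumed as one separator.
console.log(restAfterTwoWords("October   27   New Store")); // "New Store"

// Only two words and no trailing whitespace: the second \s+ cannot match.
console.log(restAfterTwoWords("October 27")); // null

// Leading whitespace: ^ followed by \S+ fails on the very first character.
console.log(restAfterTwoWords(" October 27 New")); // null
```

If leading whitespace is possible in your data, trimming the string first (or starting the pattern with ^\s*) would be one way to handle it.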



CodePudding user response:

You've not provided any information on the context, but does it need to be a regular expression?

String manipulation by searching on spaces might be easier.

For example in PHP:

$string = "October 27 New Store Products / October 2022";
$string_array = explode(' ', $string, 3);
if (array_key_exists(2, $string_array)) echo $string_array[2];

or Excel:

=RIGHT(A1,LEN(A1)-FIND(" ",A1,FIND(" ",A1)+1))
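
The same space-searching idea can be sketched in JavaScript too, in case that is the language in use. afterSecondSpace is a hypothetical helper name. It uses indexOf rather than split, because the limit argument of split drops the remainder instead of keeping it in the last element the way PHP's explode does.

```javascript
// Hypothetical helper: return everything after the second space, or null
// if the string contains fewer than two spaces.
function afterSecondSpace(text) {
  const first = text.indexOf(" ");
  if (first === -1) return null;
  // Start searching just past the first space.
  const second = text.indexOf(" ", first + 1);
  if (second === -1) return null;
  return text.slice(second + 1);
}

console.log(afterSecondSpace("October 27 New Store Products / October 2022"));
```

Like the PHP and Excel versions, this treats every single space as a delimiter, so two spaces in a row count as two separators. The regex approach with \s+ is more forgiving there.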