Matching a String Between Delimiters if and Only if it Contains a Specific ID Using REGEX

Time:09-27

I am trying to match a string between delimiters if and only if it contains the id I am looking for. For example, suppose I have a text file containing several entries like the following:

 /id:12345, comment:"test @#$%7 *<", date:JUN-06-21/;/comment:"@#rehj%fh^?*<", date:MAR-15-20, id:11333/;/date:AUG-22-18, id:44618, comment:"&%$@#^?*!!/;

Let's say I want to match the entry that has the ID 44618 using a regex. What makes this difficult is that the ID can appear at the beginning, in the middle, or at the end of an entry. The following is the regex I have so far, but it's not working.

    \/\w[a-zA-Z0-9,:]*?\s?(id:44618,)\s?(\/\w[a-zA-Z0-9:;/])*

CodePudding user response:

If you only need a specific ID, put the id in the regex and set the "g" (global) and "m" (multiline) flags: yourString.match(/id:747484/gm). To match any id, use yourString.match(/id:\d+/gm)
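
As a quick check of what this approach returns, here is a minimal sketch. It assumes the sample text from the question is held in a variable named `sample` (a name chosen here for illustration). Note that it yields only the id tokens themselves, not the surrounding entries.

```javascript
// Sample text from the question; the variable name `sample` is illustrative.
const sample =
  '/id:12345, comment:"test @#$%7 *<", date:JUN-06-21/;' +
  '/comment:"@#rehj%fh^?*<", date:MAR-15-20, id:11333/;' +
  '/date:AUG-22-18, id:44618, comment:"&%$@#^?*!!/;';

// With the g flag, match() returns an array of every matching substring.
const oneId = sample.match(/id:44618/g);
const allIds = sample.match(/id:\d+/g);

console.log(oneId);  // [ 'id:44618' ]
console.log(allIds); // [ 'id:12345', 'id:11333', 'id:44618' ]
```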

CodePudding user response:

Use this regex to get the whole entry, wherever the id appears within it:

/\/[^;]*?\bid:44618\D[^;]*?;/

Regex in context and testbench:

const input = '/id:12345, comment:"test @#$%7 *<", date:JUN-06-21/;' +
              '/comment:"@#rehj%fh^?*<", date:MAR-15-20, id:11333/;' +
              '/date:AUG-22-18, id:44618, comment:"&%$@#^?*!!/;';

const regex = /\/[^;]*?\bid:44618\D[^;]*?;/;           

alert(input.match(regex)[0]);

Output from alert, printing the whole entry:

/date:AUG-22-18, id:44618, comment:"&%$@#^?*!!/;
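
If the id is not known in advance, the same pattern can be built at runtime. The following is only a sketch: the helper name `findEntryById` and the variable `text` are made up for illustration, and it assumes ids consist of digits only and that no comment contains a semicolon (the pattern relies on [^;]).

```javascript
// Hypothetical helper: return the whole entry containing the given numeric id,
// or null when no entry contains it.
function findEntryById(text, id) {
  // Accept digits only, so nothing in `id` can act as regex syntax.
  if (!/^\d+$/.test(String(id))) {
    throw new TypeError("id must consist of digits only");
  }
  // Same pattern as above, with the id interpolated.
  const regex = new RegExp("/[^;]*?\\bid:" + id + "\\D[^;]*?;");
  const match = text.match(regex);
  return match ? match[0] : null;
}

const text =
  '/id:12345, comment:"test @#$%7 *<", date:JUN-06-21/;' +
  '/comment:"@#rehj%fh^?*<", date:MAR-15-20, id:11333/;' +
  '/date:AUG-22-18, id:44618, comment:"&%$@#^?*!!/;';

console.log(findEntryById(text, 11333)); // the second entry
console.log(findEntryById(text, 4461));  // null: \D rejects the partial id
```

The \D after the id is what keeps a search for 4461 from matching inside 44618.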

CodePudding user response:

The pattern does not match anything because the first character class [a-zA-Z0-9,:] does not contain all the allowed characters to match until the occurrence of id:44618 in the example string.

You could extend that character class, but the pattern would still not match up to the closing delimiter /;

Another thing to note is that if the id is at the start, as in /id:44618, then matching a word character first with \/\w, as your pattern does, prevents the id from matching at all, because its first character has already been consumed.

You could extend the character class at the end of the pattern with all the allowed characters, but as it can also be a comment, you don't know the possible characters up front.

If the comment field does not come before the id field, you can assert a word character after the opening / using a positive lookahead (?=\w), then match the id, and then match as few characters as possible with .*? up to the closing /;

\/(?=\w)[a-zA-Z0-9,:;\s-]*?(id:44618).*?\/;
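
To illustrate the difference between consuming the first character and only asserting it, here is a small sketch. The variable names and the `idFirst` test string are made up for illustration; `consuming` is just the front part of the pattern from the question.

```javascript
// An entry with the id in first position (illustrative test string).
const idFirst = '/id:44618, comment:"x"/;';
// Sample text from the question.
const sample =
  '/id:12345, comment:"test @#$%7 *<", date:JUN-06-21/;' +
  '/comment:"@#rehj%fh^?*<", date:MAR-15-20, id:11333/;' +
  '/date:AUG-22-18, id:44618, comment:"&%$@#^?*!!/;';

// \w consumes the "i" of "id", so "id:44618," can no longer match there.
const consuming = /\/\w[a-zA-Z0-9,:]*?\s?(id:44618,)/;
// (?=\w) only checks that a word character follows, without consuming it.
const lookahead = /\/(?=\w)[a-zA-Z0-9,:;\s-]*?(id:44618).*?\/;/;

console.log(consuming.test(idFirst));      // false
console.log(idFirst.match(lookahead)[0]);  // the whole idFirst entry
console.log(sample.match(lookahead)[0]);   // the third entry of the sample
```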

If there are no other / allowed, you might also use a negated character class [^/]* matching any char except the forward slash

\/[^/]*(id:44618)[^/]*\/;
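
A short sketch of this variant, including the case it cannot handle. The variable names and the `withSlash` test string are made up for illustration.

```javascript
const pattern = /\/[^/]*(id:44618)[^/]*\/;/;

// Sample text from the question.
const sample =
  '/id:12345, comment:"test @#$%7 *<", date:JUN-06-21/;' +
  '/comment:"@#rehj%fh^?*<", date:MAR-15-20, id:11333/;' +
  '/date:AUG-22-18, id:44618, comment:"&%$@#^?*!!/;';

// Illustrative entry whose comment contains a forward slash.
const withSlash = '/id:44618, comment:"a/b"/;';

console.log(sample.match(pattern)[0]); // the third entry of the sample
console.log(pattern.test(withSlash));  // false: [^/]* cannot cross the "/" in the comment
```

The second result shows why this pattern is only suitable when a / never occurs inside an entry.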

CodePudding user response:

Use

/.*id:(\d+)/

EXPLANATION

--------------------------------------------------------------------------------
  .*                       any character except \n (0 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  id:                      'id:'
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \1

JavaScript code:

const string = `/id:12345, comment:"test @#$%7 *<", date:JUN-06-21/;/comment:"@#rehj%fh^?*<", date:MAR-15-20, id:11333/;/date:AUG-22-18, id:44618, comment:"&%$@#^?*!!/;`
console.log((string.match(/.*id:(\d+)/) || ['',''])[1])
