How to match repeating patterns containing special characters in Ruby using regex?

Time:10-21

Basically, I am trying to use regex to match repeating patterns containing special characters in Ruby. I've been able to do this if I am given the number of times a pattern repeats but not dynamically. An example string I am looking to match is:

Draw a square that is {{coords.width}} pixels wide by {{coords.height}} pixels tall.

This can be easily done by using

arr = value.scan(/\{\{(\w+?\.\w+?)\}\}/).flatten

arr looks like this after I run this

["coords.width", "coords.height"]

But how do I write a regex that also matches when the dotted path has an arbitrary number of segments, for example

Draw a square that is {{shape.rectangle.coords.width}} pixels wide by {{shape.rectangle.coords.height}} pixels tall.

while also matching when there is no "." at all:

Draw a square that is {{width}} pixels wide by {{height}} pixels tall.

CodePudding user response:

You can match the regular expression

r = /(?<=\{\{)[a-z]+(?:\.[a-z]+)*(?=\}\})/

Rubular demo / PCRE demo at regex101.com

I've included the PCRE demo because regex101.com provides a detailed explanation of each element of the regex (hover the cursor).

For example,

str = "Draw a square {{coords.width}} wide by {{coords.height}} " +
      "tall by {{coords deep}} deep"
str.scan(r)
  #=> ["coords.width", "coords.height"]

Notice that "coords deep" does not match because it does not have (what I have assumed is) a valid form. Notice also that I did not have to flatten the return value from scan because the regex has no capture groups.
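
As a quick check against the two other forms in the question, here is a minimal sketch. It reuses the regex above, so it carries the same assumption that each segment consists of lower case letters only; the variable names are illustrative.

```ruby
# Same regex as above: lower case segments separated by periods, inside {{ }}.
r = /(?<=\{\{)[a-z]+(?:\.[a-z]+)*(?=\}\})/

# Deeply nested path, taken from the question.
nested = "Draw a square that is {{shape.rectangle.coords.width}} pixels wide " +
         "by {{shape.rectangle.coords.height}} pixels tall."
# No period at all, also taken from the question.
flat = "Draw a square that is {{width}} pixels wide by {{height}} pixels tall."

nested_keys = nested.scan(r)
flat_keys   = flat.scan(r)

p nested_keys  #=> ["shape.rectangle.coords.width", "shape.rectangle.coords.height"]
p flat_keys    #=> ["width", "height"]
```

If segments may contain digits or underscores, replacing [a-z] with \w would be one way to loosen the pattern.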

We can write the regular expression in free-spacing mode to make it self-documenting.

/
(?<=      # begin a positive lookbehind
  \{\{    # match '{{'
)         # end the positive lookbehind
[a-z]+    # match 1 or more lower case letters
(?:       # begin a non-capture group
  \.      # match a period
  [a-z]+  # match 1 or more lower case letters
)         # end the non-capture group
*         # execute the non-capture group zero or more times
(?=       # begin a positive lookahead
  \}\}    # match '}}'
)         # end positive lookahead
/x        # free-spacing regex definition mode
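
To confirm that the free-spacing form behaves like the one-line form, something along these lines should work. This is a sketch: the sample string and variable names are made up for illustration, and the layout is condensed slightly compared with the listing above.

```ruby
# Free-spacing form; under /x, whitespace and comments in the pattern are ignored.
verbose = /
  (?<=\{\{)        # preceded by two opening braces
  [a-z]+           # first segment: 1 or more lower case letters
  (?:\.[a-z]+)*    # zero or more further segments, each led by a period
  (?=\}\})         # followed by two closing braces
/x

# One-line form of the same pattern.
compact = /(?<=\{\{)[a-z]+(?:\.[a-z]+)*(?=\}\})/

sample = "{{a.b.c}} and {{d}} but not {{e f}}"  # hypothetical input
verbose_hits = sample.scan(verbose)
compact_hits = sample.scan(compact)

p verbose_hits                  #=> ["a.b.c", "d"]
p verbose_hits == compact_hits  #=> true
```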

CodePudding user response:

(/\{\{(.*?)\}\}/)

This did the trick. It matches anything within {{ }}, but I can always validate the structure after extracting the occurrences.
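
The extract-then-validate approach could look roughly like this. The validation rule (word characters separated by single periods) is an assumption about what counts as a valid key, and the sample text is made up.

```ruby
# Step 1: capture everything between the braces, as in the pattern above.
loose = /\{\{(.*?)\}\}/
# Step 2 (assumed rule): a valid key is word characters separated by single periods.
valid_key = /\A\w+(?:\.\w+)*\z/

text = "{{shape.coords.width}} by {{height}} by {{coords deep}} by {{.bad}}"

# scan returns one array per match because of the capture group, hence flatten.
candidates = text.scan(loose).flatten
valid, invalid = candidates.partition { |key| key.match?(valid_key) }

p valid    #=> ["shape.coords.width", "height"]
p invalid  #=> ["coords deep", ".bad"]
```

Keeping the two steps separate makes it possible to report malformed placeholders instead of silently skipping them.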

CodePudding user response:

(\{+\S+)

The pattern above would achieve your goals. It matches all non-whitespace characters within the enclosure.
