regex to get all slashes from url

Time:10-22

I have the following URL:

localhost:3000/filter/shoes/color/white

I need to replace every slash with -, except the first slash (the one right after localhost:3000).

The final URL must be:

localhost:3000/filter-shoes-color-white

I've tried a few regexes in Ruby, but without any success. Thanks.

CodePudding user response:

Here is a regexp that matches every / except the first:

\G(?:\A[^\/]*\/)?+[^\/]*\K\/

So you can do:

"localhost:3000/filter/shoes/color/white".gsub(/\G(?:\A[^\/]*\/)?+[^\/]*\K\//, '-')
#=> "localhost:3000/filter-shoes-color-white"

But it won't work if your URI includes a scheme (such as https://).
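One possible way around the scheme limitation, as a rough sketch: peel off a leading scheme:// prefix first, then apply the same regex (with the possessive ?+ quantifier) to the remainder. The helper name dasherize_path and the scheme pattern are assumptions made for illustration; they are not part of the answer above.

```ruby
# Hypothetical helper: strips an optional "scheme://" prefix, applies the
# regex from this answer to the remainder, then puts the prefix back.
def dasherize_path(url)
  # Assumed scheme shape: a letter, then letters, digits, '+', '.' or '-', then "://".
  scheme = url[%r{\A[a-z][a-z0-9+.\-]*://}i].to_s
  rest = url[scheme.length..-1]
  # Replace every '/' except the first one in the scheme-less remainder.
  scheme + rest.gsub(/\G(?:\A[^\/]*\/)?+[^\/]*\K\//, '-')
end

puts dasherize_path("https://localhost:3000/filter/shoes/color/white")
puts dasherize_path("localhost:3000/filter/shoes/color/white")
```

A string with no path at all, such as "http://example.com", should come back unchanged, because nothing is left to replace once the scheme is set aside.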

CodePudding user response:

You can use the following regular expression:

r = /\G\A[^\/]*\/[^\/]*\K\/|\//
str = "localhost:3000/filter/shoes/color/white"
str.gsub(r, '-')
  #=> "localhost:3000/filter-shoes-color-white"

Rubular demo / PCRE demo

I've included the link to the PCRE demo at regex101.com, as it gives the same result as Ruby's regex engine (Onigmo), but it shows--by hovering the cursor over the regex--the function of each element of the expression.

We can write the expression in free-spacing mode to make it self-documenting:

/
\G      # assert position at the end of the previous match or, if the
        # first match, the start of the string
\A      # match the beginning of the string
[^\/]*  # match zero or more chars other than '/'
\/      # match '/' 
[^\/]*  # match zero or more chars other than '/'
\K      # reset the start of the match to the current position and discard
        # all previously-consumed characters from the reported match
\/      # match '/' 
|       # or
\/      # match '/' 
/x      # free-spacing regex definition mode
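To actually run the commented form, one option (a sketch, not the answerer's own code) is to write the literal with %r{...}x instead of /.../x. With / as the delimiter, an unescaped slash inside a comment would end the literal early; with %r{...} the slashes need no escaping at all.

```ruby
# Free-spacing version of the same expression, written with %r{...}x so that
# slashes (including those in comments) do not terminate the literal.
r = %r{
  \G        # assert position at the start of this match attempt
  \A        # which must also be the beginning of the string
  [^/]*     # zero or more chars other than a slash
  /         # the first slash (kept)
  [^/]*     # zero or more chars other than a slash
  \K        # discard everything consumed so far from the reported match
  /         # the second slash (this is what gets replaced)
  |         # or
  /         # any slash found on a later match attempt
}x

str = "localhost:3000/filter/shoes/color/white"
result = str.gsub(r, '-')
puts result
```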

Here is a second way: reverse the string, make the replacements, then reverse the result.

r = /\/(?=.*\/)/
str.reverse.gsub(r,'-').reverse
  #=> "localhost:3000/filter-shoes-color-white"

This works because, while Ruby does not support variable-length lookbehinds, it does support variable-length lookaheads.
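Spelling out the intermediate values may make the trick easier to follow. In the reversed string, the slash to keep is the last one, so "a slash followed by another slash somewhere later" selects exactly the slashes to replace. The variable names below are illustrative only.

```ruby
str = "localhost:3000/filter/shoes/color/white"

# Step 1: reverse, so the slash to keep becomes the LAST slash.
reversed = str.reverse
# reversed is "etihw/roloc/seohs/retlif/0003:tsohlacol"

# Step 2: replace each slash that still has another slash after it.
replaced = reversed.gsub(/\/(?=.*\/)/, '-')
# replaced is "etihw-roloc-seohs-retlif/0003:tsohlacol"

# Step 3: reverse back.
result = replaced.reverse
puts result
```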

CodePudding user response:

TL;DR:

regex is:

\/(?<!localhost:3000\/)
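As a sketch of how that lookbehind could be applied with gsub: the helper below builds the same pattern from a host string, so the host is not hard-coded. The name dasherize_after_host is hypothetical, and the sketch assumes the string has no scheme and that the host prefix does not reappear later in the path.

```ruby
# Hypothetical helper: replaces every '/' except one that directly
# follows the given host prefix (e.g. "localhost:3000").
def dasherize_after_host(url, host)
  # Match a slash, then look back: reject it if the text ending at this
  # slash is "<host>/". Regexp.escape keeps the host literal.
  pattern = /\/(?<!#{Regexp.escape(host)}\/)/
  url.gsub(pattern, '-')
end

puts dasherize_after_host("localhost:3000/filter/shoes/color/white", "localhost:3000")
```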

Longer version:

There is a famous old Chinese saying: teaching someone how to fish is better than giving them a fish.

  1. Use an online regex tester such as regex101.com to try your regex against your test string and see the result immediately.
  2. Search Stack Overflow using other keywords that describe your situation, for example: Regex for matching something if it is not preceded by something else
  3. Make your own magic.

CodePudding user response:

This is a pretty simple parsing problem, so I question the need for a regular expression. I think the code would probably be easier to understand and maintain if you just iterated through the characters of the string with a loop like this:

def transform(url)
  url = url.dup
  slash_count = 0
  (0...url.size).each do |i|
    if url[i] == '/'
      slash_count += 1
      url[i] = '-' if slash_count >= 2
    end
  end
  url
end

Here is something even simpler using Ruby's String#gsub method:

def transform2(url)
  slash_count = 0
  url.gsub('/') do
    slash_count += 1
    slash_count >= 2 ? '-' : '/'
  end
end

CodePudding user response:

Using Ruby >= 2.7 with String#partition

Provided you aren't passing in a URI scheme like 'https://' as part of your string, you can do this as a single method chain with String#partition and String#tr. Using Ruby 3.0.2:

'localhost:3000/filter/shoes/color/white'.partition(?/).
    map { _1.match?(/^\/$/) ? _1 : _1.tr(?/, ?-) }.join
#=> "localhost:3000/filter-shoes-color-white"

This basically relies on the fact that there are no forward slashes in the first array element returned by #partition, and the second element contains a slash and nothing else. You are then free to use #tr to replace forward slashes with dashes in the final element.

If you have an older Ruby, you'll need a slightly different solution: String#partition itself has been available since Ruby 1.8.7, but the numbered block arguments used here were only introduced in Ruby 2.7. If you don't like using character literals, ternary operators, or numbered block arguments, then you can refactor the solution to suit your own stylistic tastes.
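One such refactoring, as a minimal sketch: destructure the three parts returned by String#partition and only touch the last one. This avoids character literals, the ternary, and numbered block arguments. The method name dasherize_after_first_slash is made up for illustration.

```ruby
# Hypothetical helper; uses only String#partition and String#tr.
def dasherize_after_first_slash(url)
  # partition returns [text before the first '/', the '/' itself, the rest].
  head, slash, tail = url.partition('/')
  # Only the part after the first slash can contain slashes to replace.
  head + slash + tail.tr('/', '-')
end

puts dasherize_after_first_slash('localhost:3000/filter/shoes/color/white')
```

If the string contains no slash at all, partition returns the whole string followed by two empty strings, so the input comes back unchanged.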
