How to exclude brackets at the end of the URL

Time:09-21

I am new to regex, so any help is really appreciated. I have an expression to identify a URL: (http[^'\"]+)

Unfortunately, on some URLs I get additional square brackets at the end, for instance "http://example.com]]"

As the result, I want to receive "http://example.com"

How do I get rid of those brackets with the help of the regex I wrote above?

CodePudding user response:

What you actually have is called a negated character class, so just add characters that should not be matched. In addition, there's not really a need for a capturing group. That said, you could use

http[^'"\]\[]+
#       ^^^^

Note that this will exclude square brackets anywhere in your possible URL, not just at the end. See a demo on regex101.com.
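As a rough illustration of the pattern above, here is a minimal Python sketch. The helper name `find_urls` and the sample strings are made up for this example; the question does not say which language or regex engine is in use, so Python's `re` module is an assumption.

```python
import re

# Negated character class from the answer: stop at quotes and square brackets.
URL_NO_BRACKETS = re.compile(r"""http[^'"\]\[]+""")

def find_urls(text):
    """Return every substring matched by the pattern (hypothetical helper)."""
    return URL_NO_BRACKETS.findall(text)

# Trailing brackets are no longer part of the match.
print(find_urls('link: "http://example.com]]"'))   # ['http://example.com']

# Caveat from the answer: a bracket inside the URL also ends the match.
print(find_urls("'http://example.com/a[1]'"))      # ['http://example.com/a']
```

The second call shows the trade-off the answer mentions: the match is cut at the first bracket wherever it occurs.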

CodePudding user response:

Stop the match between a word and nonword character:

(http[^'"]+)\b

See regex proof.

EXPLANATION

--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    http                     'http'
--------------------------------------------------------------------------------
    [^'"]+                   any character except: ''', '"' (1 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
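
The word-boundary variant can be tried out the same way. This is only a sketch in Python, which is an assumption since the question names no language; `first_url` is a hypothetical helper. The greedy `[^'"]+` first runs up to the closing quote, then backtracks until `\b` finds a position between a word character and a non-word character.

```python
import re

# Pattern from the answer: capture the URL, then require a word boundary.
URL_WORD_BOUNDARY = re.compile(r"""(http[^'"]+)\b""")

def first_url(text):
    """Return the first captured URL, or None if nothing matches (hypothetical helper)."""
    match = URL_WORD_BOUNDARY.search(text)
    return match.group(1) if match else None

# The match backtracks past "]]" to the boundary after the final "m".
print(first_url('"http://example.com]]"'))   # http://example.com

# Side effect: any trailing non-word character is dropped, including "/".
print(first_url('"http://example.com/"'))    # http://example.com
```

Because `\b` only needs a word character on one side, this approach trims any trailing punctuation, not just square brackets, so a URL that legitimately ends in `/` loses that slash.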