How to match string if it doesn't contain only numbers after slash?

Time:01-02

I am redirecting certain URLs so that path segments become GET variables, like the following:

localhost2/post/myTitle => localhost2/post.php?title=myTitle
localhost2/post/123 => localhost2/post.php?id=123

So in my .htaccess file, I use:

<IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteRule ^post/(\d+) post.php?id=$1
    RewriteRule ^post/(.*) post.php?title=$1
</IfModule>

This works without a problem. But I want to learn how to write the negation of ^post/(\d+), that is ^post/(NEGATE-ONLY-NUMBERS). In other words, I want a regex that matches the whole input string if what follows post/ is not only digits. So post/abc, post/a23, post/ab3, post/12c and post/a2c should all pass, but not post/123. I referred to this post, which suggests using:

(?!^\d+$)^.+$

I can't use ^post/(?!^\d+$)^.+$, because there can be only one ^ and one $. I don't know which regex anchor marks the first position of a substring. My best guess is:

post\/(?!\d+).*

I think (?!\d+), with the +, would eat all the following characters and check whether they are all digits. But this fails at post/1ab.

Another guess is:

post\/(?![\d,\/]+$).*

This works the best, but it allows post/3455/X. Secondly, I eventually need to convert localhost2/post/myTitle/123 => localhost2/post.php?title=myTitle&repeat=123 as well. I have come up with the following:

^post/(?!\d+($|/))(.+?($|/))(\d+$)?

Note: the ? after .+ makes the quantifier lazy; otherwise .+ would match across multiple slashes.

and

^post/(?!\d+($|/))([^/\n\r]+($|/))(\d+$)?

Here I use [^/\n\r]+ instead of .+?
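To see what the capture groups of the second pattern actually hold, here is a minimal sketch using Python's re module as a stand-in for mod_rewrite's PCRE engine (an assumption; the two agree on these constructs as far as I know). ASKER is the pattern above with the + restored. VARIANT is a hypothetical alternative, not from the thread, that uses non-capturing groups so the slash stays out of the title. Note that VARIANT is stricter: it is anchored at the end and rejects a trailing slash with no number after it.

```python
import re

# The second pattern from the question; groups are numbered by opening parenthesis,
# so the title is group 2 and the trailing number is group 4.
ASKER = re.compile(r"^post/(?!\d+($|/))([^/\n\r]+($|/))(\d+$)?")

# Hypothetical variant (not from the thread): title is group 1, number is group 2.
VARIANT = re.compile(r"^post/(?!\d+(?:$|/))([^/\n\r]+)(?:/(\d+))?$")

def parse(pattern, path, title_group, repeat_group):
    """Return (title, repeat), or None when the path is rejected."""
    m = pattern.match(path)
    if m is None:
        return None
    return m.group(title_group), m.group(repeat_group)

for path in ["post/myTitle", "post/myTitle/123", "post/123"]:
    print(path, parse(ASKER, path, 2, 4), parse(VARIANT, path, 1, 2))
```

With ASKER, the title group for post/myTitle/123 comes back as "myTitle/" including the slash, so the matching backreference in a RewriteRule would presumably carry that slash as well.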

CodePudding user response:

Patterns inside zero-width assertions like (?!\d+) are non-consuming: they do not "eat" characters. They only check the context, and the regex index stays where it was before the assertion was tried.
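A small sketch of that behaviour, using Python's re module as a stand-in for PCRE (an assumption; lookaheads behave the same way in both as far as I know). The pattern is the asker's first guess with the + restored and a capture group added around .* for inspection. The helper name tail_after_lookahead is made up for this example.

```python
import re

# The asker's first guess; the capture group around .* is added here for inspection.
FIRST_GUESS = re.compile(r"post/(?!\d+)(.*)")

def tail_after_lookahead(path):
    """Return what (.*) captured, or None if the pattern did not match."""
    m = FIRST_GUESS.match(path)
    return m.group(1) if m else None

# The lookahead consumed nothing: (.*) still starts right after "post/".
print(tail_after_lookahead("post/abc"))

# A single leading digit is enough for \d+ to succeed inside the lookahead,
# so the negative lookahead rejects the whole match.
print(tail_after_lookahead("post/1ab"))
```

The second call shows why the guess fails on post/1ab: nothing inside the lookahead requires the digits to run to the end of the segment.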

You can use any of the following:

^post/(?!\d+(?:/|$)).*
^post/(?!\d+(?=/|$)).*
^post/(?!\d+(?![^/])).*

Details (for the second variant):

  • ^post/ - start of input, then the literal string post/
  • (?!\d+(?=/|$)) - a negative lookahead that fails the match if, immediately to the right of the current location, there are one or more digits followed by / or the end of the string
  • .* - the rest of the input.
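The three variants can be checked against the inputs from the question with a short script. This is a sketch in Python's re module rather than in Apache itself, on the assumption that both engines treat these lookaheads identically. The names accepts and check_all are invented for the example.

```python
import re

# The three variants from the answer.
PATTERNS = [
    r"^post/(?!\d+(?:/|$)).*",
    r"^post/(?!\d+(?=/|$)).*",
    r"^post/(?!\d+(?![^/])).*",
]

# Inputs taken from the question.
SHOULD_MATCH = ["post/abc", "post/a23", "post/ab3", "post/12c", "post/a2c", "post/1ab"]
SHOULD_FAIL = ["post/123", "post/3455/X"]

def accepts(pattern, path):
    """True when the pattern matches the path."""
    return re.search(pattern, path) is not None

def check_all():
    """True when every variant accepts and rejects the expected inputs."""
    return all(
        all(accepts(p, s) for s in SHOULD_MATCH)
        and not any(accepts(p, s) for s in SHOULD_FAIL)
        for p in PATTERNS
    )

print(check_all())
```

The difference from the asker's earlier guesses is the boundary check after \d+: the digits must be followed by / or the end of the string for the lookahead to reject the input.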

CodePudding user response:

Do not overcomplicate things when you can keep them simple. Since your query parameters are named differently, you will need 3 separate rewrite rules anyway.

Consider:

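Options -MultiViews
RewriteEngine On

# Digits only: post/123 => post.php?id=123
RewriteRule ^post/(\d+)/?$ post.php?id=$1 [L,QSA,NC]

# Title plus number: post/myTitle/123 => post.php?title=myTitle&repeat=123
RewriteRule ^post/([\w-]+)/(\d+)/?$ post.php?title=$1&repeat=$2 [L,QSA,NC]

# Anything else under post/: post/myTitle => post.php?title=myTitle
RewriteRule ^post/(.*) post.php?title=$1 [L,QSA,NC]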

Take note of Options -MultiViews. If MultiViews is not already disabled in the Apache config, you must disable it here; otherwise all $_GET parameters will be empty in your PHP file.

Option MultiViews (see http://httpd.apache.org/docs/2.4/content-negotiation.html) is used by Apache's content negotiation module, which runs before mod_rewrite and makes Apache match files by their extensions. So if the URL is /file, Apache will serve /file.html.
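The three-rule approach can be modelled roughly outside Apache to check which rule wins for a given path. This is only a sketch in Python. It assumes end-anchored patterns (/?$) for the two specific rules, treats [L] as "stop at the first matching rule", and uses re.I in place of NC. The RULES list and the rewrite function are hypothetical names, not part of mod_rewrite.

```python
import re

# Hypothetical model of the three rules, most specific first.
# The catch-all goes last because a first-match-wins list would
# otherwise never reach the title/number rule.
RULES = [
    (re.compile(r"^post/(\d+)/?$", re.I), r"post.php?id=\1"),
    (re.compile(r"^post/([\w-]+)/(\d+)/?$", re.I), r"post.php?title=\1&repeat=\2"),
    (re.compile(r"^post/(.*)", re.I), r"post.php?title=\1"),
]

def rewrite(path):
    """Apply the first matching rule, loosely imitating the [L] flag."""
    for pattern, target in RULES:
        m = pattern.match(path)
        if m:
            return m.expand(target)
    return path  # no rule matched; the path is left unchanged

for path in ["post/123", "post/myTitle", "post/myTitle/123", "post/12c"]:
    print(path, "=>", rewrite(path))
```

Ordering the rules from most to least specific is the design choice that replaces the negative lookahead: the digits-only rule claims post/123 first, so the catch-all never sees it.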
