htaccess redirect with negative lookahead

Time:10-13

I have a WordPress blog that I need to migrate to another website in which we will use APIs to get data from the WordPress back-end. Since the WordPress website receives a lot of visits every month, I need to create redirects from WordPress to the new website.

Old URL structure

https://myblog.com/category/alias-of-the-article

New URL structure

https://mynewwebsite.com/blog/alias-of-the-article

I was thinking of having something like:

RedirectMatch 301 "/(.*)/(.*)" "https://mynewwebsite.com/blog/$2"

But I still need the APIs, the images, and everything that is under the "wp-content" folder to remain on the myblog.com website because I will load those resources from the API.

Is creating a redirect for every single category the only way to achieve this?

"/category1/(.*)" "https://mynewwebsite.com/blog/$1"
"/category2/(.*)" "https://mynewwebsite.com/blog/$1"
...
"/category20/(.*)" "https://mynewwebsite.com/blog/$1"

CodePudding user response:

If the old and new domains point to the same place then you'll likely need to use mod_rewrite (RewriteRule / RewriteCond) as opposed to mod_alias (RedirectMatch) to check the Host header in order to avoid redirecting URLs at the new domain.

It's also advisable to not mix redirects from both modules in order to avoid unexpected conflicts (mod_rewrite runs first, despite the apparent order of directives in the config file).

With mod_rewrite you can use conditions (RewriteCond directives) to create exceptions, without having to use negative lookaheads in the regex (which can be more complex if you need to make many exceptions).
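For comparison, the negative-lookahead approach with mod_alias might look something like the following. This is only a sketch: it assumes the exceptions are /wp-content/, /wp-json/ and /feed/, and that the old domain is not served from the same place as the new one, since RedirectMatch cannot check the Host header.

```apache
# Sketch only, not a drop-in rule.
# (?!...) is a negative lookahead: the match fails if the path
# starts with any of the listed prefixes (assumed exceptions).
# The rest matches exactly two path segments and captures the second.
RedirectMatch 302 "^/(?!wp-content/|wp-json/|feed/)[^/.]+/([^/.]+)$" "https://mynewwebsite.com/blog/$1"
```

Every additional exception has to be added inside the lookahead, which is why the regex becomes harder to read as the list grows.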

Try the following instead at the top of the root .htaccess file, before any existing WordPress directives (i.e. before the # BEGIN WordPress section).

For example:

RewriteCond %{HTTP_HOST} ^(www\.)?oldwebsite\.example [NC]
RewriteCond %{REQUEST_URI} !^/wp-content/
RewriteCond %{REQUEST_URI} !^/wp-json/
RewriteCond %{REQUEST_URI} !^/feed/
RewriteRule ^[^/.]+/([^/.]+)$ https://newwebsite.example/blog/$1 [R=302,L]

The ! prefix on the CondPattern (e.g. !^/wp-content/) negates the regex, so the condition is successful when the pattern does not match.
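If the list of exceptions grows, the separate negated conditions could also be folded into a single one using alternation. A minimal sketch, assuming the same placeholder host names and the same three excluded prefixes as above; RewriteEngine On is included only in case the rewrite engine is not already enabled elsewhere in the file:

```apache
# Sketch: place before the "# BEGIN WordPress" section.
RewriteEngine On

# Only act on requests for the old host (placeholder domain).
RewriteCond %{HTTP_HOST} ^(www\.)?oldwebsite\.example [NC]
# Single negated condition: succeeds unless the path starts with an excluded prefix.
RewriteCond %{REQUEST_URI} !^/(wp-content|wp-json|feed)/
# Two path segments, neither containing a slash or a dot; only the second is captured.
RewriteRule ^[^/.]+/([^/.]+)$ https://newwebsite.example/blog/$1 [R=302,L]
```

Whether to use one combined condition or several separate ones is largely a matter of readability; both forms exclude the same URLs.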

You only need a parenthesised subpattern in the regex if you need to use the backreference later. So, in the above regex, there doesn't seem to be a need to capture the category.

NB: Test first with 302 (temporary) redirects to avoid potential caching issues. Only change to a 301 (permanent) redirect once you have confirmed it works as intended.

RedirectMatch 301 "/(.*)/(.*)" "https://mynewwebsite.com/blog/$2"

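A problem with the regex in your original directive (quoted above) is that it matches too much. The * quantifier is greedy by default, so given a URL of the form /foo/bar/baz, the first group captures foo/bar and the request is redirected to /blog/baz. Nothing in the pattern excludes /wp-content/ either, so a request such as /wp-content/uploads/image.png would be redirected to /blog/image.png. See this recent question on the pitfalls of greedy regex: Unexpected behavior of a regex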
