I have a strange problem with internal mod_rewrite redirects on Apache 2.4.
In my .htaccess file I redirect the subdomain sub to the folder /sub with the following directives:
RewriteCond %{HTTP_HOST} ^sub.mydomain.com$ [NC]
RewriteRule ^((?!sub).*)$ /sub/$1 [NC]
This works perfectly for, say, https://sub.mydomain.com/articles/ - the URL stays like this in the browser's address field and, as expected, the data from /sub/articles/index.html is served.
However, when I type https://sub.mydomain.com/articles (note the missing slash) into the browser, the URL changes to https://sub.mydomain.com/sub/articles/ (note the duplicated sub as both folder and subdomain!).
I guess this is caused by Apache's default behavior of appending a slash to slashless directory requests via an external redirect. Adding the slash is OK with me, but of course I want to avoid the folder-subdomain duplication. How can I do this?
CodePudding user response:
Yes, this is caused by mod_dir appending a slash (with a 301 redirect) to the directory after the rewrite has occurred, exposing the internally rewritten URL/directory.
The canonical URL therefore needs to be /articles/ (with a trailing slash), not /articles. We can correct this with an external redirect before the rewrite occurs.
(This avoids having to disable DirectorySlash, which would still leave you with a canonicalization / duplicate content issue.)
For example, before the existing rewrite, test whether the requested URL-path (that is missing a trailing slash) exists as a directory inside the /sub directory and append a slash if that is the case.
# Redirect to append trailing slash if exists as a dir inside "/sub"
RewriteCond %{HTTP_HOST} ^sub\.mydomain\.com [NC]
RewriteCond %{DOCUMENT_ROOT}/sub/$1 -d
RewriteRule ^((?!sub/).*[^/])$ /$1/ [R=301,L]
As an additional optimisation, you can avoid unnecessary filesystem checks (which are relatively expensive) on static assets (which naturally do not end in a trailing slash) by excluding URLs that look like they have a file extension. (This assumes you don't have physical directories whose names look like they have a file extension, e.g. /sub/somedir.xyz)
Add the following as the 2nd condition (before the filesystem check) in the above rule:
RewriteCond %{REQUEST_URI} !\.\w{2,4}$
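Putting the pieces together, the complete redirect might look something like the sketch below. It assumes the same hostname and /sub directory used throughout this answer, that RewriteEngine On is already set earlier in the file, and that no real directory under /sub has a name ending in something resembling a 2-4 character file extension.

```apache
# Redirect to append trailing slash if exists as a dir inside "/sub"
RewriteCond %{HTTP_HOST} ^sub\.mydomain\.com [NC]
# Skip anything that looks like it ends in a file extension,
# so the filesystem check below is not run for static assets
RewriteCond %{REQUEST_URI} !\.\w{2,4}$
RewriteCond %{DOCUMENT_ROOT}/sub/$1 -d
RewriteRule ^((?!sub/).*[^/])$ /$1/ [R=301,L]
```

It is usually worth testing with R=302 first, since browsers cache 301 responses persistently and a mistake can be awkward to undo.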
Aside:
RewriteCond %{HTTP_HOST} ^sub.mydomain.com$ [NC]
RewriteRule ^((?!sub).*)$ /sub/$1 [NC]
You should probably be using the L flag on this RewriteRule directive. (And the NC flag should be unnecessary.)
The regex ^((?!sub).*)$ excludes any URL-path that simply starts with sub, which would include /subfoo and /subbar, etc. (which naturally prevents these directories from being accessible in the /sub directory). Any valid request would start /sub/ (with a trailing slash), so the slash should be included in the negative lookahead, as I did in the rule above.
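Applying these points, the internal rewrite could be revised along the following lines. This is only a sketch: it keeps the hostname and target directory from the question, escapes the dots in the hostname pattern, and assumes nothing else in the .htaccess file relies on the old behaviour.

```apache
# Internally rewrite every request on the subdomain into the /sub directory,
# unless the URL-path already starts with "sub/" (i.e. it was already rewritten)
RewriteCond %{HTTP_HOST} ^sub\.mydomain\.com [NC]
RewriteRule ^((?!sub/).*)$ /sub/$1 [L]
```

With sub/ in the lookahead, a request for /subfoo is rewritten to /sub/subfoo as you would expect, rather than being skipped.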
If not already, consider also redirecting to remove /sub/ from direct requests, in case this directory should be exposed/discovered.
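One common way to do that is to check the REDIRECT_STATUS environment variable, which is empty on the initial request and set (to 200) once an internal rewrite has taken place. The following is a minimal sketch that would go before the other rules; the target scheme and hostname are assumptions based on the URLs in the question, and as written it also redirects requests for /sub/ made on the main domain.

```apache
# Redirect direct (not internally rewritten) requests for /sub/<anything>
# back to the corresponding URL on the subdomain.
# REDIRECT_STATUS is empty only on the initial request, so requests that
# have already been rewritten to /sub/ are not redirected again.
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^sub/(.*) https://sub.mydomain.com/$1 [R=301,L]
```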