On our shared hosting server, we were asked to leave a client's domain registered with their own provider but point the DNS records responsible for web traffic at our server.
On our server we had to add an entry to the .htaccess file in our root to point the domain to a folder on the server:
RewriteCond %{HTTP_HOST} ^(www\.)?example\.pl$ [NC]
RewriteCond %{REQUEST_FILENAME} !/WebsitesLive/Example/
RewriteRule ^(.*)$ /WebsitesLive/Example/$1 [L]
The website works fine, but we noticed in Google Analytics that some people access it using https://example.pl/WebsitesLive/Example. I finally realised that (maybe) the cause is the HTTPS and non-www redirection in the .htaccess file of the client's site:
RewriteCond %{HTTPS} !on
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
Is it true that %{REQUEST_URI} would, in this case, contain WebsitesLive/Example in the redirection URL?
Most importantly, how do I stop it?
Answer:
"Is it true that %{REQUEST_URI} would, in this case, contain WebsitesLive/Example in the redirection URL?"
Yes. After the internal rewrite from the root, the REQUEST_URI server variable is updated to the rewritten URL-path, which includes the /WebsitesLive/Example/ prefix. You could instead capture the URL-path in the RewriteRule pattern and use the backreference, which is relative to the directory that contains the .htaccess file.
For example:
RewriteCond %{HTTPS} !on
RewriteRule (.*) https://%{HTTP_HOST}/$1 [R=301,L]
(You were already capturing the URL-path, but not using it.)
Ordinarily, if this canonical redirect were used in a subdirectory and that subdirectory were part of the visible URL, then this would be incorrect, since it would remove the subdirectory from the redirected URL. Here, removing the subdirectory is exactly what is wanted.
Alternatively, you could implement the canonical redirects in the root .htaccess file instead.
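As a rough sketch of that alternative, the redirect in the root file might look something like the following. It assumes mod_rewrite is enabled and that the redirect should apply only to the client's hostname. The extra HTTP_HOST condition is an assumption, added because other sites share this root. With this in place, the HTTPS rule in the client's own .htaccess file should become redundant.

```apache
# /.htaccess (sketch only, placed above the existing rewrite to the subdirectory)
RewriteEngine On

# Canonical HTTPS redirect, evaluated before the request is rewritten,
# so the captured URL-path is still the one the user requested
RewriteCond %{HTTP_HOST} ^(www\.)?example\.pl$ [NC]
RewriteCond %{HTTPS} !on
RewriteRule (.*) https://%{HTTP_HOST}/$1 [R=301,L]

# Existing internal rewrite to the subdirectory
RewriteCond %{HTTP_HOST} ^(www\.)?example\.pl$ [NC]
RewriteCond %{REQUEST_URI} !^/WebsitesLive/Example/
RewriteRule ^(.*)$ /WebsitesLive/Example/$1 [L]
```

Because the redirect runs before the rewrite, the subdirectory never appears in the captured path for ordinary requests.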
However, whilst this should prevent the filesystem directory being exposed in the canonical redirect, it doesn't prevent a user from accessing this subdirectory directly (from any domain). And since this subdirectory has already been exposed (and, as a 301 permanent redirect, is likely to have been cached), direct access to it needs to be blocked or redirected back to the root. We need to be careful to avoid redirect loops when doing so.
You can use something like the following in the subdirectory .htaccess to redirect any "direct" requests from the user back to the root:
# /WebsitesLive/Example/.htaccess
# Redirect direct requests to subdirectory back to root
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule (.*) https://%{HTTP_HOST}/$1 [R=301,L]
The REDIRECT_STATUS environment variable is empty on direct requests, but is set to the HTTP status code (e.g. 200) after the first internal rewrite. This prevents internal rewrites to the subdirectory from being redirected back to the root, which would otherwise cause an endless redirect loop.
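Putting the two pieces together, the subdirectory file might end up along these lines. This is only a sketch: it assumes mod_rewrite is enabled and that no other rules in the client's file need to run before these. The RewriteEngine line and the ordering of the two rules are assumptions, not something stated in the question.

```apache
# /WebsitesLive/Example/.htaccess (sketch only)
RewriteEngine On

# Redirect direct requests to subdirectory back to root.
# REDIRECT_STATUS is empty only when the user requested this path directly.
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule (.*) https://%{HTTP_HOST}/$1 [R=301,L]

# Canonical HTTPS redirect for requests that arrived via the root rewrite.
# Uses the backreference instead of %{REQUEST_URI}, so the subdirectory
# is not included in the redirect target.
RewriteCond %{HTTPS} !on
RewriteRule (.*) https://%{HTTP_HOST}/$1 [R=301,L]
```

Both rules build the target from the captured path, which is relative to this directory, so neither should put /WebsitesLive/Example/ back into the visible URL.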
Aside:
RewriteCond %{HTTP_HOST} ^(www\.)?example\.pl$ [NC]
RewriteCond %{REQUEST_FILENAME} !/WebsitesLive/Example/
RewriteRule ^(.*)$ /WebsitesLive/Example/$1 [L]
Whilst this might "work", the second condition is too broad, since it checks that /WebsitesLive/Example/ does not occur anywhere in the "filesystem path". Instead, you should be checking that /WebsitesLive/Example/ does not occur at the "start" of the URL-path. In other words, it should be like this:
:
RewriteCond %{REQUEST_URI} !^/WebsitesLive/Example/
:
Note that this condition is only necessary (to prevent a rewrite loop) if there is no .htaccess file in the subdirectory (being rewritten to) that contains mod_rewrite directives, since mod_rewrite directives in the subdirectory completely override the parent by default.
If there is no .htaccess file in the subdirectory, then you would need to prevent direct access to that subdirectory in the root .htaccess file instead, and the required rule would be slightly different from the one above. For example:
# /.htaccess
# Redirect direct requests to subdirectory back to root
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^WebsitesLive/Example/(.*) https://%{HTTP_HOST}/$1 [R=301,L]
# Rewrite to subdirectory
: