If I understand correctly, the expression .ht* in the following code will match everything that starts with .ht, so my .ht_lalala is safe.
<Files ".ht*">
Require all denied
</Files>
But what about the next one?
(^\.ht|~$|back|BACK|backup|BACKUP$)
Is it correct for matching the files .htaccess, back, backup, BACKUP? Or would the next one be better instead?
(^\.ht*|back*|BACK*$)
What I'd like to understand is what ~$ actually means in my code. I don't know where I saw it, but I have it in my code, and now I doubt that it's correct. Maybe it was meant to be something like (^.ht|~$) just for one group.
I know the basics of regex: what ^ and $ are, and that * means 0 or N of the previous text/token. But ~ doesn't make sense inside the pattern, unless it's just a plain character that does nothing but match ~. I've read the Apache docs. I guess for multiple matches FilesMatch and DirectoryMatch are better; however, regular expressions can also be used with the Files and Directory directives, with the addition of the ~ character, as stated in the docs examples.
<Files ~ "\.(gif|jpe?g|png)$">
#...
</Files>
And well, what I want exactly is to know how to match different files or directories.
One more thing, should I escape the . ? Because the default httpd.conf doesn't do so. Or is it just different for httpd.conf and .htaccess (which doesn't make sense to me)?
CodePudding user response:
<Files ".ht*">
In this context, .ht* is not a regular expression (regex). It is a "wild-card string", where ? matches any single character, and * matches any sequence of characters. (Whilst this is also a valid regex, a regex would match differently.)
But what about the next one?
(^\.ht|~$|back|BACK|backup|BACKUP$)
This is a regex. (It cannot be used in the <Files> directive as you have written above without enabling regex pattern matching with the ~ argument, as you have used later.)
In this regex, ~$ matches any string that ends with a literal ~ (tilde character). This is sometimes used to mark backup files.
It also matches...
- Any string that starts with .ht (which naturally includes .htaccess).
- Any string that contains back or BACK or backup (matching backup is obviously redundant).
- Any string that ends with BACKUP.
Consequently, this does not look like it's doing quite what you think it's doing.
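As a rough sketch of what the question seems to be after, the pattern could be anchored so that the backup names only match as whole filenames. This assumes the intent is to block anything starting with .ht, anything ending in ~, and files named exactly back, BACK, backup or BACKUP; that intent is a guess, and the Require all denied body is simply carried over from the first example.

```apache
# Sketch only: the list of exact filenames is an assumption about the intent.
#   ^\.ht                          -> names starting with ".ht"
#   ~$                             -> names ending with a literal "~"
#   ^(back|BACK|backup|BACKUP)$    -> these four names, and nothing longer
<FilesMatch "^\.ht|~$|^(back|BACK|backup|BACKUP)$">
    Require all denied
</FilesMatch>
```

With the group anchored at both ends, a file such as feedback.html is no longer caught just because its name contains "back".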
Or would the next one be better instead?
(^\.ht*|back*|BACK*$)
Whilst this is a valid regex, you've obviously reverted to a mix of regex and "wild-card" syntax. Bear in mind that in regex speak, the * quantifier matches the previous token 0 or more times. It does not match "any characters", as in wild-card pattern matching.
This still matches ".htaccess", but only because the pattern is not anchored at the end. For example, ^\.ht*$ (with an end-of-string anchor) would not match ".htaccess", since t* only allows further t characters after ".h".
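To make the difference concrete, here is a minimal sketch of how the wild-card string .ht* might be written as a fully anchored regex. The FilesMatch wrapper and the Require all denied body are borrowed from the other examples in this thread; treat it as an illustration rather than a recommended configuration.

```apache
# Wild-card ".ht*" expressed as an anchored regex:
# "\." matches a literal dot, and ".*" (any character, zero or more
# times) plays the role that "*" has in a wild-card string.
<FilesMatch "^\.ht.*$">
    Require all denied
</FilesMatch>
```

Unlike ^\.ht*$, this matches ".htaccess", because ".*" consumes the remaining "access".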
<Files ~ "\.(gif|jpe?g|png)$">
With the Files directive, the ~ argument enables regex pattern matching. (As you've stated.) This is quite different from when ~ is used inside the regex pattern itself.
One more thing, should I escape the . ? Because the default httpd.conf doesn't do so. Or is it just different for httpd.conf and .htaccess (which doesn't make sense to me)?
I think you're mixing things up. In your first example, it's not a regex, it's a "wild-card" pattern (as stated above). In this context, the . must not be backslash-escaped. It matches a literal . (dot). The . carries no special meaning here. The . should only be escaped if you need to match a literal dot in a regular expression.
For example, the following are equivalent:
# Wild-card string match
<Files ".ht*">
and
# Regex pattern match
<Files ~ "^\.ht">
(However, it is preferable to use FilesMatch instead of Files ~ to avoid any confusion. FilesMatch is "newer" syntax.)
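For illustration, the image example quoted from the docs in the question could be written with FilesMatch roughly as follows. The body is left as a placeholder comment, as in the original.

```apache
# Same regex as <Files ~ "\.(gif|jpe?g|png)$">, but without the "~" argument:
# FilesMatch always treats its argument as a regular expression.
<FilesMatch "\.(gif|jpe?g|png)$">
    #...
</FilesMatch>
```

Here the only ~ left to think about would be one inside the pattern itself, where it is just a literal character.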
There is no difference between httpd.conf and .htaccess in this regard.
CodePudding user response:
When in doubt, RTFM.
As an argument to Files, ~ enables regex. Without it, you just get access to the wildcards ? and *.
As far as I know, Apache uses the PCRE flavor of regex.
So once you've enabled regex via ~, use https://regex101.com/r/lPkMHK/1 to test the behavior of the regex you've written.