Sudden .htaccess parsing error: 500 Internal Server Error

Time:06-09

Good morning and day!

Out of the blue, my website (on shared hosting) went down this morning with a 500 Internal Server Error. Checking the server logs, I found many copies of the following error:

[Wed Jun 08 8:44:17 2022] [core:alert] [pid 37935:tid 139960812291840] [client 172.71.94.83:22980] /usr/www/users/XXX/.htaccess: RewriteRule: cannot compile regular expression '^([A-Za-z]{2})\\/([[:alnum:]-\\/\\.]+)$'

The actual line in .htaccess is:

RewriteRule ^([A-Za-z]{2})\/([[:alnum:]-\/\.]+)$ $2 [L]

I removed the line and this brought the website back online. However, I remember that the line is needed somewhere, so the website is almost certainly malfunctioning in some place right now.

Does anyone know how to fix this regular expression? More interestingly, how could it have worked fine all those years?

Thanks for any tips.

CodePudding user response:

RewriteRule ^([A-Za-z]{2})\/([[:alnum:]-\/\.]+)$ $2 [L]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯^

Later versions of the PCRE engine (which Apache uses) fail to compile the hyphen that follows the POSIX character class. In that position the hyphen reads as a range specifier, and a POSIX class stands for more than one character, so it cannot be the start of a range. I assume the hyphen is intended to match a literal hyphen (as it did in earlier versions of PCRE), in which case it should be placed at the start or end of the character class to avoid ambiguity.
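As a rough analogy only: Python's `re` module is not PCRE and does not support POSIX bracket classes such as `[:alnum:]`, but it rejects the same kind of construct when a multi-character shorthand like `\d` is followed by a hyphen and another character. The helper name `compiles` below is made up for this sketch.

```python
import re

def compiles(pattern):
    """Return True if Python's re module accepts the pattern, else False."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Hyphen directly after a multi-character class: parsed as a range, rejected.
ambiguous = r"[\d-/]"
# Hyphen moved to the end of the class: unambiguously a literal hyphen.
literal_last = r"[\d/-]"

print(compiles(ambiguous))
print(compiles(literal_last))
print(bool(re.fullmatch(literal_last, "-")))
```

Moving the hyphen to the end (or start) of the class is the same fix the answer applies to the RewriteRule pattern.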

This regex can also be "simplified" in a number of ways (and personally, I would avoid the POSIX character class).

  • In the rule as stated you do not need the first capturing group. (Are there any conditions - RewriteCond directives - that precede this rule?)
  • No need to backslash-escape slashes.
  • No need to backslash-escape literal dots when used inside a character class.

Try the following instead:

RewriteRule ^[A-Za-z]{2}/([a-zA-Z0-9/.-]+)$ $1 [L]

Note that having removed the first capturing group, the substitution becomes $1, not $2.
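To get a feel for what the simplified pattern matches, here is a minimal sketch that runs it through Python's `re` module. This is only an approximation of Apache's behaviour (Python's engine is not PCRE, although this particular pattern uses nothing engine-specific), and the function name `rewrite` and the sample paths are invented for illustration. In `.htaccess` the RewriteRule pattern is matched against the URL path without the leading slash, which is what the sample strings imitate.

```python
import re

# Simplified pattern from the answer: two letters, a slash, then one or more
# letters, digits, slashes, dots or hyphens (hyphen last, so it is literal).
PATTERN = re.compile(r"^[A-Za-z]{2}/([a-zA-Z0-9/.-]+)$")

def rewrite(path):
    """Return what $1 would hold if the rule matched, or None if it would not."""
    match = PATTERN.match(path)
    return match.group(1) if match else None

for sample in ["en/docs/index.html", "de/my-page", "eng/page", "en/", "docs/index.html"]:
    print(sample, "->", rewrite(sample))
```

Only the first two samples match: the others lack a two-letter prefix followed by a slash, or have nothing after the slash.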

[:alnum:] is the same as [a-zA-Z0-9]. Or just use \w (word character) if you are OK with matching underscores (_) as well.
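The practical difference between the two options shows up with underscores. The sketch below again uses Python's `re` as a stand-in for PCRE; the `[\w/.-]` class is my own guess at how `\w` would be slotted into the rule, and `re.ASCII` is set on the assumption that `\w` should be limited to ASCII letters, digits and underscore.

```python
import re

# Explicit alphanumerics, as in the suggested rule.
strict = re.compile(r"^[A-Za-z]{2}/([a-zA-Z0-9/.-]+)$")
# Hypothetical variant using \w, which additionally accepts underscores.
loose = re.compile(r"^[A-Za-z]{2}/([\w/.-]+)$", re.ASCII)

path = "en/my_page.html"
print(bool(strict.match(path)))
print(bool(loose.match(path)))
```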

If there are preceding conditions that reference the $1 backreference, then you will need to add back the first parenthesised subpattern and change the substitution string back to $2, i.e.:

RewriteRule ^([A-Za-z]{2})/([a-zA-Z0-9/.-]+)$ $2 [L]