How to decide whether to prepend www. to a URL in PHP?

Time:10-23

I'm writing a PHP application where the user can enter a URL and some operations take place afterwards (further details not relevant to this question).

Requirement: If the user enters example.com, it should be converted to http://www.example.com.

The http:// part is straightforward, but I am struggling with the rules that determine whether www. is prepended. Since the URL could be anything that might work in a web browser, it could be localhost or 192.168.0.1, for example. For these, www. clearly shouldn't be prepended.

So the exclusion list so far is: "If the host is localhost or looks like an IPv4 address, don't prepend". But I expect there will be other cases that need to be covered. Could anyone advise, or suggest an alternative way of approaching this?

CodePudding user response:

You can check whether the host in the user input is a valid IP address and decide from that whether to prepend "www." or not. The user input can be "127.0.0.1", "127.0.0.1:8080", "http://127.0.0.1:8080" or "http://example.com:8080".

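$input = "127.0.0.1:8080";
// Strip an optional scheme. trim($input, "http://") would not work here,
// because trim() treats its second argument as a set of characters.
$hostPort = preg_replace('#^https?://#i', '', trim($input));
// Pad the result so $port is defined even when no port was given.
[$host, $port] = array_pad(explode(':', $hostPort, 2), 2, '');
if ($port !== '') {
    $port = ':' . $port;
}
if (filter_var($host, FILTER_VALIDATE_IP)) {
    header("Location: http://$host$port");
} else {
    header("Location: http://www.$host$port");
}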
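
The IP check can also be pulled out into a small predicate so that the other exclusions from the question sit in one place. The following is only a sketch: the function name shouldPrependWww is hypothetical, and treating every single-label host (one without a dot, such as localhost or an intranet machine name) as "do not prepend" is an assumption rather than something stated in the question. Hosts that already carry another subdomain, such as mail.example.com, still get www. here, which may or may not be what you want.

```php
<?php
// Hypothetical helper (not from the thread): decide whether "www." should be
// prepended to a bare host name, i.e. without scheme, port, or path.
function shouldPrependWww(string $host): bool
{
    $host = strtolower(trim($host));

    // IPv4 or IPv6 literal; brackets as in "[::1]" are stripped before checking.
    if (filter_var(trim($host, '[]'), FILTER_VALIDATE_IP) !== false) {
        return false;
    }
    // Assumption: single-label names ("localhost", "intranet") are left alone.
    if ($host === '' || strpos($host, '.') === false) {
        return false;
    }
    // Already prefixed with "www.".
    if (strpos($host, 'www.') === 0) {
        return false;
    }
    return true;
}
```

With this in place, the redirect branch above would only need to call shouldPrependWww($host) instead of filter_var directly.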

CodePudding user response:

Here is my current attempt at doing this. It makes two passes because parse_url initially puts links without a scheme such as google.com or www.google.com into the "path" part rather than the "host" part.

function preprocessAbsoluteUrl($url, $firstRun = true) {
  $parts = parse_url($url);
  $scheme = isset($parts['scheme']) ? $parts['scheme'] : 'http';
  $user = isset($parts['user']) ? $parts['user'] : '';
  $pass = isset($parts['pass']) ? ":{$parts['pass']}" : '';
  $userpass = $user !== '' || $pass !== '' ? "{$user}{$pass}@" : '';
  $host = isset($parts['host'])
    ? (preg_match('/^(?:localhost|www\.|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/i',
                  $parts['host'])
      ? $parts['host']
      : "www.{$parts['host']}")
    : '';
  $port = isset($parts['port']) ? ":{$parts['port']}" : '';
  $path = isset($parts['path']) ? rtrim($parts['path'], '/') : '';
  $query = isset($parts['query']) ? "?{$parts['query']}" : '';
  $fragment = isset($parts['fragment']) ? "#{$parts['fragment']}" : '';
  $url = "{$scheme}://{$userpass}{$host}{$port}{$path}{$query}{$fragment}";
  if ($firstRun) {
    $url = preprocessAbsoluteUrl($url, false);
  }
  return $url;
}

The relevant part is the setting of $host: This currently uses a regular expression to only prepend www. when it doesn't begin with www., localhost or look like an IP address. Open to improvement suggestions!
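
One possible alternative to the second pass is to add a default scheme before calling parse_url, so that the host lands in the "host" part on the first attempt. This is a minimal sketch under some assumptions: the helper name extractHost is hypothetical, http is assumed as the default scheme, and anything that does not start with "scheme://" is treated as scheme-less.

```php
<?php
// Hypothetical helper: return the host of a user-entered URL in a single pass.
function extractHost(string $url): ?string
{
    $url = trim($url);
    // Without "scheme://", parse_url() reports "example.com" as a path,
    // so prepend a default scheme first (http is an assumption here).
    if (!preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        $url = 'http://' . $url;
    }
    // With a component argument, parse_url() returns a string, null, or false.
    $host = parse_url($url, PHP_URL_HOST);
    return is_string($host) && $host !== '' ? $host : null;
}
```

The returned host can then be passed to whatever rule decides about www., and the URL can be reassembled from the parts as in the function above.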
