www.google.com returns HTTP 301

Time:07-03

I'm looking at this example of making a simple HTTP request in Python using only the built-in socket module:

import socket

target_host = "www.google.com"
target_port = 80

# Open a TCP connection to the target host
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect((target_host, target_port))
# Send a minimal HTTP/1.1 GET request
client.send(b"GET / HTTP/1.1\r\nHost: google.com\r\n\r\n")
# Read up to 4096 bytes of the response
response = client.recv(4096)
client.close()
print(response)

When I run this code, I get back a 301:

<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>

I'm confused by this because the "new location" looks identical to the URL I requested. Using curl or wget on the same URL (www.google.com) returns a 200, and I'm having a hard time understanding what is different. Are curl/wget getting the same 301 "behind the scenes" and just automatically requesting the 'new' resource? And if so, how is that possible given that, as mentioned above, the 'new' location appears identical to the original?

Answer:

> I'm confused by this because the "new location" looks identical to the URL I requested

It isn't identical. The socket connects to www.google.com, but your Host header says that you are requesting google.com, i.e. without www:

client.send(b"GET / HTTP/1.1\r\nHost: google.com\r\n\r\n")

This gets redirected to www.google.com, i.e. with www:

<A HREF="http://www.google.com/">here</A>.
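A minimal sketch of the fix, assuming the goal is to send the same request curl or wget would send for www.google.com: those tools derive the Host header from the URL, so the header and the connection target match. The helper names build_request, parse_status_and_location and fetch are made up for this illustration, and the Connection: close header is an addition not present in the original code. The parser reads the status code and the Location header directly instead of relying on the HTML body.

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal GET request whose Host header matches `host`."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"  # assumption: ask the server to close after replying
        "\r\n"
    ).encode("ascii")


def parse_status_and_location(response: bytes):
    """Return (status_code, Location header value or None) from a raw response."""
    # Only the header section (before the first blank line) is inspected.
    head = response.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    # Status line looks like: HTTP/1.1 301 Moved Permanently
    status_code = int(lines[0].split(" ", 2)[1])
    location = None
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "location":
            location = value.strip()
    return status_code, location


def fetch(host: str, port: int = 80) -> bytes:
    """Connect to `host` and send a request with a matching Host header."""
    with socket.create_connection((host, port), timeout=10) as client:
        client.sendall(build_request(host))
        # A single recv is enough to see the status line and headers here.
        return client.recv(4096)
```

Calling parse_status_and_location(fetch("www.google.com")) should then no longer report the redirect to http://www.google.com/, since the Host header now names the same host the redirect points to. The exact status returned depends on the server.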