I am an absolute beginner at internet programming. I tried to make an HTTP request manually: I connected to YouTube with telnet
and tried to get the HTML of the main page:
ivan@LAPTOP-JSSQ9B0M:/mnt/d/PROJECTS$ telnet www.youtube.com 80
Trying 173.194.222.198...
Connected to wide-youtube.l.google.com.
Escape character is '^]'.
GET / HTTP/1.1
Host: www.youtube.com
But for some reason the response is:
HTTP/1.1 301 Moved Permanently
Content-Type: application/binary
X-Content-Type-Options: nosniff
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Pragma: no-cache
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Date: Thu, 01 Dec 2022 18:13:19 GMT
Location: https://www.youtube.com/
Server: ESF
Content-Length: 0
X-XSS-Protection: 0
X-Frame-Options: SAMEORIGIN
As far as I know, a 301 response means that the resource I am trying to get has been moved somewhere else, and the new address is given in the "Location" header of the response. But as you can see, the address I am connecting to and the one in the Location header are the same. When I try this with other sites (for example github.com, www.youtube.com, www.twitch.tv), I get the same thing: 301 Moved Permanently, but the address in the Location header and mine are the same.
Question: Why does this happen, and how can I fix it?
CodePudding user response:
Good on you for being curious and trying stuff on your own! What you are doing with "telnet" is essentially what a browser does when it connects to youtube.com over plain HTTP (well, there is also HTTP/2, HTTP/3 and QUIC, but that's a topic for another day).
It establishes a raw TCP connection to port 80 on YouTube and sends an HTTP request. You can "type it in" because HTTP/1.1 is a plain-text protocol. (Those are pretty rare and getting even scarcer.)
Anyway, YouTube runs an HTTP server on port 80, the default port for HTTP, serving plain HTTP (i.e. when a browser goes to http://www.youtube.com, it sets up a connection to port 80 and then sends an HTTP GET request).
However! HTTP is unencrypted (note: that has nothing to do with it being a textual protocol), so regardless of your request parameters (URI, User-Agent, etc.), that HTTP server chooses (well, "was configured") to always redirect you to https://www.youtube.com via the Location header (or, in Twitch's case, to https://www.twitch.tv).
but the address in location header and mine are the same
Notice the https?
Location: https://www.youtube.com/
That tells your browser to go to that address instead, via HTTPS, which is HTTP over TLS. And browsers know that the default port for HTTPS is 443.
Your "telnet" client isn't actually an HTTP client, so it doesn't automatically react to that "instruction".
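To make the difference concrete, here is a rough Python sketch of what an HTTP client does with that response and telnet doesn't: it reads the status code and the Location header and works out where to go next. The function names (build_get_request, parse_head, redirect_target, fetch_plain) are made up for this illustration, and the parsing is deliberately simplistic; real clients handle many more cases.

```python
import socket
from urllib.parse import urlsplit


def build_get_request(host, path="/"):
    """Return the same plain-text request you typed into telnet, as bytes."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close", "", ""]
    return "\r\n".join(lines).encode("ascii")


def parse_head(raw):
    """Split a raw response into (status_code, headers); header names lower-cased."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    status_line, *header_lines = head.split("\r\n")
    status_code = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers


def redirect_target(raw):
    """Return (scheme, host, port) from the Location header, or None if no redirect."""
    status, headers = parse_head(raw)
    if status not in (301, 302, 303, 307, 308) or "location" not in headers:
        return None
    url = urlsplit(headers["location"])
    # No explicit port in the URL means "use the default for this scheme".
    default_port = {"http": 80, "https": 443}.get(url.scheme)
    return url.scheme, url.hostname, url.port or default_port


def fetch_plain(host, port=80):
    """Do what telnet did: open a TCP connection, send text, read the reply."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_get_request(host))
        chunks = []
        while chunk := sock.recv(4096):
            chunks.append(chunk)
    return b"".join(chunks)
```

Feeding the response from the question into redirect_target gives the scheme https and port 443, which is exactly the part that differs from the plain HTTP connection on port 80. Calling redirect_target(fetch_plain("www.youtube.com")) would do the same thing live, assuming the server still answers with that redirect.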
But let's say you, the "client" in this case, do honor that redirection.
That "over TLS" part? That's tricky. You can't "type it in" yourself in plain-text letters. It's a pretty complex process, and "binary" (i.e. not human-readable text) almost in its entirety, depending on the version of TLS.
So you can't "fix it"; it's a security feature. Their (YouTube's and Twitch's) policy is, rightfully so, to never serve content over HTTP, so that bad guys who may be "snooping" in the middle can't observe or modify the requests and responses.
You can try other websites that don't behave like that, for example example.com.
If you want to "programmatically" connect to HTTPS servers, you can do that with any command-line HTTP client, like wget, curl, or Invoke-WebRequest in Windows PowerShell,
or with almost any programming language: the requests module in Python, the "fetch" API in JS, the Java HttpClient, and so on. Or, if you're feeling particularly cool, use a TLS wrapper and send the HTTP request over that.
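For the "TLS wrapper" route, here is a minimal sketch using Python's standard socket and ssl modules. The helper names (build_get_request, fetch_over_tls) are invented for this example; the idea is just that the ssl module does the binary handshake for you, and you send the very same text you typed into telnet through the encrypted channel.

```python
import socket
import ssl


def build_get_request(host, path="/"):
    """Return a plain-text HTTP/1.1 GET request as bytes."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close", "", ""]
    return "\r\n".join(lines).encode("ascii")


def fetch_over_tls(host, path="/", port=443):
    """Send the request over TLS (i.e. HTTPS) and return the raw response bytes."""
    # The default context verifies the server's certificate and host name.
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as tcp:
        # wrap_socket performs the TLS handshake; server_hostname is used for SNI
        # and for checking the certificate against the host name.
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            tls.sendall(build_get_request(host, path))
            chunks = []
            # "Connection: close" asks the server to close when it is done,
            # so reading until recv() returns b"" collects the whole response.
            while chunk := tls.recv(4096):
                chunks.append(chunk)
    return b"".join(chunks)
```

Something like fetch_over_tls("www.youtube.com") should then come back with an actual response instead of the redirect to HTTPS, although what exactly the server sends is up to the server. For everyday use, a proper HTTP client is the better choice, since it also handles redirects, chunked bodies, compression and so on.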
Does that make sense?
Feel free to drop any further questions below!