How does Cloudflare (or another CDN) detect direct IP address requests?

I did a DNS lookup of a website hosted through Cloudflare.

I pasted the IP address in my address bar and got a page saying:

Error 1003 Ray ID: 729ca4f4aff82e38 • 2022-07-12 20:49:14 UTC Direct IP access not allowed

My browser does the same thing, i.e. it resolves the hostname in the URL to an IP address and then sends an HTTPS request to that IP. Yet when I do it manually I get this error. How can Cloudflare detect that it's a direct IP access attempt?

CodePudding user response：

One possible way this could be done is by matching the DNS requests with the HTTP page requests.

Let's say the domain is served by my own name servers (with some sorcery to ensure the DNS requests always flow to them). I could then match those DNS requests against the incoming HTTP requests, perhaps using some combination of source IP and a device signature such as browser and OS version.

When I make a direct IP request, there is no matching DNS request from the same device, and hence the request gets rejected.

CodePudding user response：

How can Cloudflare detect that it's a direct IP access attempt?

Just based on what you (your browser) are sending!

A URL has the form http://hostname/path, where hostname can also be an IP address.

When you put that in your browser, the browser splits it into its parts and sends an HTTP request.

The HTTP protocol defines an HTTP message as headers plus an optional body. One of these headers is called Host, and its value is exactly the hostname (or IP address) that appeared in the URL.
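To make the role of the Host header concrete, here is a minimal Python sketch of how a simple client might turn a URL into request text. The function name build_request is made up for illustration, and the sketch ignores details a real browser handles (IPv6 literals, extra headers, HTTPS).

```python
from urllib.parse import urlsplit


def build_request(url: str) -> str:
    """Return the raw HTTP/1.1 request text a simple client might send for url."""
    parts = urlsplit(url)
    # The Host header carries whatever was in the URL's host position:
    # a name or an IP address, plus the port if one was given explicitly.
    host = parts.hostname or ""
    if parts.port is not None:
        host += f":{parts.port}"
    path = parts.path or "/"
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"


if __name__ == "__main__":
    # Same destination IP at the TCP level, different Host header on the wire.
    print(build_request("http://www.example.com/"))
    print(build_request("http://192.0.2.42/"))
```

The two requests differ only in the Host line, which is the part the server can inspect.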

Said differently, compare http://www.example.com/ and http://192.0.2.42/ (assuming www.example.com resolves to that IP address):

At the TCP/IP level nothing changes: in both cases, through the OS, the browser connects to IP address 192.0.2.42 (because the www.example.com from the first URL is resolved to its IP address).
When it starts the HTTP exchange, the message sent by the client has as header either host: www.example.com in the first case or host: 192.0.2.42 in the second case.
The webserver obviously sees all headers sent by the client, including this host one, and hence can do whatever it wants with it. Most importantly, it can select which website was requested if multiple websites resolve to the same IP address (if you understand the text above, you now see why the host header is necessary).

If the URLs are https:// and not just http://, there is a subtlety, because there is another layer between the TCP/IP connection and the HTTP application protocol: TLS. The equivalent of the host header is also sent at the TLS level, through what is called the SNI extension, so that the server can decide which server certificate it needs to send back to the client before even the first byte of the HTTP exchange. SNI carries hostnames only, so for an https:// URL containing an IP address clients typically send no SNI at all, which gives the server one more signal.
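The Host-header check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Cloudflare's actual implementation; the function name is_direct_ip_request is hypothetical, and a real server would also look at the TLS handshake.

```python
import ipaddress


def is_direct_ip_request(host_header: str) -> bool:
    """Return True if the Host header value is an IP literal rather than a name."""
    host = host_header.strip()
    if host.startswith("["):
        # Bracketed IPv6 literal, optionally followed by a port: "[2001:db8::1]:443"
        host = host[1:].split("]", 1)[0]
    elif host.count(":") == 1:
        # Name or IPv4 address with a port: "192.0.2.42:8080"
        host = host.rsplit(":", 1)[0]
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False  # not parseable as an IP address, so treat it as a hostname
    return True


if __name__ == "__main__":
    for value in ("www.example.com", "192.0.2.42", "192.0.2.42:8080"):
        print(value, is_direct_ip_request(value))
```

A server that finds an IP literal here has no way to know which site was meant, so answering with an error page such as the one in the question is a natural choice.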