I did a DNS lookup for a website hosted through Cloudflare.
I pasted the resulting IP address into my address bar and got a page saying:
Error 1003 Ray ID: 729ca4f4aff82e38 • 2022-07-12 20:49:14 UTC Direct IP access not allowed
My browser does the same thing, i.e. it resolves the hostname in the URL to an IP address and then sends an HTTPS request to that IP. Yet when I do it manually I get this error. How can Cloudflare detect that it's a direct IP access attempt?
CodePudding user response:
One possible way this could be done is by matching DNS requests with HTTP page requests.
Let's say the domain is served by my own name servers (with some sorcery to ensure the DNS requests always flow to them). I could then match those DNS requests against the HTTP requests arriving at the IP address, perhaps using some combination of source IP and a device signature such as browser and OS version.
When I make a direct IP request, there is no matching DNS request from the same device, and hence the request gets rejected.
CodePudding user response:
how can cloudflare detect that its a direct IP access attempt?
Just based on what you (your browser) are sending!
A URL is of the form http://hostname/path
where hostname can also be an IP address.
When you put that in your browser, the browser splits it into its parts and sends an HTTP request.
The HTTP protocol defines an HTTP message as headers plus an optional body. Among the headers, one is called host
and its value is exactly the host part of the URL.
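To make the URL-to-header mapping concrete, here is a minimal sketch in Python. The build_request helper is a hypothetical name made up for this illustration, and it assumes a plain URL without a user:password@ part. It only builds the request text, it does not send anything.

```python
from urllib.parse import urlsplit


def build_request(url: str) -> str:
    """Return the raw HTTP/1.1 request text a client could send for `url`.

    Hypothetical helper for illustration; assumes the URL has no
    user:password@ part, so netloc is just the host (plus port if given).
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {parts.netloc}\r\n"  # copied from the URL, not from DNS
        "Connection: close\r\n"
        "\r\n"
    )


print(build_request("http://www.example.com/"))
print(build_request("http://192.0.2.42/"))
```

The two printed requests differ only in the Host line, which is the only place where the server can tell a name-based request from an IP-based one.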
Said differently, between http://www.example.com/ and http://192.0.2.42/ (if www.example.com resolves to that IP address):
- at the TCP/IP level nothing changes: in both cases the browser, through the OS, connects to IP address 192.0.2.42 (because the www.example.com from the first URL is resolved to its IP address)
- when it starts the HTTP exchange, the message sent by the client has as a header either host: www.example.com in the first case or host: 192.0.2.42 in the second case
- the web server obviously sees all headers sent by the client, including this host one, and hence can do whatever it wants with it; most importantly, it can select which website was requested if multiple websites resolve to the same IP address (if you understand the text above, you now see why the host header is necessary).
If URLs are https:// and not just http://, there is a subtlety: there is another layer between the TCP/IP connection and the HTTP application protocol, which is TLS. The equivalent of the host header is also sent at the TLS level, through what is called the SNI extension, so that the server can decide which certificate to send back to the client before the first byte of the HTTP exchange.
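As a rough sketch of what such a server-side decision could look like, the Python below classifies a Host header value as a name or an IP literal. This is not Cloudflare's actual implementation; is_ip_literal and check_request are hypothetical names, and the 403/200 status codes are an assumed policy chosen for illustration.

```python
import ipaddress


def is_ip_literal(host_header: str) -> bool:
    """True if the Host header value is an IP address rather than a name."""
    host = host_header.strip()
    if host.startswith("["):
        # Bracketed IPv6 literal, optionally with a port: [2001:db8::1]:443
        host = host[1:].split("]", 1)[0]
    elif host.count(":") == 1:
        # Name or IPv4 address followed by a port: 192.0.2.42:8080
        host = host.rsplit(":", 1)[0]
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def check_request(host_header: str) -> int:
    """Hypothetical policy: refuse direct IP access, accept named hosts."""
    return 403 if is_ip_literal(host_header) else 200


print(check_request("www.example.com"))  # named host
print(check_request("192.0.2.42"))       # direct IP access
```

For HTTPS a similar check could be made even earlier, on the SNI name: RFC 6066 does not permit IP literals there, so a client connecting to a bare IP address typically sends no SNI at all.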