404 Not Found error when using the socket library in Python 3

Time:11-04

I was trying to run the following sample code in Python 3.8:

import socket
mysock = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
mysock.connect(('www.py4inf.com',80))

# send() needs bytes, not str, so I used a b'' literal (I also tried encode())
mysock.send(b'GET http://www.py4inf.com/code/romeo.txt HTTP/1.0\n\n')
while True:
    data = mysock.recv(512)
    if (len(data)) < 1:
        break
    print(data)
mysock.close()

But this returns a 404 Not Found response:

b'HTTP/1.1 404 Not Found\r\nServer: nginx\r\nDate: Tue, 02 Nov 2021 04:38:35 GMT\r\nContent-Type: text/html\r\nContent-Length: 146\r\nConnection: close\r\n\r\n<html>\r\n<head><title>404 Not Found</title></head>\r\n<body>\r\n<center><h1>404 Not Found</h1></center>\r\n<hr><center>nginx</center>\r\n</body>\r\n</html>\r\n' 

I tried the same with urllib, and it works fine:

import urllib.request
output= urllib.request.urlopen('http://www.py4inf.com/code/romeo.txt')
for line in output:
    print(line.strip())

Does anybody know how to fix this? Please help me figure out where I am going wrong in the first code block. Thanks in advance!

CodePudding user response:

mysock.send(b'GET http://www.py4inf.com/code/romeo.txt HTTP/1.0\n\n')

This is not a valid HTTP request for several reasons:

  • the request line should contain the absolute path (/code/romeo.txt), not the absolute URL
  • it should contain a Host header. This is strictly required only for HTTP/1.1, not for HTTP/1.0, but servers usually expect it anyway
  • the line endings must be \r\n, not \n

The following works:

mysock.send(b'GET /code/romeo.txt HTTP/1.0\r\nHost: www.py4inf.com\r\n\r\n')
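The corrected request above could be wrapped in a small helper so that the path, the Host header and the \r\n line endings always stay consistent. This is only a sketch: build_get_request and fetch are hypothetical names, not part of the socket library, and the timeout value is an assumption. It also uses sendall() rather than send(), since send() may transmit only part of the buffer.

```python
import socket


def build_get_request(host: str, path: str) -> bytes:
    """Build a minimal HTTP/1.0 GET request (hypothetical helper)."""
    lines = [
        f"GET {path} HTTP/1.0",  # request line: absolute path, not the full URL
        f"Host: {host}",         # Host header, expected by most servers
        "",                      # the two empty items produce the blank line
        "",                      # that terminates the header block
    ]
    # Every line ends with CRLF, never a bare LF.
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str, port: int = 80, timeout: float = 10.0) -> bytes:
    """Send the request and return the raw response (headers plus body)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = sock.recv(512)
            if not data:  # empty bytes means the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling fetch('www.py4inf.com', '/code/romeo.txt') should then behave like the one-line send() shown above, assuming the server is still reachable.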

In general: HTTP might look simple, but it is not. Don't assume how things work from looking at a bit of traffic; follow the actual standard. Even if a request seems to work initially against specific servers, it might break later.
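The same care helps when reading the reply. A response such as the 404 in the question has a defined layout: a status line, header lines, a blank line, then the body. The sketch below splits raw response bytes along those boundaries so the status code can be checked instead of printing raw chunks. split_response is a hypothetical helper, and it assumes the complete response has already been received and that the server uses \r\n line endings.

```python
def split_response(raw: bytes):
    """Split a raw HTTP response into (status, reason, headers, body).

    Hypothetical helper: assumes `raw` is the complete response and that
    the header block ends at the first blank line (CRLF CRLF).
    """
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: HTTP-version SP status-code SP reason-phrase
    parts = status_line.split(" ", 2)
    status = int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""

    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return status, reason, headers, body
```

With the response from the question, the first element of the result would be 404, which makes the failure easy to detect before any attempt to use the body.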
