Using user's input as a host in a Python-based primitive "browser"

Time:10-24

I have a trivial question. How can I use a user's input as a URL in a Python app with the socket library?

I have this code:

import socket

mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('data.pr4e.org', 80))
cmd = 'GET http://data.pr4e.org/romeo.txt HTTP/1.0\r\n\r\n'.encode()
mysock.send(cmd)

while True:
    data = mysock.recv(512)
    if len(data) < 1:
        break
    print(data.decode(),end='')

mysock.close()

And now I'm supposed to change it so that the app opens a URL provided by the user instead of the hard-coded one. I've tried many things, and each of them fails with a different error message.

The things I tried: 1)

url = input("Enter URL: ")
[...]
cmd = 'GET url HTTP/1.0\r\n\r\n'.encode()
mysock.send(cmd)

Then I tried separating my url variable from the rest of the line with commas:

cmd = 'GET', url, 'HTTP/1.0\r\n\r\n'.encode()

I also tried to be creative:

url = input("Enter URL: ")
realurl = "GET" + " " + url + " " + "HTTP/1.0\r\n\r\n'.encode()"
[...]
cmd = realurl
mysock.send(cmd)

For the sake of simplicity, let's skip the problem with the hard-coded host in mysock.connect for now: I was testing the app with a link to a page on the same host.

CodePudding user response:

cmd = 'GET http://data.pr4e.org/romeo.txt HTTP/1.0\r\n\r\n'.encode()

should be

cmd = 'GET /romeo.txt HTTP/1.0\r\n\r\n'.encode()

The domain shouldn't be included in the path. Add a Host header with data.pr4e.org as the value (the bare hostname, without http:// or a trailing slash).
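As a rough sketch of that suggestion, the request could be assembled as below. The helper name build_request is made up for illustration and is not part of the original code; it only shows where the path and the Host header go.

```python
def build_request(host, path):
    """Return a minimal HTTP/1.0 GET request as bytes (hypothetical helper)."""
    lines = [
        f'GET {path} HTTP/1.0',  # request line carries only the path
        f'Host: {host}',         # bare hostname, no scheme or slash
        '',                      # blank line ends the header section
        '',
    ]
    # Join with CRLF first, then encode the whole string once.
    return '\r\n'.join(lines).encode()


cmd = build_request('data.pr4e.org', '/romeo.txt')
print(cmd)
```

The resulting bytes can be passed to mysock.send(cmd) in place of the hard-coded request.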

CodePudding user response:

If you had printed cmd, your issue would be rather clear. The formatting was incorrect, and .encode() needs to be applied to the entire concatenated string!

  1. cmd = 'GET url HTTP/1.0\r\n\r\n'.encode() is a bytes object that contains the literal text "url", not the value of the url variable

  2. cmd = 'GET', url, 'HTTP/1.0\r\n\r\n'.encode() is a tuple (str, str, bytes)

  3. realurl = "GET" + " " + url + " " + "HTTP/1.0\r\n\r\n'.encode()" is a str whose text ends with the characters '.encode() instead of calling .encode() on the string
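To see the three cases for yourself, a small check like the following can help. It assumes a fixed example URL in place of input() so that it runs without prompting, and the names attempt1 to attempt3 are only labels for the three tries above.

```python
# Stand-in for input("Enter URL: "), assumed here for illustration.
url = 'http://data.pr4e.org/romeo.txt'

# Attempt 1: "url" is just text inside the string literal.
attempt1 = 'GET url HTTP/1.0\r\n\r\n'.encode()

# Attempt 2: the commas build a tuple; only the last item is encoded.
attempt2 = 'GET', url, 'HTTP/1.0\r\n\r\n'.encode()

# Attempt 3: .encode() sits inside the quotes, so it is never called.
attempt3 = "GET" + " " + url + " " + "HTTP/1.0\r\n\r\n'.encode()"

print(type(attempt1).__name__, repr(attempt1))
print(type(attempt2).__name__, [type(part).__name__ for part in attempt2])
print(type(attempt3).__name__, repr(attempt3))
```

Only a bytes object that actually contains the URL is something mysock.send() can use; the tuple and the str are rejected with a TypeError.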

url = input("Enter URL: ")
cmd = 'GET ' + url + ' HTTP/1.0\r\n\r\n'
# encode() converts a str to a bytes object.
# Therefore we must apply it to the entire string!
mysock.send(cmd.encode())
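The question set aside the hard-coded host in mysock.connect. One possible way to handle it is to split the user's URL with urllib.parse.urlsplit from the standard library and use the pieces for both the connection and the request. The function names parse_http_url and fetch below are invented for this sketch, and it assumes the user types a full http:// URL; without the scheme, urlsplit does not find a hostname.

```python
import socket
from urllib.parse import urlsplit


def parse_http_url(url):
    """Split an http:// URL into (host, port, path). Hypothetical helper."""
    parts = urlsplit(url)
    if not parts.hostname:
        raise ValueError('expected a URL such as http://host/path')
    path = parts.path or '/'      # an empty path means the site root
    if parts.query:
        path += '?' + parts.query
    return parts.hostname, parts.port or 80, path


def fetch(url):
    """Send an HTTP/1.0 GET for url and return the raw response bytes."""
    host, port, path = parse_http_url(url)
    request = f'GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n'.encode()
    chunks = []
    with socket.create_connection((host, port)) as sock:
        sock.sendall(request)     # sendall keeps sending until all is sent
        while True:
            data = sock.recv(512)
            if not data:          # empty bytes: the server closed the socket
                break
            chunks.append(data)
    return b''.join(chunks)


# Usage (needs network access):
# print(fetch(input('Enter URL: ')).decode(errors='replace'), end='')
```

Collecting the chunks and decoding once at the end avoids decode errors when a multi-byte character is split across two recv() calls.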