I need to build an availability checker for an arbitrary number of sites. How can I make the script read the URLs from a .txt file and loop over each of them?
My approximate (non-working) code:
import requests

urls_list = open('G:\\urls_list.txt', 'r')
for url in urls_list:
    response = requests.get(url)
    if response.status_code != 200:
        print('{} is not active'.format(url))
CodePudding user response:
This happens because, when iterating with for url in urls_list:, each url string ends with a newline character \n. Use str.rstrip to remove it. Also, prefer a context manager when reading files (with open("urls.txt") as f:); it handles closing the file for you.
import requests

with open("urls.txt") as f:
    for url in map(str.rstrip, f):
        print("-" * 34)
        print(f"{url = }")
        try:
            response = requests.get(url)
        except requests.ConnectionError as e:
            print(e)
            continue
        print(f"{response.status_code = }")
----------------------------------
url = 'https://github.com/'
response.status_code = 200
----------------------------------
url = 'https://githubb.com/'
HTTPSConnectionPool(host='githubb.com', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f26a6758f70>: Failed to establish a new connection: [Errno -2] Name or service not known'))
----------------------------------
url = 'https://www.gnu.org/software/bash/manual/bash.html'
response.status_code = 200
----------------------------------
url = 'https://www.gnu.org/software/bash/manual/bashh.html'
response.status_code = 404
----------------------------------
url = 'https://stackoverflow.com/'
response.status_code = 200
----------------------------------
url = 'https://stackoverfloww.com/'
response.status_code = 200