Blacklist some domains from requests sessions in Python

I am using requests sessions, and I want requests to some domains to bypass the session's connection pool. Is there any way to do this? Also, is the following the correct way to do it?

blacklist = ["google.com"]
blacklisted = False

def somefunction(imageurl):
    for domain in blacklist:
        if image_url.find(domain)!=-1:
            response = requests.get(image_url) 
            blacklisted = True
            break

    if blacklisted == False:
        response = session.get(image_url)

CodePudding user response：

Since you are working with a session, you could define an adapter for all your blacklisted URLs, something like this:

from requests.adapters import BaseAdapter
from requests.sessions import Session


blacklist = [
    'https://google.com',
    'http://gitlab.com',
] # note, the url scheme is needed


class BlacklistedDomainError(Exception):
    def __init__(self, message=None):
        if message is None:
            message = 'Blacklisted domain'
        super(BlacklistedDomainError, self).__init__(message)


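class BlacklistAdapter(BaseAdapter):
    def send(
        self,
        request,
        stream=False,
        timeout=None,
        verify=True,
        cert=None,
        proxies=None,
    ):
        # Refuse every request that the session routes to this adapter
        raise BlacklistedDomainError()

    def close(self):
        # Nothing to clean up; defined so that session.close() does not
        # reach BaseAdapter.close(), which raises NotImplementedError
        pass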


session = Session()


for prefix in blacklist:
    # mount() expects an adapter instance, not the class itself
    session.mount(prefix, BlacklistAdapter())


def somefunction(imageurl):
    try:
        response = session.get(imageurl)
    except BlacklistedDomainError:
        return None
    return response

I included a custom exception because it is good practice, but you could raise any exception.

Be careful: I used the word "domain" here because it is the term you used, but you are really mounting the adapter for a prefix. The session checks whether the URL it is about to request starts with that string, which is why the URL scheme is important.
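
To make the prefix behaviour concrete, here is a minimal sketch that needs no network access. It is a simplified stand-in for the check the session performs (a case-insensitive "starts with" test), not the library's own code; `matches_prefix` and `is_blocked` are hypothetical helper names introduced only for this illustration.

```python
# Prefixes as mounted in the answer above (scheme included)
blacklist = ['https://google.com', 'http://gitlab.com']


def matches_prefix(url, prefix):
    """Simplified stand-in for the session's check: case-insensitive startswith."""
    return url.lower().startswith(prefix.lower())


def is_blocked(url):
    """Return True if any mounted prefix matches the start of the URL."""
    return any(matches_prefix(url, prefix) for prefix in blacklist)


print(is_blocked('https://google.com/search?q=x'))    # True
print(is_blocked('http://google.com/'))               # False: the scheme differs
print(is_blocked('https://www.google.com/'))          # False: the subdomain is not covered
print(is_blocked('https://google.com.example.org/'))  # True: same prefix, different host
```

The last two cases show why a prefix is not the same thing as a domain: subdomains need their own prefix, and adding a trailing slash to a prefix (for example 'https://google.com/') would avoid matching unrelated hosts that merely begin with the same characters.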

See the requests documentation for more information on transport adapters.

CodePudding user response：

import requests

blacklist = ["google.com"]
session = requests.Session()

def somefunction(image_url):  # make sure the argument name matches the one used in the body
    # Check whether any blacklisted domain appears in the URL, without an explicit for loop
    if any(domain in image_url for domain in blacklist):
        # Blacklisted: use a plain request so the session's pool is bypassed
        response = requests.get(image_url)
    else:
        response = session.get(image_url)
    return response
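
Matching on the raw URL string can misfire, for instance when a blacklisted name shows up in a query string or as part of a longer host name. One possible refinement, sketched here under the assumption that the blacklist holds bare host names, is to compare against the parsed host instead. `BLACKLISTED_HOSTS` and `is_blacklisted` are hypothetical names, not part of requests.

```python
from urllib.parse import urlparse

# Assumed example value: bare host names, no scheme
BLACKLISTED_HOSTS = {"google.com"}


def is_blacklisted(url):
    """Return True if the URL's host is a blacklisted domain or one of its subdomains."""
    host = urlparse(url).hostname or ""  # hostname is None when the URL has no scheme
    return any(
        host == domain or host.endswith("." + domain)
        for domain in BLACKLISTED_HOSTS
    )


print(is_blacklisted("https://www.google.com/logo.png"))         # True
print(is_blacklisted("https://example.com/?ref=google.com"))     # False
print(is_blacklisted("https://notgoogle.com/logo.png"))          # False
```

A caller could then choose `requests.get(url)` when `is_blacklisted(url)` is true and `session.get(url)` otherwise, which keeps the routing decision in one place.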