How do I scrape the data behind each personal link listed on a webpage using Python?

Time:01-09

I am trying to collect the details of each lawyer listed at https://chambers.com/all-lawyers-asia-pacific-8. About 5,000 lawyers are listed, but their details are on the individual profile pages linked from that page. I have no problem scraping a single web page, but visiting each lawyer's profile page and scraping it one at a time would take forever. Is there a way to loop this process?

I'm not sure how to approach this. I have been tasked with collecting each lawyer's name, the link to their profile, their law firm, and their ranks.

CodePudding user response:

I recommend using threading to speed up the process. The site may block you if you send too many requests. In that case, use a different user agent for each thread, or route the requests through Tor or a VPN.
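As a rough illustration of the threading idea, here is a minimal sketch that downloads a list of profile URLs with a small thread pool from the standard library. The names fetch, fetch_all, USER_AGENT and the worker count are my own assumptions, not anything the site requires, and you still need to build the list of profile URLs and parse the returned HTML yourself.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Hypothetical user agent string; replace it with whatever suits your setup.
USER_AGENT = "Mozilla/5.0 (compatible; profile-scraper-sketch)"


def fetch(url, timeout=30):
    """Download one page and return its HTML as text."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_all(urls, fetch_one=fetch, max_workers=8):
    """Fetch every URL in a thread pool and return {url: html or None}."""
    urls = list(urls)  # allow generators, and keep the order for zip() below

    def task(url):
        try:
            return fetch_one(url)
        except Exception:
            # A failed download is recorded as None so it can be retried later.
            return None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pages = list(pool.map(task, urls))  # map() preserves input order
    return dict(zip(urls, pages))
```

You would call it as pages = fetch_all(profile_urls) and then parse each pages[url] for the fields you need. Keeping max_workers small lowers the chance of being blocked for sending too many requests at once.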

CodePudding user response:

Use Selenium WebDriver. Call find_elements(By.XPATH, ...) to find the link elements for all records, collect their href attributes into a list, and loop over that list. Inside the loop, open each details page with driver.get(href_value) and scrape the information you need from it.
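The loop described above could look roughly like this. It is only a sketch: the two XPath expressions are hypothetical placeholders (inspect the real pages and replace them), the helper names are mine, and it extracts only the name, so the firm and ranks would need their own locators. The helpers take the driver as an argument and selenium is imported only inside main().

```python
# "xpath" is the string value behind selenium's By.XPATH constant, so these
# helpers work with a real WebDriver without importing selenium at module level.
XPATH = "xpath"

# Hypothetical locators: inspect the real pages and replace them.
PROFILE_LINKS_XPATH = "//a[contains(@href, '/lawyer/')]"
NAME_XPATH = "//h1"


def collect_profile_links(driver, listing_url):
    """Open the listing page and return the unique profile URLs found on it."""
    driver.get(listing_url)
    links = driver.find_elements(XPATH, PROFILE_LINKS_XPATH)
    # Read the hrefs now: the elements go stale once the driver navigates away.
    hrefs = [link.get_attribute("href") for link in links]
    return list(dict.fromkeys(h for h in hrefs if h))  # dedupe, keep order


def scrape_profile(driver, profile_url):
    """Open one profile page and pull out the fields of interest."""
    driver.get(profile_url)
    headings = driver.find_elements(XPATH, NAME_XPATH)
    name = headings[0].text.strip() if headings else None
    return {"url": profile_url, "name": name}


def scrape_listing(driver, listing_url):
    """Collect the links first, then visit each profile page in turn."""
    return [scrape_profile(driver, url)
            for url in collect_profile_links(driver, listing_url)]


def main():
    from selenium import webdriver  # third-party: pip install selenium

    driver = webdriver.Chrome()
    try:
        listing = "https://chambers.com/all-lawyers-asia-pacific-8"
        for record in scrape_listing(driver, listing):
            print(record)
    finally:
        driver.quit()
```

Calling main() starts a Chrome session and prints one dictionary per profile. Collecting all href values before navigating matters, because the link elements from the listing page can no longer be used after driver.get() loads another page.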
