How to loop through list of URL and download specific data from them in python BeautifulSou

Time:12-02

Here I have a list of URLs, and I am trying to get the "td" elements from all of them, but I am only able to get the last URL's HTML.

import numpy as np
import pandas as pd
from datetime import datetime
import pytz
import requests
import json
from bs4 import BeautifulSoup


url_list = ['https://www.coingecko.com/en/coins/ethereum/historical_data/usd?start_date=2021-08-06&end_date=2021-09-05#panel',
            'https://www.coingecko.com/en/coins/cardano/historical_data/usd?start_date=2021-08-06&end_date=2021-09-05#panel',
            'https://www.coingecko.com/en/coins/chainlink/historical_data/usd?start_date=2021-08-06&end_date=2021-09-05#panel']


for link in range(len(url_list)):

    response = requests.get(url_list[link])
    src = response.content
    soup = BeautifulSoup(response.text , 'html.parser')

res1 = soup.find_all( "td", class_ = "text-center")
res1

Could anyone please help me get the data from all of the URLs?

CodePudding user response:

You are overwriting your soup variable on each iteration of the loop, and find_all is only called once, after the loop has finished. So instead of saving the results from each URL and then looping over those, you only get the result for the final URL.

  1. Create a variable before the loop to store the results of each iteration.
  2. Append the result of each iteration to that variable.
  3. Create a new loop to work with your stored data.
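
The difference between overwriting and accumulating can be seen without any network access. The following is only a minimal sketch: fake_fetch is a hypothetical stand-in for requests.get(...).text, and the URLs are placeholders.

```python
# Hypothetical stand-in for requests.get(url).text so the sketch runs offline.
def fake_fetch(url):
    return f"<page for {url}>"

urls = ["a", "b", "c"]  # placeholder values, not real addresses

# Overwriting: each pass replaces `page`, so only the last value survives.
for url in urls:
    page = fake_fetch(url)
last_only = page

# Accumulating: each pass appends to the list, so every value is kept.
pages = []
for url in urls:
    pages.append(fake_fetch(url))

print(last_only)   # only the page for the last URL
print(len(pages))  # one entry per URL
```

The first loop mirrors the code in the question; the second mirrors the three steps above.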

You can also iterate directly over the elements of a list instead of indexing into it:

for url in url_list:
    response = requests.get(url)
    # rest of code
    

This is easier to read than indexing with range(len(url_list)).

So, putting the steps together:

# empty list to store all results
results = []

# loop over the URLs and keep the matching cells from each page
for url in url_list:
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    results.append(soup.find_all("td", class_="text-center"))

# Accessing the data from the results
for result in results:
    print(result)
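
Printing each result shows a list of tag objects rather than plain values. If you want the cell text, and want to know which URL it came from, something like the following may help. It is a sketch under assumptions: the HTML snippets and example.com URLs are made up so it runs offline, extract_cells is a hypothetical helper name, and the real pages may be structured differently.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for downloaded pages (assumption, not real data).
pages = {
    "https://example.com/coin-a": (
        '<table><tr><td class="text-center">1.0</td>'
        '<td class="text-center"> 2.5 </td></tr></table>'
    ),
    "https://example.com/coin-b": (
        '<table><tr><td class="text-center">3.0</td>'
        '<td>ignored</td></tr></table>'
    ),
}

def extract_cells(html):
    """Return the stripped text of every <td class="text-center"> in html."""
    soup = BeautifulSoup(html, "html.parser")
    cells = soup.find_all("td", class_="text-center")
    return [td.get_text(strip=True) for td in cells]

# Key the results by URL so each list of cells can be traced to its page.
cells_by_url = {url: extract_cells(html) for url, html in pages.items()}

for url, cells in cells_by_url.items():
    print(url, cells)
```

With real pages you would fill the dictionary from response.text inside the loop over url_list instead of using hard-coded strings.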