I just scraped the links from a webpage. I need to loop over all of the URLs, so I want one URL per list index. But after I put the URLs in a list and try to get a single URL, I get a single character of every URL instead of the complete URL. Here is the code:
import pandas as pd
from bs4 import BeautifulSoup
import requests
from csv import writer
url = 'https://www.pwcs.edu/index'
req = requests.get(url)
soup = BeautifulSoup(req.text, 'lxml')
lists = soup.find('div', class_='school-dropdown-col')
for link in lists.find_all('a', href=True):
    x = list([link['href']])
    print(x[0][3])
CodePudding user response:
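Append all of the links to one list first; afterwards you can access any index in that list. In your code, x is a new one-element list on every iteration, so x[0] is the whole URL string and x[0][3] is only the fourth character of that string. In my example I am printing the first URL in the list: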
import pandas as pd
from bs4 import BeautifulSoup
import requests
from csv import writer
url = 'https://www.pwcs.edu/index'
req = requests.get(url)
soup = BeautifulSoup(req.text, 'lxml')
lists = soup.find('div', class_='school-dropdown-col')
all_links = []
for link in lists.find_all('a', href=True):
    all_links.append(link['href'])
print(all_links[0])
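To loop over every collected URL rather than only the first, something like the sketch below may help. It does not touch the network: the hrefs list is a hypothetical stand-in for the link['href'] values from the page, and resolving relative links with urljoin is an assumption about what the page returns, not something the original code does.

```python
from urllib.parse import urljoin

base_url = 'https://www.pwcs.edu/index'

# Hypothetical href values standing in for link['href'] from the scraped page.
hrefs = ['/schools/alpha', 'https://example.org/beta']

all_links = []
for href in hrefs:
    # urljoin resolves relative hrefs against base_url and keeps absolute URLs as they are.
    all_links.append(urljoin(base_url, href))

first = all_links[0]    # one index on the list gives the whole URL string
char = all_links[0][3]  # a second index picks a single character of that string

# Visit each URL together with its index in the list.
for i, link in enumerate(all_links):
    print(i, link)
```

A single index on the list gives a full URL; a second index goes into the string itself, which is where the single characters in the question came from.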