How do I get these outputs in a single list or dictionary?

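from bs4 import BeautifulSoup
import requests

# Parse the saved HTML file and collect every <a> tag
with open("htmlviewer.html") as fp:
    soup = BeautifulSoup(fp, "html.parser")
    gp = soup.find_all("a")

for link in gp:
    bs = link.get('href')  # None when the tag has no href attribute
    print(bs)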


I am using this code to extract links from a saved HTML source file, and here is my output:

None

https://support.google.com/websearch/answer/181196?hl=en-IN

None

https://www.google.com/webhp?hl=en&ictx=2&sa=X&ved=0ahUKEwj88YTzkL_7AhX9TGwGHZQpBVEQPQgJ

https://chromedriver.chromium.org/

/search?rlz=1C1CHBD_enIN1032IN1032&sxsrf=ALiCzsZzV82nGh7PsFzltlGMqVaKe-JR2Q:1669028827453&q=What is a Chrome WebDriver?&sa=X&ved=2ahUKEwj88YTzkL_7AhX9TGwGHZQpBVEQzmd6BAgUEAU

https://www.selenium.dev/documentation/webdriver/getting_started/install_drivers/

https://splinter.readthedocs.io/en/latest/drivers/chrome.html

https://subscription.packtpub.com/book/web-development/9781784392512/1/ch01lvl1sec15/setting-up-chromedriver-for-google-chrome

I want all the links in a single list or dictionary.

If I do this:

bs = {link.get('href')}

I get every single link in its own new dictionary. Can anyone help? I am new to coding.

Also, how do I select only the links starting with https and ignore the ones starting with /search? I know these are very basic questions, but I am only a week into learning Python.

CodePudding user response:

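First create an empty list outside the for loop, e.g. links = [], and then inside the loop call links.append(link.get("href")). Because the list is created once, before the loop starts, every href ends up in the same list instead of being overwritten on each iteration.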
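
Here is a minimal sketch of that approach, which also covers the follow-up question about keeping only the https links. It assumes an inline SAMPLE_HTML string in place of htmlviewer.html so that it runs on its own; the helper names collect_links and keep_https are made up for illustration and are not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for htmlviewer.html.
SAMPLE_HTML = """
<a>no href here</a>
<a href="https://chromedriver.chromium.org/">ChromeDriver</a>
<a href="/search?q=What+is+a+Chrome+WebDriver">Search</a>
<a href="https://splinter.readthedocs.io/en/latest/drivers/chrome.html">Splinter</a>
"""


def collect_links(soup):
    """Return the href of every <a> tag; tags without an href contribute None."""
    links = []  # created once, outside the loop
    for link in soup.find_all("a"):
        links.append(link.get("href"))
    return links


def keep_https(links):
    """Keep only hrefs that are present and start with "https"."""
    return [href for href in links if href is not None and href.startswith("https")]


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
all_links = collect_links(soup)
https_links = keep_https(all_links)

print(all_links)    # every href, including None and the /search entry
print(https_links)  # only the absolute https links
```

To run it against the real file, build soup inside the with open("htmlviewer.html") block from the question instead of using SAMPLE_HTML. As a side note, {link.get('href')} creates a new one-element set (not a dictionary) on every pass through the loop, which is why each link showed up separately.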