How to make a list out of a BS4 output

Time:03-06

I have this code right now:

from bs4 import BeautifulSoup
import requests

get = requests.get("https://solmfers-minting-site.netlify.app/")
soup = BeautifulSoup(get.text, 'html.parser')
for i in soup.find_all('script'):
    print(i.get('src'))

I need to turn the output into a list and remove the None values from it, since it currently prints this:

jquery.js
nicepage.js
None
None
/static/js/2.c20455e8.chunk.js
/static/js/main.87864e1d.chunk.js

CodePudding user response:

Just append your extracted values to a list.

result = []
for i in soup.find_all('script'):
    elem = i.get('src')
    if elem is not None:
        result.append(elem)

Or using a list comprehension:

result = [x['src'] for x in soup.find_all('script') if x.get('src') is not None]
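To try this without hitting the network, here is a minimal sketch that runs the same filter on an inline HTML string. The SAMPLE_HTML markup and the script_sources helper are made up for illustration; the markup only imitates the script tags implied by the output in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: four external scripts and two inline ones (no src).
SAMPLE_HTML = """
<html><head>
<script src="jquery.js"></script>
<script src="nicepage.js"></script>
<script>console.log("inline");</script>
<script type="application/ld+json">{}</script>
<script src="/static/js/2.c20455e8.chunk.js"></script>
<script src="/static/js/main.87864e1d.chunk.js"></script>
</head><body></body></html>
"""


def script_sources(html):
    """Return the src of every <script> tag that has one, in document order."""
    soup = BeautifulSoup(html, "html.parser")
    # tag.get('src') returns None for inline scripts, so those are skipped.
    return [tag["src"] for tag in soup.find_all("script") if tag.get("src") is not None]


result = script_sources(SAMPLE_HTML)
print(result)
```

With the real page, you would pass get.text from the question instead of SAMPLE_HTML.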

CodePudding user response:

You're close to your goal, but select your elements more specifically and append the src to a list while iterating over your ResultSet:

data = []

for i in soup.find_all('script', src=True):
    data.append(i.get('src'))
  

Alternatively, with CSS selectors:

for i in soup.select('script[src]'):
    data.append(i.get('src'))

And, as already mentioned, with a list comprehension:

data = [i.get('src') for i in soup.select('script[src]')]

Output

['jquery.js', 'nicepage.js', '/static/js/2.c20455e8.chunk.js', '/static/js/main.87864e1d.chunk.js']
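As a quick check that the two selection styles above pick the same tags, the following sketch compares them on a small inline document. The html string is an invented stand-in for the real page, and the variable names by_keyword and by_selector are only illustrative.

```python
from bs4 import BeautifulSoup

# Invented stand-in for the page: two external scripts around one inline script.
html = """
<script src="jquery.js"></script>
<script>var inline = true;</script>
<script src="/static/js/main.87864e1d.chunk.js"></script>
"""

soup = BeautifulSoup(html, "html.parser")

# Keyword filter: keep only <script> tags that carry a src attribute.
by_keyword = [tag["src"] for tag in soup.find_all("script", src=True)]

# CSS attribute selector: script[src] matches the same tags.
by_selector = [tag["src"] for tag in soup.select("script[src]")]

print(by_keyword)
print(by_keyword == by_selector)
```

Both variants filter at selection time, so no None check is needed afterwards.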