I have this code right now:
from bs4 import BeautifulSoup
import requests
get = requests.get("https://solmfers-minting-site.netlify.app/")
soup = BeautifulSoup(get.text, 'html.parser')
for i in soup.find_all('script'):
    print(i.get('src'))
And I need to somehow turn the output into a list and remove the None
values from it since it outputs it like this:
jquery.js
nicepage.js
None
None
/static/js/2.c20455e8.chunk.js
/static/js/main.87864e1d.chunk.js
CodePudding user response:
Just append your extracted values to a list.
result = []
for i in soup.find_all('script'):
    elem = i.get('src')
    if elem is not None:
        result.append(elem)
Or using a list comprehension:
result = [x['src'] for x in soup.find_all('script') if x.get('src') is not None]
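The filtering step itself doesn't depend on BeautifulSoup, so here is a minimal sketch of it on a plain list. The name `raw` is made up for illustration and just holds the values from the question's output:

```python
# Hypothetical stand-in for the values printed in the question.
raw = [
    "jquery.js",
    "nicepage.js",
    None,
    None,
    "/static/js/2.c20455e8.chunk.js",
    "/static/js/main.87864e1d.chunk.js",
]

# Keep everything that is not None (an empty string would be kept).
result = [src for src in raw if src is not None]

# filter(None, ...) drops every falsy value, so an empty string goes too.
truthy_only = list(filter(None, raw + [""]))

print(result)
```

Both variants give the same list here. They only differ if a tag has an empty `src=""`, so pick whichever matches what you want to keep.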
CodePudding user response:
You're close to your goal, but select your elements more specifically and append the src
to a list while iterating over your ResultSet:
data = []
for i in soup.find_all('script', src=True):
    data.append(i.get('src'))
Alternative with CSS selectors:
data = []  # start from an empty list so earlier results are not duplicated
for i in soup.select('script[src]'):
    data.append(i.get('src'))
And, as already mentioned, with a list comprehension:
[i.get('src') for i in soup.select('script[src]')]
Output
['jquery.js', 'nicepage.js', '/static/js/2.c20455e8.chunk.js', '/static/js/main.87864e1d.chunk.js']
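If you want to compare the approaches without hitting the site, here is a small offline sketch. The `SAMPLE_HTML` string is made up for illustration (it is not the real page, just markup shaped like the question's output), and the variable names are placeholders:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: four external scripts and two without a src attribute.
SAMPLE_HTML = """
<html><head>
<script src="jquery.js"></script>
<script src="nicepage.js"></script>
<script>console.log("inline");</script>
<script type="application/ld+json">{}</script>
</head><body>
<script src="/static/js/2.c20455e8.chunk.js"></script>
<script src="/static/js/main.87864e1d.chunk.js"></script>
</body></html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# 1) Fetch every script tag, then skip the ones without a src.
with_filter = [t["src"] for t in soup.find_all("script") if t.get("src") is not None]

# 2) Let find_all only return tags that have a src attribute.
with_kwarg = [t["src"] for t in soup.find_all("script", src=True)]

# 3) Same idea, expressed as a CSS attribute selector.
with_css = [t["src"] for t in soup.select("script[src]")]

print(with_filter == with_kwarg == with_css)
```

On this sample all three lists are identical, so the choice is mostly about readability. Selecting with `src=True` or `script[src]` just moves the None check into the query.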