I am trying to scrape only the vulnerabilities published on the current day from nvd.nist.gov. In the example below, the URL for CVE-2022-25890 has a published date of 01/09/2023, but with "if currentday in date" or "if currentday == date" I only get "no match". Why?
import requests
from bs4 import BeautifulSoup
url = "https://nvd.nist.gov/vuln/detail/CVE-2022-25890"
r = requests.get(url)
soup = BeautifulSoup(r.content, "html.parser")
VDP = soup.find(id="vulnDetailPanel")
date = VDP.find_all("span", attrs={'data-testid':'vuln-published-on'})
print(date)
currentday = '"vuln-published-on">01/09/2023'
if currentday in date:
    print("match")
else:
    print("no match")
I was expecting this: [01/09/2023] match
Instead I only get this: [01/09/2023] no match
I also changed currentday to '01/09/2023' or '[01/09/2023]', with the same outcome.
CodePudding user response:
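.find_all() returns a list-like ResultSet of Tag objects, so "currentday in date" compares your string against each Tag rather than against its text, and that comparison never succeeds. Use .find() to get only the first element (instead of .find_all()) and also use its .text property: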
import requests
from bs4 import BeautifulSoup
url = "https://nvd.nist.gov/vuln/detail/CVE-2022-25890"
r = requests.get(url)
soup = BeautifulSoup(r.content, "html.parser")
VDP = soup.find(id="vulnDetailPanel")
date = VDP.find("span", attrs={"data-testid": "vuln-published-on"}).text # <-- use find() and .text
currentday = "01/09/2023"
if currentday in date:
    print("match")
else:
    print("no match")
Prints:
match
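Since the goal is to match the current day rather than a hard-coded string, the comparison can also be done on real date values. The following is only a sketch: is_published_today is a hypothetical helper name, and it assumes the text of the span is in MM/DD/YYYY form, as the 01/09/2023 value above suggests. It takes the string you got from .text and uses only the standard library:

```python
from datetime import date, datetime


def is_published_today(published_text, today=None):
    """Return True if a 'MM/DD/YYYY' string (assumed format) equals today's date.

    `today` can be passed explicitly, which makes the check easy to test;
    by default the local current date is used.
    """
    if today is None:
        today = date.today()
    # strip() guards against stray whitespace around the span's text
    published = datetime.strptime(published_text.strip(), "%m/%d/%Y").date()
    return published == today


# Example with a fixed "today" so the result does not depend on when it runs
print(is_published_today("01/09/2023", today=date(2023, 1, 9)))     # True
print(is_published_today(" 01/09/2023 ", today=date(2023, 1, 10)))  # False
```

In the script above you would call is_published_today(date) after the .find(...).text line (and ideally rename that variable so it does not shadow datetime.date). Parsing the string raises ValueError if the page ever shows a different format, which is usually easier to notice than a silent "no match".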