Getting an empty list while web scraping

Time:10-04

I want to scrape data from a website that generates its cards with JavaScript. This is what I tried (I am using ChromeDriver, for reference):

from bs4 import BeautifulSoup

# driver is a Selenium Chrome WebDriver; url is the page being scraped
Title = []
driver.get(url)
content = driver.page_source
soup = BeautifulSoup(content,'html.parser')
for a in soup.findAll('div',attrs={'class':'card-body text-center'}):
    title = a.find('h4',attrs={'class':'card-title'})
    Title.append(title.text)
print(Title)

My list Title comes back empty. The HTML I am trying to scrape looks like this:

<div class="card-body text-center">
<h4 class="card-title">
Title</h4>
<h7>Risen_star</h7>
<br> 
<h7>2022</h7>
<p style="height:57px" >
hi
</p>
<a href="/details" >
Read More
</a>
</div>

CodePudding user response:

It looks like things are working fine:

from bs4 import BeautifulSoup

Title = []
content = """<div class="card-body text-center">
<h4 class="card-title">
Title</h4>
<h7>Risen_star</h7>
<br> 
<h7>2022</h7>
<p style="height:57px" >
hi
</p>
<a href="/details" >
Read More
</a>
</div>
"""
soup = BeautifulSoup(content, 'html.parser')
# Find every card, then read the heading inside it
for a in soup.find_all('div', attrs={'class': 'card-body text-center'}):
    title = a.find('h4', attrs={'class': 'card-title'})
    Title.append(title.text)
print(Title)

This results in:

$ python3 test.py    
['\nTitle']

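Maybe your content variable is not being populated with the rendered HTML. Because the cards are generated by JavaScript, driver.page_source may be read before the cards exist in the DOM.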
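
If timing is the cause, one common approach is an explicit Selenium wait before reading page_source. The following is only a sketch under assumptions: the function names fetch_rendered_html and extract_titles are hypothetical, the CSS selector is derived from the class names in the question, and the 10-second timeout is arbitrary.

```python
from bs4 import BeautifulSoup

# Selector built from the class names used in the question.
CARD_TITLE_SELECTOR = "div.card-body.text-center h4.card-title"


def fetch_rendered_html(url, timeout=10):
    """Load url in Chrome and return the HTML once a card title exists."""
    # Imported here so extract_titles() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Block until the first card title is in the DOM; raises
        # TimeoutException if nothing appears within `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, CARD_TITLE_SELECTOR))
        )
        return driver.page_source
    finally:
        driver.quit()


def extract_titles(html):
    """Return the stripped text of every card title found in html."""
    soup = BeautifulSoup(html, "html.parser")
    return [h4.get_text(strip=True) for h4 in soup.select(CARD_TITLE_SELECTOR)]


# Usage (the URL is a placeholder; the question does not give the real one):
#   html = fetch_rendered_html("https://example.com/cards")
#   print(extract_titles(html))
```

If the wait times out, the cards may be inside an iframe or use different class names than expected, which is worth checking in the browser's developer tools.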

CodePudding user response:

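from bs4 import BeautifulSoup

# content holds the page HTML, as in the previous answer
data = []
soup = BeautifulSoup(content, 'html.parser')
for a in soup.find_all('div', attrs={'class': 'card-body text-center'}):
    element = {}
    title = a.find('h4', attrs={'class': 'card-title'})
    sub_title = a.find('h7')
    # .text keeps surrounding whitespace, so strip it
    element['title'] = title.text.strip()
    element['sub_title'] = sub_title.text.strip()
    data.append([element])
print(data)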

Something like the above collects each card's fields into a dictionary, reads them with .text, and appends the result to a list. It prints:

[[{'title': 'Title', 'sub_title': 'Risen_star'}]]
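
A variation on this answer, offered as a sketch rather than the answerer's own method: it returns a flat list of dictionaries and stores None when a card lacks a heading or sub-title, so a missing element does not raise AttributeError. The function name extract_cards and the sample HTML are hypothetical; the class names come from the question.

```python
from bs4 import BeautifulSoup


def extract_cards(html):
    """Return one dict per card, with None for any field that is absent."""
    soup = BeautifulSoup(html, "html.parser")
    cards = []
    for card in soup.select("div.card-body.text-center"):
        title = card.find("h4", class_="card-title")
        sub_title = card.find("h7")  # first <h7> inside the card
        cards.append({
            "title": title.get_text(strip=True) if title is not None else None,
            "sub_title": (
                sub_title.get_text(strip=True) if sub_title is not None else None
            ),
        })
    return cards


# Hypothetical sample: the second card has no <h4 class="card-title">.
sample = """<div class="card-body text-center">
<h4 class="card-title">
Title</h4>
<h7>Risen_star</h7>
</div>
<div class="card-body text-center">
<h7>No heading here</h7>
</div>"""

for card in extract_cards(sample):
    print(card)
```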