For example, I have the following HTML code:
...
<tr data-year="Month">...</tr>
<tr data-year="Month">...</tr>
<tr data-year="Month">...</tr>
...
<tr data-year="Month">
<td title="" data-x-key="name">June</td>
<td title="" data-x-key="volume">100</td>
<td title="" data-x-key="date">06/27/2022</td>
</tr>
...
<tr data-year="Month">...</tr>
...
I have parsing code, but I want to change it. My question is: how can I select the cells by their data-x-key attribute
instead of repeatedly chaining find_next('td', class_='month')?
...
soup = BeautifulSoup(html, 'html.parser')
item = soup.find_all('tr', class_='main')
data = []
for i in item:
    data.append({
        'name': i.find('td', class_='month').get_text(),
        'volume': i.find('td', class_='month').find_next('td', class_='month').get_text(),
        'date': i.find('td', class_='month').find_next('td', class_='month').find_next('td',
            class_='month').get_text()
    })
print(data)
...
CodePudding user response:
Try CSS attribute selectors:
html='''
<tr data-year="Month">
<td title="" data-x-key="name">June</td>
<td title="" data-x-key="volume">100</td>
<td title="" data-x-key="date">06/27/2022</td>
</tr>
'''
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
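# The sample HTML above has no class="main", so match the rows by their data-year attribute
item = soup.select('tr[data-year]')
#print(item)
data = []
for i in item:
    data.append({
        # select_one returns the first <td> whose data-x-key attribute has the given value
        'name': i.select_one('td[data-x-key="name"]').get_text(),
        'volume': i.select_one('td[data-x-key="volume"]').get_text(),
        'date': i.select_one('td[data-x-key="date"]').get_text()
    })
print(data)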
Output:
[{'name': 'June', 'volume': '100', 'date': '06/27/2022'}]
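If you do not want to repeat one selector per key, a possible variation is to build each dict straight from the data-x-key attributes. This is only a sketch: the helper name rows_to_dicts is made up for illustration, the second row (July) is invented sample data, and it assumes every cell you care about carries a data-x-key attribute.

```python
from bs4 import BeautifulSoup

# Sample markup: the first row is from the question, the second row is invented for illustration
html = '''
<table>
<tr data-year="Month">
<td title="" data-x-key="name">June</td>
<td title="" data-x-key="volume">100</td>
<td title="" data-x-key="date">06/27/2022</td>
</tr>
<tr data-year="Month">
<td title="" data-x-key="name">July</td>
<td title="" data-x-key="volume">250</td>
<td title="" data-x-key="date">07/27/2022</td>
</tr>
</table>
'''

def rows_to_dicts(markup):
    """Hypothetical helper: one dict per <tr>, keyed by each cell's data-x-key value."""
    soup = BeautifulSoup(markup, 'html.parser')
    rows = []
    for tr in soup.select('tr[data-year]'):
        cells = tr.select('td[data-x-key]')
        if cells:  # skip rows that have no keyed cells
            rows.append({td['data-x-key']: td.get_text(strip=True) for td in cells})
    return rows

data = rows_to_dicts(html)
print(data)
```

With this approach a new column only needs a new data-x-key in the HTML, not a new line in the parser. The trade-off is that a row missing a cell simply yields a dict without that key, so check for missing keys if your rows are not uniform.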