I have the following HTML code:
...
<tr >...</tr>
...
<tr >
<td >JJJ</td>
<td >18</td>
<td >
<span >20%</span>
<span >-15%</span>
</td>
<td >02/06/2022</td>
</tr>
...
<tr >...</tr>
...
I need to handle the two lines <span >20%</span> and <span >-15%</span>. Both appear in the HTML above, but the live page only ever contains one of them: if the value is negative, the HTML contains only <span >-15%</span>; if the value is positive, it contains only <span >20%</span>.
I wrote the parsing code below. How can I solve this with if/else and a check on the span class name, or is there some other way to fix it?
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'html.parser')
item = soup.find_all('tr', class_='main')
data = []
for i in item:
    data.append({
        'percent': i.find('td', class_='int').find_next('td', class_='int').find_next('td', class_='int').get_text()
    })
print(data)
CodePudding user response:
In my opinion there is no need to check the class, because the position of the value is always the same. Simply extract the values from the <td> elements and store them in your list of dicts:
data.append(dict(zip(['Name','Amount','Percentage','Date'],row.stripped_strings)))
But to answer your question: use a comma (,) to join multiple selectors into a selector list. An element matches a selector list if it matches any selector in the list, and because each row contains only one of the two classes, select_one picks the right one:
data.append({'percentage':row.select_one('.plus,.minus').text})
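A minimal sketch of the selector-list approach follows. The class attributes are not visible in the markup shown in the question, so the class names plus and minus are assumed from the selector above, and the third row (no span at all) is a made-up case added only to show the None guard.

```python
from bs4 import BeautifulSoup

# Assumed markup: the class names "plus" and "minus" come from the selector
# used in the answer; the real page may use different names.
html = '''
<table>
<tr><td>AAA</td><td>18</td><td><span class="minus">-15%</span></td><td>02/06/2022</td></tr>
<tr><td>BBB</td><td>18</td><td><span class="plus">20%</span></td><td>02/06/2022</td></tr>
<tr><td>CCC</td><td>18</td><td></td><td>02/06/2022</td></tr>
</table>
'''

soup = BeautifulSoup(html, 'html.parser')
data = []
for row in soup.select('tr'):
    # Matches the first element carrying either class; None if neither exists.
    span = row.select_one('.plus,.minus')
    data.append({'percentage': span.get_text(strip=True) if span else None})

print(data)
```

Guarding against None avoids an AttributeError on rows that carry neither span, which a bare .text call would raise.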
Example
from bs4 import BeautifulSoup
html='''
<tr >
<td >AAA</td>
<td >18</td>
<td >
<span >-15%</span>
</td>
<td >02/06/2022</td>
</tr>
<tr >
<td >BBB</td>
<td >18</td>
<td >
<span >20%</span>
</td>
<td >02/06/2022</td>
</tr>
'''
soup = BeautifulSoup(html, 'html.parser')
data = []
for row in soup.select('tr'):
    data.append(dict(zip(['Name','Amount','Percentage','Date'],row.stripped_strings)))
print(data)
Output
[{'Name': 'AAA', 'Amount': '18', 'Percentage': '-15%', 'Date': '02/06/2022'}, {'Name': 'BBB', 'Amount': '18', 'Percentage': '20%', 'Date': '02/06/2022'}]
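If an explicit if/else on the span class is still wanted, as the question asks, it could look roughly like the sketch below. The class names plus and minus are again assumptions, and parse_percent is a hypothetical helper name, not part of BeautifulSoup. It also converts the text to a number, so the sign is available even when the class is missing.

```python
from bs4 import BeautifulSoup

# Assumed markup: "plus" and "minus" are guessed class names.
html = '''
<table>
<tr><td>AAA</td><td>18</td><td><span class="minus">-15%</span></td><td>02/06/2022</td></tr>
<tr><td>BBB</td><td>18</td><td><span class="plus">20%</span></td><td>02/06/2022</td></tr>
</table>
'''

def parse_percent(row):
    """Hypothetical helper: return the numeric value and its direction for one row."""
    span = row.find('span')
    if span is None:
        return None
    classes = span.get('class', [])  # list of class names, empty if absent
    value = float(span.get_text(strip=True).rstrip('%'))
    if 'minus' in classes:
        direction = 'negative'
    elif 'plus' in classes:
        direction = 'positive'
    else:
        # Fall back to the sign of the number when no known class is present.
        direction = 'negative' if value < 0 else 'positive'
    return {'value': value, 'direction': direction}

soup = BeautifulSoup(html, 'html.parser')
results = [parse_percent(row) for row in soup.select('tr')]
print(results)
```

Since the sign is already part of the text, the class check is mostly redundant here; it only matters if the page ever shows a value without its minus sign.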