I'm trying to write a Python script that gets all of a student's grades using requests and bs4. Right now I have a problem looping over the values:
for rows in tr:
    td = tbody.find_all('td')
    subject.append(td[0].get_text())
    fq.append(td[1].get_text())
    sq.append(td[2].get_text())
    ave.append(td[3].get_text())
for i in subject:
    print(f"Subject: {i}")
for i in fq:
    print(f"First Quarter: {i}")
for i in sq:
    print(f"Second Quarter: {i}")
for i in ave:
    print(f"Average: {i}")
# My goal: there are 4 lists and they are linked by position, so the first values of subject, fq, sq and ave belong together, e.g. Gen Math (subject), 90 (fq), 90 (sq) and 90 (ave).
Output:
Subject: GENERAL MATHEMATICS
Subject: GENERAL MATHEMATICS
Subject: GENERAL MATHEMATICS
Subject: GENERAL MATHEMATICS
Subject: GENERAL MATHEMATICS
Subject: GENERAL MATHEMATICS
Subject: GENERAL MATHEMATICS
Subject: GENERAL MATHEMATICS
First Quarter: ##.00
First Quarter: ##.00
First Quarter: ##.00
First Quarter: ##.00
First Quarter: ##.00
First Quarter: ##.00
First Quarter: ##.00
First Quarter: ##.00
Second Quarter: ##.00
Second Quarter: ##.00
Second Quarter: ##.00
Second Quarter: ##.00
Second Quarter: ##.00
Second Quarter: ##.00
Second Quarter: ##.00
Average: ##.00
Average: ##.00
Average: ##.00
Average: ##.00
Average: ##.00
Average: ##.00
Average: ##.00
Average: ##.00
Expected Output:
Subject: Gen Math
Subject: Stats
...
First Quarter: 90.00
First Quarter: 90.00
...
Second Quarter: 90.00
Second Quarter: 90.00
...
Average: 90.00
Average: 90.00
...
I'm new to Python, so loops are my weakness. Also, the code seems wrong, since I need the subject, the first-quarter grade, the second-quarter grade and the average for each row. Thanks! This is the HTML code of the table:
<table cellspacing="0" id="tblss1" width="100%">
<thead>
<tr >
<th style="text-align:center">SUBJECT</th>
<th style="text-align:center">1ST</th>
<th style="text-align:center">2ND</th>
<th style="text-align:center">AVE</th>
</tr>
</thead>
<tbody>
<tr>
<td style="color:purple"> GENERAL MATHEMATICS </td>
<td align="center"> <strong> ##.00 </strong></td>
<td align="center"> <strong> ##.00 </strong></td>
<td align="center" style="color:blueviolet"> <strong> ##.00 </strong></td>
</tr>
<tr>
<td style="color:purple"> EARTH SCIENCE </td>
<td align="center"> <strong> ##.00 </strong></td>
<td align="center"> <strong> ##.00 </strong></td>
<td align="center" style="color:blueviolet"> <strong> ##.00 </strong></td>
</tr>
<tr>
<td style="color:purple"> PHYSICAL EDUCATION AND HEALTH </td>
<td align="center"> <strong> ##.00 </strong></td>
<td align="center"> <strong> ##.00 </strong></td>
<td align="center" style="color:blueviolet"> <strong> ##.50 </strong></td>
</tr>
<tr>
<td style="color:purple"> GENERAL CHEMISTRY 1 </td>
<td align="center"> <strong> ##.00 </strong></td>
<td align="center"> <strong> ##.00 </strong></td>
<td align="center" style="color:blueviolet"> <strong> ##.00 </strong></td>
</tr>
<tr>
<td style="color:purple"> 21ST CENTURY LITERATURE </td>
<td align="center"> <strong> ##.00 </strong></td>
<td align="center"> <strong> ##.00 </strong></td>
<td align="center" style="color:blueviolet"> <strong> ##.00 </strong></td>
</tr>
<tr>
<td style="color:purple"> READING AND WRITING </td>
<td align="center"> <strong> ##.00 </strong></td>
<td align="center"> <strong> ##.00 </strong></td>
<td align="center" style="color:blueviolet"> <strong> ##.00 </strong></td>
</tr>
<tr>
<td style="color:purple"> GENERAL BIOLOGY 1 </td>
<td align="center"> <strong> ##.00 </strong></td>
<td align="center"> <strong> ##.00 </strong></td>
<td align="center" style="color:blueviolet"> <strong> ##.00 </strong></td>
</tr>
<tr>
<td style="color:purple"> ENTREPRENEURSHIP </td>
<td align="center"> <strong> ##.00 </strong></td>
<td align="center"> <strong> ##.00 </strong></td>
<td align="center" style="color:blueviolet"> <strong> ##.50 </strong></td>
</tr>
</tbody>
</table>
CodePudding user response:
Based on my understanding, here is what you're trying to achieve. Note that the cells have to be looked up on each row (rows.find_all) rather than on tbody; otherwise every iteration reads the first row again.
subjects = []
fq = []
sq = []
avgs = []
for rows in tr:
    # search inside the current row only, not the whole tbody
    td = rows.find_all('td')
    subjects.append(td[0].get_text(strip=True))
    fq.append(td[1].get_text(strip=True))
    sq.append(td[2].get_text(strip=True))
    avgs.append(td[3].get_text(strip=True))
for subject in subjects:
    print(subject)
for f in fq:
    print(f)
for s in sq:
    print(s)
for a in avgs:
    print(a)
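Here is a minimal, self-contained sketch of the same idea, assuming the page has already been fetched into a string. The two-row html sample and its grades are made up for illustration (the real page would come from requests), and the grades list of dicts is just one possible way to keep each row's values together:

```python
from bs4 import BeautifulSoup

# Hypothetical sample standing in for the real page; the grades are invented.
html = """
<table id="tblss1">
<tbody>
<tr><td> GENERAL MATHEMATICS </td><td> <strong> 90.00 </strong></td><td> <strong> 88.00 </strong></td><td> <strong> 89.00 </strong></td></tr>
<tr><td> EARTH SCIENCE </td><td> <strong> 85.00 </strong></td><td> <strong> 87.00 </strong></td><td> <strong> 86.00 </strong></td></tr>
</tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
tbody = soup.find("table", id="tblss1").find("tbody")

grades = []
for row in tbody.find_all("tr"):
    # Search inside the current row, not the whole tbody.
    cells = [td.get_text(strip=True) for td in row.find_all("td")]
    if len(cells) < 4:
        continue  # skip rows that do not have all four columns
    grades.append({"subject": cells[0], "fq": cells[1], "sq": cells[2], "ave": cells[3]})

# Print one block per subject so the linked values stay together.
for g in grades:
    print(f"Subject: {g['subject']}")
    print(f"First Quarter: {g['fq']}")
    print(f"Second Quarter: {g['sq']}")
    print(f"Average: {g['ave']}")
```

Keeping one dict per row avoids having to keep four separate lists in sync.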
CodePudding user response:
You use i as the loop variable in more than one loop.
If those loops are nested, the inner loop rebinds i, so the value the outer loop assigned is overwritten until the outer loop's next iteration. The iteration itself is not affected, because each for loop keeps its own iterator.
Try giving the inner loop variable a different name so it does not override i from the outer loop.
If this does not solve your issue, please describe in more detail what you are trying to achieve or what behavior you see.
Post edit: As written, you will get the same values for every entry, because tbody.find_all('td') always starts from the first row. You need a loop that does the following steps:
- find all tr blocks and iterate over them
    for tr_block in tbody.find_all('tr'):
- in each tr_block, append the corresponding td blocks to their lists
        td = tr_block.find_all('td')
        subject.append(td[0].get_text()) #[...]
- after that you should have lists filled with all the data from the HTML, which you can then zip together into tuples if needed.
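The last step could look roughly like this. The list contents are made-up placeholder values, not real grades; zip() pairs the items by position and yields one tuple per row:

```python
# Placeholder data standing in for the lists filled by the loop above.
subject = ["GENERAL MATHEMATICS", "EARTH SCIENCE"]
fq = ["90.00", "85.00"]
sq = ["88.00", "87.00"]
ave = ["89.00", "86.00"]

# zip() yields one tuple per row: (subject, first quarter, second quarter, average).
rows = list(zip(subject, fq, sq, ave))

for name, first, second, average in rows:
    print(f"{name}: 1st={first}, 2nd={second}, ave={average}")
```

zip() stops at the shortest list, so a row with a missing cell would silently drop trailing entries; checking that all four lists have the same length first is a cheap safeguard.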
CodePudding user response:
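In cases like this, it's simpler and faster to read the table into a dataframe:
import pandas as pd
from io import StringIO
table = """[your html above]"""
# read_html returns a list of DataFrames, one per table found
print(pd.read_html(StringIO(table))[0])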
Output:
SUBJECT 1ST 2ND AVE
0 GENERAL MATHEMATICS ##.00 ##.00 ##.00
1 EARTH SCIENCE ##.00 ##.00 ##.00
2 PHYSICAL EDUCATION AND HEALTH ##.00 ##.00 ##.50
3 GENERAL CHEMISTRY 1 ##.00 ##.00 ##.00
etc.