<tbody><tr><td></td><td></td><td></td><td></td><td></td><td></td><td><a href="javascript:fnStep2('20221001','210','7','0','Y')" class='day-wrap'><span class='day'>01</span><span class='status stand-by'>대기신청가능</span></a></td></tr><tr><td><a href="javascript:fnStep2('20221002','210','1','0','Y')" class='day-wrap'><span class='day'>02</span><span class='status stand-by'>대기신청가능</span></a></td><td><div class='day-wrap '><span class='day'>03</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>04</span><span class='status stand-by'>확정예약</span></div></td><td><a href="javascript:fnStep2('20221005','110','4','0','Y')" class='day-wrap'><span class='day'>05</span><span class='status stand-by'>대기신청가능</span></a></td><td><a href="javascript:fnStep2('20221006','110','5','0','Y')" class='day-wrap'><span class='day'>06</span><span class='status stand-by'>대기신청가능</span></a></td><td><div class='day-wrap '><span class='day'>07</span><span class='status'>미오픈</span></div></td><td><a href="javascript:fnStep2('20221008','210','7','0','Y')" class='day-wrap'><span class='day'>08</span><span class='status stand-by'>대기신청가능</span></a></td></tr><tr><td><a href="javascript:fnStep2('20221009','210','1','50','Y')" class='day-wrap'><span class='day'>09</span><span class='status stand-by'>대기신청가능</span></a></td><td><div class='day-wrap '><span class='day'>10</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>11</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>12</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>13</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>14</span><span class='status'>미오픈</span></div></td><td><a href="javascript:fnStep2('20221015','210','7','0','Y')" class='day-wrap'><span class='day'>15</span><span class='status stand-by'>대기신청가능</span></a></td></tr><tr><td><a 
href="javascript:fnStep2('20221016','210','1','0','Y')" class='day-wrap'><span class='day'>16</span><span class='status stand-by'>대기신청가능</span></a></td><td><div class='day-wrap '><span class='day'>17</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>18</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>19</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>20</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>21</span><span class='status'>미오픈</span></div></td><td><a href="javascript:fnStep2('20221022','210','7','0','Y')" class='day-wrap'><span class='day'>22</span><span class='status stand-by'>대기신청가능</span></a></td></tr><tr><td><a href="javascript:fnStep2('20221023','210','1','0','Y')" class='day-wrap'><span class='day'>23</span><span class='status stand-by'>대기신청가능</span></a></td><td><div class='day-wrap '><span class='day'>24</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>25</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>26</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>27</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>28</span><span class='status'>미오픈</span></div></td><td><a href="javascript:fnStep2('20221029','210','7','44','Y')" class='day-wrap'><span class='day'>29</span><span class='status stand-by'>대기신청가능</span></a></td></tr><tr><td><a href="javascript:fnStep2('20221030','210','1','44','Y')" class='day-wrap'><span class='day'>30</span><span class='status stand-by'>대기신청가능</span></a></td><td><div class='day-wrap '><span class='day'>31</span><span class='status'>미오픈</span></div></td><td></td><td></td><td></td><td></td><td></td></tr></tbody>
I am trying to turn the HTML above into a dictionary like this:
{'01': ['20221001','210','7','0','Y'], '02': ['20221002','210','1','0','Y'], '03': [], ...}
That is, I need to extract the arguments of each JavaScript function call in the href attributes into a list, keyed by day.
How can I do this?
CodePudding user response:
Try:
from ast import literal_eval
from bs4 import BeautifulSoup
html_doc = """ <tbody><tr><td></td><td></td><td></td><td></td><td></td><td></td><td><a href="javascript:fnStep2('20221001','210','7','0','Y')" class='day-wrap'><span class='day'>01</span><span class='status stand-by'>대기신청가능</span></a></td></tr><tr><td><a href="javascript:fnStep2('20221002','210','1','0','Y')" class='day-wrap'><span class='day'>02</span><span class='status stand-by'>대기신청가능</span></a></td><td><div class='day-wrap '><span class='day'>03</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>04</span><span class='status stand-by'>확정예약</span></div></td><td><a href="javascript:fnStep2('20221005','110','4','0','Y')" class='day-wrap'><span class='day'>05</span><span class='status stand-by'>대기신청가능</span></a></td><td><a href="javascript:fnStep2('20221006','110','5','0','Y')" class='day-wrap'><span class='day'>06</span><span class='status stand-by'>대기신청가능</span></a></td><td><div class='day-wrap '><span class='day'>07</span><span class='status'>미오픈</span></div></td><td><a href="javascript:fnStep2('20221008','210','7','0','Y')" class='day-wrap'><span class='day'>08</span><span class='status stand-by'>대기신청가능</span></a></td></tr><tr><td><a href="javascript:fnStep2('20221009','210','1','50','Y')" class='day-wrap'><span class='day'>09</span><span class='status stand-by'>대기신청가능</span></a></td><td><div class='day-wrap '><span class='day'>10</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>11</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>12</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>13</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>14</span><span class='status'>미오픈</span></div></td><td><a href="javascript:fnStep2('20221015','210','7','0','Y')" class='day-wrap'><span class='day'>15</span><span class='status 
stand-by'>대기신청가능</span></a></td></tr><tr><td><a href="javascript:fnStep2('20221016','210','1','0','Y')" class='day-wrap'><span class='day'>16</span><span class='status stand-by'>대기신청가능</span></a></td><td><div class='day-wrap '><span class='day'>17</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>18</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>19</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>20</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>21</span><span class='status'>미오픈</span></div></td><td><a href="javascript:fnStep2('20221022','210','7','0','Y')" class='day-wrap'><span class='day'>22</span><span class='status stand-by'>대기신청가능</span></a></td></tr><tr><td><a href="javascript:fnStep2('20221023','210','1','0','Y')" class='day-wrap'><span class='day'>23</span><span class='status stand-by'>대기신청가능</span></a></td><td><div class='day-wrap '><span class='day'>24</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>25</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>26</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>27</span><span class='status'>미오픈</span></div></td><td><div class='day-wrap '><span class='day'>28</span><span class='status'>미오픈</span></div></td><td><a href="javascript:fnStep2('20221029','210','7','44','Y')" class='day-wrap'><span class='day'>29</span><span class='status stand-by'>대기신청가능</span></a></td></tr><tr><td><a href="javascript:fnStep2('20221030','210','1','44','Y')" class='day-wrap'><span class='day'>30</span><span class='status stand-by'>대기신청가능</span></a></td><td><div class='day-wrap '><span class='day'>31</span><span class='status'>미오픈</span></div></td><td></td><td></td><td></td><td></td><td></td></tr></tbody>"""
soup = BeautifulSoup(html_doc, "html.parser")
out = {}
for i, a in enumerate(soup.select("a[href^='javascript:fnStep2']"), 1):
    t = list(literal_eval(a["href"].replace("javascript:fnStep2", "")))
    out["{:>02}".format(i)] = t
print(out)
Prints:
{
"01": ["20221001", "210", "7", "0", "Y"],
"02": ["20221002", "210", "1", "0", "Y"],
"03": ["20221005", "110", "4", "0", "Y"],
"04": ["20221006", "110", "5", "0", "Y"],
"05": ["20221008", "210", "7", "0", "Y"],
"06": ["20221009", "210", "1", "50", "Y"],
"07": ["20221015", "210", "7", "0", "Y"],
"08": ["20221016", "210", "1", "0", "Y"],
"09": ["20221022", "210", "7", "0", "Y"],
"10": ["20221023", "210", "1", "0", "Y"],
"11": ["20221029", "210", "7", "44", "Y"],
"12": ["20221030", "210", "1", "44", "Y"],
}
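Note that the loop above numbers the links by position, so key "03" holds the data for 5 October rather than the empty list the question asks for. Below is a minimal sketch of one way to key the result by the day shown in each cell instead. It swaps BeautifulSoup for the standard library's html.parser so that it runs without extra installs; the class name DayCollector and the shortened sample string are assumptions made for illustration, not part of the original answer.

```python
from ast import literal_eval
from html.parser import HTMLParser

PREFIX = "javascript:fnStep2"


class DayCollector(HTMLParser):
    """Hypothetical helper: collect {day: fnStep2 arguments} from the calendar."""

    def __init__(self):
        super().__init__()
        self.out = {}
        self._args = []       # arguments of the link in the current cell, if any
        self._in_day = False  # True while inside <span class='day'>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "td":
            self._args = []   # every cell starts without a link
        elif tag == "a" and (attrs.get("href") or "").startswith(PREFIX):
            # "('20221002','210','1','0','Y')" parses as a tuple of strings
            self._args = list(literal_eval(attrs["href"][len(PREFIX):]))
        elif tag == "span" and attrs.get("class") == "day":
            self._in_day = True

    def handle_data(self, data):
        if self._in_day:
            self.out[data.strip()] = self._args
            self._in_day = False

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_day = False


# Shortened sample: one cell with a link, one without, one empty.
sample = (
    "<tr>"
    "<td><a href=\"javascript:fnStep2('20221002','210','1','0','Y')\" class='day-wrap'>"
    "<span class='day'>02</span><span class='status stand-by'>대기신청가능</span></a></td>"
    "<td><div class='day-wrap '><span class='day'>03</span>"
    "<span class='status'>미오픈</span></div></td>"
    "<td></td>"
    "</tr>"
)

parser = DayCollector()
parser.feed(sample)
parser.close()
print(parser.out)
# {'02': ['20221002', '210', '1', '0', 'Y'], '03': []}
```

Feeding the full table from the question should work the same way, since every day cell carries a span with class 'day' whether or not it is wrapped in a link.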
CodePudding user response:
# Inside a Scrapy callback, collect every href from the <a> tags, then map that data into the form you need.
result_response = response.css("a::attr(href)").getall()
response_array = []
for data in result_response:
    # getall() already returns plain strings, so append each href directly
    response_array.append(data)
return response_array
CodePudding user response:
from pprint import pp
from bs4 import BeautifulSoup
html = """<tbody>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td><a href="javascript:fnStep2('20221001','210','7','0','Y')" class='day-wrap'><span class='day'>01</span><span class='status stand-by'>대기신청가능</span></a></td>
</tr>
<tr>
<td><a href="javascript:fnStep2('20221002','210','1','0','Y')" class='day-wrap'><span class='day'>02</span><span class='status stand-by'>대기신청가능</span></a></td>
<td>
<div class='day-wrap '><span class='day'>03</span><span class='status'>미오픈</span></div>
</td>
<td>
<div class='day-wrap '><span class='day'>04</span><span class='status stand-by'>확정예약</span></div>
</td>
<td><a href="javascript:fnStep2('20221005','110','4','0','Y')" class='day-wrap'><span class='day'>05</span><span class='status stand-by'>대기신청가능</span></a></td>
<td><a href="javascript:fnStep2('20221006','110','5','0','Y')" class='day-wrap'><span class='day'>06</span><span class='status stand-by'>대기신청가능</span></a></td>
<td>
<div class='day-wrap '><span class='day'>07</span><span class='status'>미오픈</span></div>
</td>
<td><a href="javascript:fnStep2('20221008','210','7','0','Y')" class='day-wrap'><span class='day'>08</span><span class='status stand-by'>대기신청가능</span></a></td>
</tr>
<tr>
<td><a href="javascript:fnStep2('20221009','210','1','50','Y')" class='day-wrap'><span class='day'>09</span><span class='status stand-by'>대기신청가능</span></a></td>
<td>
<div class='day-wrap '><span class='day'>10</span><span class='status'>미오픈</span></div>
</td>
<td>
<div class='day-wrap '><span class='day'>11</span><span class='status'>미오픈</span></div>
</td>
<td>
<div class='day-wrap '><span class='day'>12</span><span class='status'>미오픈</span></div>
</td>
<td>
<div class='day-wrap '><span class='day'>13</span><span class='status'>미오픈</span></div>
</td>
<td>
<div class='day-wrap '><span class='day'>14</span><span class='status'>미오픈</span></div>
</td>
<td><a href="javascript:fnStep2('20221015','210','7','0','Y')" class='day-wrap'><span class='day'>15</span><span class='status stand-by'>대기신청가능</span></a></td>
</tr>
<tr>
<td><a href="javascript:fnStep2('20221016','210','1','0','Y')" class='day-wrap'><span class='day'>16</span><span class='status stand-by'>대기신청가능</span></a></td>
<td>
<div class='day-wrap '><span class='day'>17</span><span class='status'>미오픈</span></div>
</td>
<td>
<div class='day-wrap '><span class='day'>18</span><span class='status'>미오픈</span></div>
</td>
<td>
<div class='day-wrap '><span class='day'>19</span><span class='status'>미오픈</span></div>
</td>
<td>
<div class='day-wrap '><span class='day'>20</span><span class='status'>미오픈</span></div>
</td>
<td>
<div class='day-wrap '><span class='day'>21</span><span class='status'>미오픈</span></div>
</td>
<td><a href="javascript:fnStep2('20221022','210','7','0','Y')" class='day-wrap'><span class='day'>22</span><span class='status stand-by'>대기신청가능</span></a></td>
</tr>
<tr>
<td><a href="javascript:fnStep2('20221023','210','1','0','Y')" class='day-wrap'><span class='day'>23</span><span class='status stand-by'>대기신청가능</span></a></td>
<td>
<div class='day-wrap '><span class='day'>24</span><span class='status'>미오픈</span></div>
</td>
<td>
<div class='day-wrap '><span class='day'>25</span><span class='status'>미오픈</span></div>
</td>
<td>
<div class='day-wrap '><span class='day'>26</span><span class='status'>미오픈</span></div>
</td>
<td>
<div class='day-wrap '><span class='day'>27</span><span class='status'>미오픈</span></div>
</td>
<td>
<div class='day-wrap '><span class='day'>28</span><span class='status'>미오픈</span></div>
</td>
<td><a href="javascript:fnStep2('20221029','210','7','44','Y')" class='day-wrap'><span class='day'>29</span><span class='status stand-by'>대기신청가능</span></a></td>
</tr>
<tr>
<td><a href="javascript:fnStep2('20221030','210','1','44','Y')" class='day-wrap'><span class='day'>30</span><span class='status stand-by'>대기신청가능</span></a></td>
<td>
<div class='day-wrap '><span class='day'>31</span><span class='status'>미오픈</span></div>
</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>"""
soup = BeautifulSoup(html, 'lxml')
goal = [x['href'].split("'")[1::2] for x in soup.select('a.day-wrap')]
pp(goal)
Output:
[['20221001', '210', '7', '0', 'Y'],
['20221002', '210', '1', '0', 'Y'],
['20221005', '110', '4', '0', 'Y'],
['20221006', '110', '5', '0', 'Y'],
['20221008', '210', '7', '0', 'Y'],
['20221009', '210', '1', '50', 'Y'],
['20221015', '210', '7', '0', 'Y'],
['20221016', '210', '1', '0', 'Y'],
['20221022', '210', '7', '0', 'Y'],
['20221023', '210', '1', '0', 'Y'],
['20221029', '210', '7', '44', 'Y'],
['20221030', '210', '1', '44', 'Y']]
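The split("'")[1::2] step may look cryptic, so here is a small sketch of what it does on a single href, followed by one possible way to turn the resulting lists into the day-keyed dictionary from the question. It assumes that the first argument is always a YYYYMMDD date whose last two characters match the day in the cell; the names rows, by_day and full are made up for illustration.

```python
href = "javascript:fnStep2('20221009','210','1','50','Y')"

# Splitting on the quote character alternates between text outside the
# quotes (even indexes) and the quoted arguments (odd indexes).
parts = href.split("'")
args = parts[1::2]
print(args)  # ['20221009', '210', '1', '50', 'Y']

# Assumption: the first argument is a YYYYMMDD date, so its last two
# characters are the day shown in the calendar cell.
rows = [args, ["20221015", "210", "7", "0", "Y"]]
by_day = {row[0][-2:]: row for row in rows}

# Days without a link get an empty list (October 2022 has 31 days).
full = {f"{d:02}": by_day.get(f"{d:02}", []) for d in range(1, 32)}
print(full["09"], full["10"])  # ['20221009', '210', '1', '50', 'Y'] []
```

With the real page, rows would be the goal list built above. This slicing relies on the arguments never containing a quote character themselves.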