When I need the source code of a website, I usually use urllib.request.urlopen(url). However, when I try this on a website with a table (which I believe is generated after JavaScript code runs), it does not return the same content I see when I press Inspect on the page.
How can I get what I see in Inspect after the page is loaded?
I have tried using the json package to load the page with json.load(urllib.request.urlopen(url)), but I get an error:
I also tried Selenium WebDriver, but calling webdriver.Chrome() raises an error:
WebDriverException: Message: Service chromedriver unexpectedly exited. Status code was: 127
How can I fix either of these approaches? Or is there a better solution? Any help would be appreciated.
I'm using Replit, if that matters.
CodePudding user response:
From what I can see in the network logs, the site seems to be using an API to load the contents of the table. If you copy the API URL and break down the query:
from urllib.parse import urlsplit, parse_qsl
# double quotes outside, because the filter parameter itself contains single quotes
apiUrl = "https://datacenter-web.eastmoney.com/api/data/v1/get?callback=jQuery112309834864079678796_1673633495739&sortColumns=SECURITY_CODE,TRADE_DATE&sortTypes=1,-1&pageSize=50&pageNumber=1&reportName=RPT_DAILYBILLBOARD_DETAILSNEW&columns=SECURITY_CODE,SECUCODE,SECURITY_NAME_ABBR,TRADE_DATE,EXPLAIN,CLOSE_PRICE,CHANGE_RATE,BILLBOARD_NET_AMT,BILLBOARD_BUY_AMT,BILLBOARD_SELL_AMT,BILLBOARD_DEAL_AMT,ACCUM_AMOUNT,DEAL_NET_RATIO,DEAL_AMOUNT_RATIO,TURNOVERRATE,FREE_MARKET_CAP,EXPLANATION,D1_CLOSE_ADJCHRATE,D2_CLOSE_ADJCHRATE,D5_CLOSE_ADJCHRATE,D10_CLOSE_ADJCHRATE,SECURITY_TYPE_CODE&source=WEB&client=WEB&filter=(TRADE_DATE<='2023-01-13')(TRADE_DATE>='2023-01-13')"
dict(parse_qsl(urlsplit(apiUrl).query))
you can see the parameters
{'callback': 'jQuery112309834864079678796_1673633495739',
'sortColumns': 'SECURITY_CODE,TRADE_DATE',
'sortTypes': '1,-1',
'pageSize': '50',
'pageNumber': '1',
'reportName': 'RPT_DAILYBILLBOARD_DETAILSNEW',
'columns': 'SECURITY_CODE,SECUCODE,SECURITY_NAME_ABBR,TRADE_DATE,EXPLAIN,CLOSE_PRICE,CHANGE_RATE,BILLBOARD_NET_AMT,BILLBOARD_BUY_AMT,BILLBOARD_SELL_AMT,BILLBOARD_DEAL_AMT,ACCUM_AMOUNT,DEAL_NET_RATIO,DEAL_AMOUNT_RATIO,TURNOVERRATE,FREE_MARKET_CAP,EXPLANATION,D1_CLOSE_ADJCHRATE,D2_CLOSE_ADJCHRATE,D5_CLOSE_ADJCHRATE,D10_CLOSE_ADJCHRATE,SECURITY_TYPE_CODE',
'source': 'WEB',
'client': 'WEB',
'filter': "(TRADE_DATE<='2023-01-13')(TRADE_DATE>='2023-01-13')"}
(Btw, from looking at the request's initiator, the tradedetail script seems to be mostly responsible for generating the parameters.)
You can reform the link
start, end, pgSize, pgNum = '2023-01-13', '2023-01-13', 50, 1
# build the query string piece by piece (note += so earlier parts are kept)
auQstr = 'sortColumns=SECURITY_CODE,TRADE_DATE&sortTypes=1,-1'
auQstr += f'&pageSize={pgSize}&pageNumber={pgNum}' ## max pageSize seems to be 500
auQstr += '&reportName=RPT_DAILYBILLBOARD_DETAILSNEW&columns=SECURITY_CODE,SECUCODE,SECURITY_NAME_ABBR,TRADE_DATE,EXPLAIN,CLOSE_PRICE,CHANGE_RATE,BILLBOARD_NET_AMT,BILLBOARD_BUY_AMT,BILLBOARD_SELL_AMT,BILLBOARD_DEAL_AMT,ACCUM_AMOUNT,DEAL_NET_RATIO,DEAL_AMOUNT_RATIO,TURNOVERRATE,FREE_MARKET_CAP,EXPLANATION,D1_CLOSE_ADJCHRATE,D2_CLOSE_ADJCHRATE,D5_CLOSE_ADJCHRATE,D10_CLOSE_ADJCHRATE,SECURITY_TYPE_CODE&source=WEB&client=WEB'
# double-quoted f-string, because the filter values are wrapped in single quotes
auQstr += f"&filter=(TRADE_DATE<='{end}')(TRADE_DATE>='{start}')"
apiUrl = f'https://datacenter-web.eastmoney.com/api/data/v1/get?{auQstr}'
and retrieve table data with
import json
import requests
import pandas
apiReq = requests.get(apiUrl)
# print(apiReq.status_code, apiReq.reason, 'from', apiReq.url)
# apiReq.raise_for_status()
jDict = {}
try:
    # strip a possible jQuery...( ... ); callback wrapper before parsing
    jDict = json.loads(apiReq.text.strip().strip('jQuery_(0123456789);'))
except json.JSONDecodeError:
    print('failed to extract JSON from', apiReq.text)
# fall back to an empty dict if 'result' is missing or not a dict
rDict = jDict.get('result') if hasattr(jDict.get('result', {}), 'get') else {}
# for k, v in jDict.items(): print(f'{k}:', type(v) if hasattr(v, 'pop') else v)
# for k, v in rDict.items(): print(f'result_{k}:', type(v) if hasattr(v, 'pop') else v)
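As an alternative to concatenating the query string by hand, here is a minimal sketch that builds the same URL from a dict with urllib.parse.urlencode. The names API_BASE, COLUMNS and build_api_url are made up for this example, and it assumes the server accepts percent-encoded parameter values (urlencode escapes the commas, quotes and angle brackets), which I have not verified against this particular API.

```python
from urllib.parse import urlencode, urlsplit, parse_qsl

# Endpoint and parameter values copied from the request shown above
API_BASE = 'https://datacenter-web.eastmoney.com/api/data/v1/get'
COLUMNS = (
    'SECURITY_CODE,SECUCODE,SECURITY_NAME_ABBR,TRADE_DATE,EXPLAIN,CLOSE_PRICE,'
    'CHANGE_RATE,BILLBOARD_NET_AMT,BILLBOARD_BUY_AMT,BILLBOARD_SELL_AMT,'
    'BILLBOARD_DEAL_AMT,ACCUM_AMOUNT,DEAL_NET_RATIO,DEAL_AMOUNT_RATIO,'
    'TURNOVERRATE,FREE_MARKET_CAP,EXPLANATION,D1_CLOSE_ADJCHRATE,'
    'D2_CLOSE_ADJCHRATE,D5_CLOSE_ADJCHRATE,D10_CLOSE_ADJCHRATE,SECURITY_TYPE_CODE'
)

def build_api_url(start, end, page_size=50, page_number=1):
    """Hypothetical helper: assemble the API URL for a date range (no callback)."""
    params = {
        'sortColumns': 'SECURITY_CODE,TRADE_DATE',
        'sortTypes': '1,-1',
        'pageSize': page_size,
        'pageNumber': page_number,
        'reportName': 'RPT_DAILYBILLBOARD_DETAILSNEW',
        'columns': COLUMNS,
        'source': 'WEB',
        'client': 'WEB',
        'filter': f"(TRADE_DATE<='{end}')(TRADE_DATE>='{start}')",
    }
    # urlencode percent-escapes each value and joins the pairs with '&'
    return f'{API_BASE}?{urlencode(params)}'

# Round-trip check: parsing the URL again gives back the original filter
url = build_api_url('2023-01-13', '2023-01-13', page_size=500)
print(dict(parse_qsl(urlsplit(url).query))['filter'])
```

Keeping the parameters in a dict makes it harder to drop a piece of the query by accident, and changing the dates or page number becomes a function argument instead of a string edit.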
[I couldn't quite figure out how to generate the callback parameter, but I don't see any difference in the data returned either way, except that with the callback the response is wrapped like {callback}({JSON}); (that's why I added the .strip('jQuery_(0123456789);') part).]
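If stripping a character set feels too loose, a regular expression can remove the wrapper more explicitly. This is only a sketch: unwrap_jsonp is a hypothetical helper name, and it assumes the wrapper always has the shape name(...) with an optional trailing semicolon. A response without a callback is parsed as plain JSON.

```python
import json
import re

# Matches identifier( ... ) with an optional trailing semicolon;
# the captured group is everything between the outermost parentheses
_JSONP_RE = re.compile(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', re.DOTALL)

def unwrap_jsonp(text):
    """Hypothetical helper: parse a JSONP or plain JSON response body."""
    match = _JSONP_RE.match(text)
    # No wrapper found: treat the whole text as JSON
    payload = match.group(1) if match else text
    return json.loads(payload)

# Works with and without the callback wrapper
wrapped = 'jQuery112309834864079678796_1673633495739({"result": {"data": []}});'
print(unwrap_jsonp(wrapped))
print(unwrap_jsonp('{"result": {"data": []}}'))
```

With this in place, jDict = unwrap_jsonp(apiReq.text) could replace the json.loads(...strip(...)) call, still inside a try/except for json.JSONDecodeError.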
Now pandas.DataFrame(rDict.get('data', [])) should return a DataFrame that looks like
| | SECURITY_CODE | SECUCODE | SECURITY_NAME_ABBR | TRADE_DATE | EXPLAIN | CLOSE_PRICE | CHANGE_RATE | BILLBOARD_NET_AMT | BILLBOARD_BUY_AMT | BILLBOARD_SELL_AMT | BILLBOARD_DEAL_AMT | ACCUM_AMOUNT | DEAL_NET_RATIO | DEAL_AMOUNT_RATIO | TURNOVERRATE | FREE_MARKET_CAP | EXPLANATION | D1_CLOSE_ADJCHRATE | D2_CLOSE_ADJCHRATE | D5_CLOSE_ADJCHRATE | D10_CLOSE_ADJCHRATE | SECURITY_TYPE_CODE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 000670 | 000670.SZ | 盈方微 | 2023-01-13 00:00:00 | 实力游资买入,成功率36.52% | 8.61 | 9.9617 | 107855670.84 | 151940017.84 | 44084347.0 | 196024364.84 | 728666824 | 14.801781457255 | 26.901782595773 | 14.2053 | 5250783634.32 | 日涨幅偏离值达到7%的前5只证券 | | | | | 058001001 |
| 1 | 000716 | 000716.SZ | 黑芝麻 | 2023-01-13 00:00:00 | 2家机构卖出,成功率35.42% | 9.66 | -9.972 | -83920955.05 | 126116524.26 | 210037479.31 | 336154003.57 | 1793050068 | -4.680346441391 | 18.747608311069 | 26.4683 | 6722651062.26 | 日跌幅偏离值达到7%的前5只证券 | | | | | 058001001 |
| 2 | 001298 | 001298.SZ | 好上好 | 2023-01-13 00:00:00 | 2家机构买入,成功率29.10% | 46.97 | 1.8872 | -11553872.19 | 45951665.33 | 57505537.52 | 103457202.85 | 617385681 | -1.871418878923 | 16.757305203844 | 54.3025 | 1127280000.0 | 日换手率达到20%的前5只证券 | | | | | 058001001 |
| 3 | 002043 | 002043.SZ | 兔宝宝 | 2023-01-13 00:00:00 | 2家机构卖出,成功率38.08% | 16.6 | -1.7751 | -72462039.78 | 212188958.86 | 284650998.64 | 496839957.5 | 3192972469 | -2.269422629964 | 15.560420965847 | 25.9004 | 11509797762.6 | 日振幅值达到15%的前5只证券 | | | | | 058001001 |
| 4 | 002137 | 002137.SZ | 实益达 | 2023-01-13 00:00:00 | 主力做T,成功率27.11% | 9.18 | 3.9638 | 211445.689999998 | 84152129.0 | 83940683.31 | 168092812.31 | 1478965892 | 0.014296860471 | 11.36556381856 | 40.9845 | 3585402921.06 | 日换手率达到20%的前5只证券 | | | | | 058001001 |
| 5 | 002186 | 002186.SZ | 全聚德 | 2023-01-13 00:00:00 | 西藏自治区资金卖出,成功率29.46% | 16.84 | -9.9947 | 3151181.58999997 | 188737402.17 | 185586220.58 | 374323622.75 | 2312764542 | 0.136251725274 | 16.185115948998 | 19.3857 | 5187624392.2 | 连续三个交易日内,跌幅偏离值累计达到20%的证券 | | | | | 058001001 |
| 6 | 002186 | 002186.SZ | 全聚德 | 2023-01-13 00:00:00 | 西藏自治区资金卖出,成功率42.54% | 16.84 | -9.9947 | 10443891.72 | 109080186.16 | 98636294.44 | 207716480.6 | 1031412946 | 1.012581019126 | 20.139022047916 | 19.3857 | 5187624392.2 | 日跌幅偏离值达到7%的前5只证券 | | | | | 058001001 |
| 7 | 002217 | 002217.SZ | 合力泰 | 2023-01-13 00:00:00 | 实力游资买入,成功率36.62% | 3.05 | 10.1083 | 31285483.89 | 93345048.93 | 62059565.04 | 155404613.97 | 703558435 | 4.446749883682 | 22.088373365888 | 5.6364 | 9483332520.55 | 连续三个交易日内,涨幅偏离值累计达到20%的证券 | | | | | 058001001 |
| 8 | 002235 | 002235.SZ | 安妮股份 | 2023-01-13 00:00:00 | 西藏自治区资金买入,成功率36.04% | 9.65 | 1.5789 | -49174252.39 | 182992053.05 | 232166305.44 | 415158358.49 | 2127085536 | -2.311813585197 | 19.51770868936 | 40.3385 | 5290486577.15 | 日换手率达到20%的前5只证券 | | | | | 058001001 |
| 9 | 002238 | 002238.SZ | 天威视讯 | 2023-01-13 00:00:00 | 买一主买,成功率39.94% | 9.79 | 10.0 | 13041066.65 | 67516644.74 | 54475578.09 | 121992222.83 | 620322941 | 2.102302814882 | 19.665921533281 | 2.7782 | 7857054176.4 | 连续三个交易日内,涨幅偏离值累计达到20%的证券 | | | | | 058001001 |
| 10 | 002400 | 002400.SZ | 省广集团 | 2023-01-13 00:00:00 | 主力做T,成功率41.83% | 4.86 | 9.9548 | 66233347.48 | 123818180.47 | 57584832.99 | 181403013.46 | 525010338 | 12.615627290753 | 34.552274561115 | 6.649 | 8199738933.66 | 日涨幅偏离值达到7%的前5只证券 | | | | | 058001001 |
| 11 | 002467 | 002467.SZ | 二六三 | 2023-01-13 00:00:00 | 买一主买,成功率38.07% | 6.39 | 9.9828 | 113801422.15 | 251184968.9 | 137383546.75 | 388568515.65 | 2283893105 | 4.982782333414 | 17.013428290463 | 27.1756 | 8680735917.36 | 日涨幅偏离值达到7%的前5只证券 | | | | | 058001001 |
| 12 | 002528 | 002528.SZ | 英飞拓 | 2023-01-13 00:00:00 | 浙江资金卖出,成功率34.64% | 11.46 | -9.9764 | -129544714.79 | 108721495.94 | 238266210.73 | 346987706.67 | 1542451167 | -8.398626650979 | 22.495863343595 | 12.7112 | 11992193816.46 | 日跌幅偏离值达到7%的前5只证券 | | | | | 058001001 |
| 13 | 002560 | 002560.SZ | 通达股份 | 2023-01-13 00:00:00 | 浙江资金买入,成功率19.33% | 8.81 | 9.9875 | 17471183.83 | 81321592.83 | 63850409.0 | 145172001.83 | 510096412 | 3.425074832716 | 28.45971828361 | 9.977 | 3965177506.62 | 连续三个交易日内,涨幅偏离值累计达到20%的证券 | | | | | 058001001 |
| 14 | 002576 | 002576.SZ | 通达动力 | 2023-01-13 00:00:00 | 2家机构买入,成功率21.82% | 22.5 | 5.2878 | -19311555.5 | 128003798.4 | 147315353.9 | 275319152.3 | 1422318096 | -1.35775221832 | 19.357073011606 | 37.9045 | 3642812347.5 | 日换手率达到20%的前5只证券 | | | | | 058001001 |
| 15 | 002762 | 002762.SZ | 金发拉比 | 2023-01-13 00:00:00 | 买一主买,成功率46.74% | 12.24 | 9.973 | 4698244.86 | 98721360.44 | 94023115.58 | 192744476.02 | 870220454 | 0.539891338845 | 22.148925037793 | 36.1724 | 2461689167.04 | 日涨幅偏离值达到7%的前5只证券 | | | | | 058001001 |
| 16 | 002820 | 002820.SZ | 桂发祥 | 2023-01-13 00:00:00 | 3家机构卖出,成功率28.18% | 13.32 | -10.0 | -29093337.0 | 42460011.96 | 71553348.96 | 114013360.92 | 647947778 | -4.490074352875 | 17.596072521758 | 23.1511 | 2668469898.96 | 日跌幅偏离值达到7%的前5只证券 | | | | | 058001001 |
| 17 | 002875 | 002875.SZ | 安奈儿 | 2023-01-13 00:00:00 | 主力做T,成功率9.32% | 24.78 | -9.9891 | -83358448.31 | 50606666.78 | 133965115.09 | 184571781.87 | 801943994 | -10.394547366608 | 23.015545131697 | 25.6788 | 3034961970.6 | 日跌幅偏离值达到7%的前5只证券 | | | | | 058001001 |
| 18 | 003027 | 003027.SZ | 同兴环保 | 2023-01-13 00:00:00 | 主力做T,成功率18.30% | 32.86 | 10.01 | 41802137.99 | 162093207.65 | 120291069.66 | 282384277.31 | 866481721 | 4.824353125621 | 32.589755844371 | 40.7896 | 2192047882.0 | 日涨幅偏离值达到7%的前5只证券 | | | | | 058001001 |
| 19 | 003027 | 003027.SZ | 同兴环保 | 2023-01-13 00:00:00 | 主力做T,成功率18.30% | 32.86 | 10.01 | 41802137.99 | 162093207.65 | 120291069.66 | 282384277.31 | 866481721 | 4.824353125621 | 32.589755844371 | 40.7896 | 2192047882.0 | 日换手率达到20%的前5只证券 | | | | | 058001001 |
| 20 | 300492 | 300492.SZ | 华图山鼎 | 2023-01-13 00:00:00 | 实力游资买入,成功率48.81% | 43.33 | 19.9945 | 1105052.6 | 23053326.1 | 21948273.5 | 45001599.6 | 114661901 | 0.963748717196 | 39.247212201723 | 1.9942 | 6087278745.1 | 日涨幅达到15%的前5只证券 | | | | | 058001001 |
| 21 | 301297 | 301297.SZ | 富乐德 | 2023-01-13 00:00:00 | 主力做T,成功率2.22% | 18.32 | 0.3286 | -9034857.31 | 29853155.71 | 38888013.02 | 68741168.73 | 546884177 | -1.652060470932 | 12.569602782638 | 41.8968 | 1284169107.44 | 日换手率达到30%的前5只证券 | | | | | 058001001 |
| 22 | 600523 | 600523.SH | 贵航股份 | 2023-01-13 00:00:00 | 普通席位卖出,成功率42.68% | 15.81 | -10.0171 | -28207457.6 | 13579393.0 | 41786850.6 | 55366243.6 | 85000595 | -33.185011940211 | 65.136301222362 | 0.9341 | 6387945442.2 | 非ST、*ST和S证券连续三个交易日内收盘价格跌幅偏离值累计达到20%的证券 | | | | | 058001001 |
| 23 | 600523 | 600523.SH | 贵航股份 | 2023-01-13 00:00:00 | 普通席位卖出,成功率33.77% | 15.81 | -10.0171 | -16835436.6 | 12407688.0 | 29243124.6 | 41650812.6 | 59671683 | -28.213443552447 | 69.799962907029 | 0.9341 | 6387945442.2 | 有价格涨跌幅限制的日收盘价格跌幅偏离值达到7%的前五只证券 | | | | | 058001001 |
| 24 | 600532 | 600532.SH | *ST未来 | 2023-01-13 00:00:00 | 普通席位卖出,成功率28.62% | 9.74 | -4.9756 | -24439826.67 | 40681630.15 | 65121456.82 | 105803086.97 | 409474673 | -5.968580789367 | 25.838737764863 | 2.2835 | 5026480112.8 | ST、*ST和S证券连续三个交易日内收盘价格跌幅偏离值累计达到15%的证券 | | | | | 058001001 |
| 25 | 600705 | 600705.SH | 中航产融 | 2023-01-13 00:00:00 | 实力游资买入,成功率54.14% | 3.73 | 10.0295 | 73958025.45 | 133172433.75 | 59214408.3 | 192386842.05 | 686213131 | 10.777704784259 | 28.036018746776 | 2.1765 | 32840504627.09 | 有价格涨跌幅限制的日收盘价格涨幅偏离值达到7%的前五只证券 | | | | | 058001001 |
| 26 | 600936 | 600936.SH | 广西广电 | 2023-01-13 00:00:00 | 买一主买,成功率41.53% | 4.24 | 10.1299 | 26749602.16 | 54209533.16 | 27459931.0 | 81669464.16 | 143394358 | 18.654570886255 | 56.954447370935 | 2.0507 | 7085151253.36 | 有价格涨跌幅限制的日收盘价格涨幅偏离值达到7%的前五只证券 | | | | | 058001001 |
| 27 | 601136 | 601136.SH | 首创证券 | 2023-01-13 00:00:00 | 实力游资买入,成功率30.50% | 17.78 | 0.3953 | -93721279.84 | 97104768.64 | 190826048.48 | 287930817.12 | 1841283839 | -5.089996330544 | 15.637503084607 | 38.2112 | 4859874964.0 | 有价格涨跌幅限制的日换手率达到20%的前五只证券 | | | | | 058001001 |
| 28 | 603177 | 603177.SH | 德创环保 | 2023-01-13 00:00:00 | 实力游资卖出,成功率49.47% | 15.64 | -10.0115 | -4465384.0 | 21174031.0 | 25639415.0 | 46813446.0 | 150194462 | -2.973068341228 | 31.168556667555 | 4.6327 | 3159280000.0 | 有价格涨跌幅限制的日收盘价格跌幅偏离值达到7%的前五只证券 | | | | | 058001001 |
| 29 | 603180 | 603180.SH | 金牌厨柜 | 2023-01-13 00:00:00 | 1家机构买入,成功率50.16% | 36.25 | 10.0152 | 12202253.1 | 26500648.8 | 14298395.7 | 40799044.5 | 87351411 | 13.969153972796 | 46.706795039636 | 1.61 | 5591811972.5 | 有价格涨跌幅限制的日收盘价格涨幅偏离值达到7%的前五只证券 | | | | | 058001001 |
| 30 | 603595 | 603595.SH | 东尼电子 | 2023-01-13 00:00:00 | 2家机构卖出,成功率46.97% | 74.61 | -7.6495 | -7432379.62000001 | 121505577.32 | 128937956.94 | 250443534.26 | 1050225268 | -0.707693848783 | 23.846649084813 | 7.496 | 13492188210.51 | 有价格涨跌幅限制的日收盘价格跌幅偏离值达到7%的前五只证券 | | | | | 058001001 |
| 31 | 603633 | 603633.SH | 徕木股份 | 2023-01-13 00:00:00 | 1家机构卖出,成功率37.31% | 12.55 | -9.2552 | -73231606.97 | 86340428.24 | 159572035.21 | 245912463.45 | 290563460 | -25.203309105006 | 84.632962262357 | 7.0615 | 4120365975.7 | 有价格涨跌幅限制的日收盘价格跌幅偏离值达到7%的前五只证券 | | | | | 058001001 |
| 32 | 603718 | 603718.SH | 海利生物 | 2023-01-13 00:00:00 | 1家机构买入,成功率42.60% | 10.87 | 10.0202 | 20370152.3 | 37268339.3 | 16898187.0 | 54166526.3 | 144941266 | 14.054073668709 | 37.371362756001 | 2.1191 | 7000280000.0 | 有价格涨跌幅限制的日收盘价格涨幅偏离值达到7%的前五只证券 | | | | | 058001001 |
| 33 | 603818 | 603818.SH | 曲美家居 | 2023-01-13 00:00:00 | 1家机构买入,成功率42.58% | 7.22 | 10.061 | 5939930.2 | 66439130.2 | 60499200.0 | 126938330.2 | 304253824 | 1.952294344869 | 41.721194669356 | 7.4088 | 4190455842.12 | 有价格涨跌幅限制的日收盘价格涨幅偏离值达到7%的前五只证券 | | | | | 058001001 |
| 34 | 605289 | 605289.SH | 罗曼股份 | 2023-01-13 00:00:00 | 1家机构买入,成功率44.60% | 29.99 | 4.6771 | 30283104.67 | 69216892.67 | 38933788.0 | 108150680.67 | 344547940 | 8.789228189842 | 31.389153181412 | 22.4486 | 1517793900.0 | 有价格涨跌幅限制的日换手率达到20%的前五只证券 | | | | | 058001001 |
| 35 | 688176 | 688176.SH | 亚虹医药 | 2023-01-13 00:00:00 | 3家机构买入,成功率42.03% | 14.42 | 14.2631 | 82667412.7 | 260600938.14 | 177933525.44 | 438534463.58 | 992180700 | 8.331890823919 | 44.199052005345 | 13.6847 | 3371945228.96 | 有价格涨跌幅限制的连续3个交易日内收盘价格涨幅偏离值累计达到30%的证券 | | | | | 058001001 |
| 36 | 688338 | 688338.SH | 赛科希德 | 2023-01-13 00:00:00 | 1家机构买入,成功率55.14% | 41.8 | 16.8904 | 35149627.58 | 52194475.91 | 17044848.33 | 69239324.24 | 175305000 | 20.050556219161 | 39.496491394997 | 8.9712 | 2047768498.6 | 有价格涨跌幅限制的日收盘价格涨幅达到15%的前五只证券 | | | | | 058001001 |
| 37 | 688506 | 688506.SH | 百利天恒 | 2023-01-13 00:00:00 | 4家机构买入,成功率43.76% | 48.45 | 16.747 | 45177674.84 | 82744533.6 | 37566858.76 | 120311392.36 | 272655300 | 16.569520137698 | 44.125822003093 | 17.9553 | 1601916788.1 | 有价格涨跌幅限制的日收盘价格涨幅达到15%的前五只证券 | | | | | 058001001 |
The table above was printed with print(pandas.DataFrame(rDict.get('data', [])).to_markdown(disable_numparse=True)).