I can get the JSON data at this URL in a browser:
https://stock.xueqiu.com/v5/stock/finance/us/income.json?symbol=ASX&type=all&is_detail=true&count=5
I tried to fetch it with the requests library:
import requests
url="https://stock.xueqiu.com/v5/stock/finance/us/income.json?symbol=ASX&type=all&is_detail=true&count=5"
headers = {"User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:74.0) Gecko/20100101 Firefox/74.0"}
r = requests.get(url, headers=headers)
<Response [400]>
I also tried the selenium library:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
browser = webdriver.Chrome(options=options)
url="https://stock.xueqiu.com/v5/stock/finance/us/income.json?symbol=ASX&type=all&is_detail=true&count=5"
browser.get(url)
browser.page_source
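'<html><head></head><body><pre style="word-wrap: break-word; white-space: pre-wrap;">{"error_description":"遇到错误,请刷新页面或者重新登录帐号后再试","error_uri":"/v5/stock/finance/us/income.json","error_code":"400016"}</pre></body></html>'
The error_description translates to: "An error occurred; please refresh the page or log in to your account again and retry."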
How can I get the JSON from this URL?
I have never logged in to the website. After clearing all browser caches, opening the JSON URL directly gives the error message above. If I first open https://xueqiu.com/snowman/S/ASX/detail#/GSLRB in the browser, wait a moment, and then open https://stock.xueqiu.com/v5/stock/finance/us/income.json?symbol=ASX&type=all&is_detail=true&count=5 again, all the data is shown.
CodePudding user response:
The page https://stock.xueqiu.com/v5/stock/finance/us/income.json?symbol=ASX&type=all&is_detail=true&count=5 requires the cookies set by the page https://xueqiu.com/snowman/S/ASX/detail#/GSLRB, so a possible solution is to use requests.Session, which keeps cookies between requests:
import requests
import pprint
headers = {
"User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:74.0) Gecko/20100101 Firefox/74.0"
}
url1 = "https://xueqiu.com/snowman/S/ASX/detail#/GSLRB"
url2 = "https://stock.xueqiu.com/v5/stock/finance/us/income.json?symbol=ASX&type=all&is_detail=true&count=5"
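with requests.Session() as s:
    s.headers.update(headers)
    # Visit the detail page first so the session picks up its cookies.
    s.get(url1)
    # The JSON endpoint now sees those cookies and returns the data.
    r = s.get(url2)
    pprint.pprint(r.json())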
Output:
{'data': {'annual_settle_date': '12-31',
'currency': 'TWD',
'currency_name': '新台币',
'last_report_name': '2020年FY',
'list': [{'ctime': 1617756162000,
'ed': '2020-12-31',
'gross_profit': [77984268000.0, 0.21261555583679398],
'income_from_co': [28651900000.0, 0.5683716810816832],
'income_from_co_before_it': [35768798000.0,
0.536472869131111],
'income_from_co_before_tax_si': [3302123000.0,
-0.18326588478789202],
'income_tax': [7116898000.0, 0.4201853191801001],
'interest_expense': [3459511000.0, -0.1769721855785621],
'interest_income': [None, None],
'marketing_selling_etc': [23805768000.0,
0.06327703424731415],
'net_income': [28651900000.0, 0.5683716810816832],
'net_income_atcss': [26970580000.0, 0.5808702054928813],
'net_income_atms_interest': [1681320000.0,
0.39185114911413654],
'net_interest_expense': [3459511000.0, -0.1769721855785621],
'operating_income': [31919063000.0, 0.6751497051555505],
'othr_revenues': [None, None],
'preferred_dividend': [0.0, None],
'rad_expenses': [19302418000.0, 0.04931054799005009],
'report_annual': 2020,
'report_date': 1609344000005,
'report_name': '2020年FY',
'report_type_code': 596001,
'revenue': [476978710000.0, 0.15440289651985575],
'sales_cost': [398994442000.0, 0.1436720014683004],
'sd': '2020-01-01',
'share_of_earnings_of_affiliate': [547612000.0,
2.004317651899602],
'total_basic_earning_common_ps': [6.32, 0.576059850374065],
'total_compre_income': [29147213000.0, 1.097230498820186],
'total_compre_income_atcss': [27440726000.0,
1.0911704872321188],
'total_compre_income_atms': [1706487000.0,
1.1997360038877551],
'total_dlt_earnings_common_ps': [6.17, 0.578005115089514],
'total_net_income_atcss': [26970580000.0,
0.5808702054928813],
'total_operate_expenses': [46065205000.0,
0.017872987914466523],
'total_operate_expenses_si': [-502492000.0,
-2.871095306361825],
'total_revenue': [476978710000.0, 0.15440289651985575]},
{'ctime': 1617756162000,
'ed': '2019-12-31',
'gross_profit': [64310793000.0, 0.05146478143258062],
'income_from_co': [18268565000.0, -0.33385504808890537],
'income_from_co_before_it': [23279811000.0,
-0.27108630126460664],
'income_from_co_before_tax_si': [4043082000.0,
-0.5491069252894362],
'income_tax': [5011246000.0, 0.11031160979747058],
'interest_expense': [4203395000.0, 0.17800199033641506],
'interest_income': [None, None],
'marketing_selling_etc': [22389055000.0,
0.14507365860389632],
'net_income': [18268565000.0, -0.33385504808890537],
'net_income_atcss': [17060591000.0, -0.34934699164069516],
'net_income_atms_interest': [1207974000.0,
0.0036441041286553208],
'net_interest_expense': [4203395000.0, 0.17800199033641506],
'operating_income': [19054454000.0, -0.18748112827671856],
'othr_revenues': [None, None],
'preferred_dividend': [0.0, None],
'rad_expenses': [18395334000.0, 0.22940460538165353],
'report_annual': 2019,
'report_date': 1577721600004,
'report_name': '2019年FY',
'report_type_code': 596001,
'revenue': [413182184000.0, 0.11342124122766711],
'sales_cost': [348871391000.0, 0.1256480464382964],
'sd': '2019-01-01',
'share_of_earnings_of_affiliate': [182275000.0,
1.379546647121047],
'total_basic_earning_common_ps': [4.01,
-0.3511326860841424],
'total_compre_income': [13897954000.0, -0.4769636105391466],
'total_compre_income_atcss': [13122185000.0,
-0.48782400909960205],
'total_compre_income_atms': [775769000.0,
-0.18444496307883804],
'total_dlt_earnings_common_ps': [3.91,
-0.35584843492586493],
'total_net_income_atcss': [17060591000.0,
-0.34934699164069516],
'total_operate_expenses': [45256339000.0,
0.20005272067674873],
'total_operate_expenses_si': [268555000.0,
1.7227322024958085],
'total_revenue': [413182184000.0, 0.11342124122766711]},
{'ctime': 1560942703000,
'ed': '2019-03-31',
'gross_profit': [11385000000.0, 0.09597612629957643],
'income_from_co': [2230000000.0, -0.053480475382003394],
'income_from_co_before_it': [2635000000.0,
-0.3021716101694915],
'income_from_co_before_tax_si': [1462000000.0,
4.601532567049809],
'income_tax': [405000000.0, -0.7147887323943662],
'interest_expense': [None, None],
'interest_income': [None, None],
'marketing_selling_etc': [5137000000.0, 0.5580831058538065],
'net_income': [2230000000.0, -0.053480475382003394],
'net_income_atcss': [2043000000.0, -0.025286259541984733],
'net_income_atms_interest': [187000000.0,
-0.28076923076923077],
'net_interest_expense': [966000000.0, 1.7058823529411764],
'operating_income': [1327000000.0, -0.6648143470573377],
'othr_revenues': [None, None],
'preferred_dividend': [0.0, None],
'rad_expenses': [3955000000.0, 0.4252252252252252],
'report_annual': 2019,
'report_date': 1553961600000,
'report_name': '2019年Q1',
'report_type_code': 596003,
'revenue': [88861000000.0, 0.3678077763753348],
'sales_cost': [77476000000.0, 0.4195463373520466],
'sd': '2019-01-01',
'share_of_earnings_of_affiliate': [-154000000.0,
0.6531531531531531],
'total_basic_earning_common_ps': [0.48,
-0.02040816326530614],
'total_compre_income': [None, None],
'total_compre_income_atcss': [None, None],
'total_compre_income_atms': [None, None],
'total_dlt_earnings_common_ps': [0.46,
-0.04166666666666659],
'total_net_income_atcss': [2043000000.0,
-0.025286259541984733],
'total_operate_expenses': [10058000000.0,
0.5644734795458081],
'total_operate_expenses_si': [None, None],
'total_revenue': [88861000000.0, 0.3678077763753348]},
{'ctime': 1617756162000,
'ed': '2018-12-31',
'gross_profit': [61163050000.0, 0.15987892878726956],
'income_from_co': [27424309000.0, 0.11949389028724264],
'income_from_co_before_it': [31937678000.0,
0.02956142491216258],
'income_from_co_before_tax_si': [8966831000.0,
0.28700510835977744],
'income_tax': [4513369000.0, -0.3081478134092464],
'interest_expense': [3568241000.0, 0.9829135301368052],
'interest_income': [None, None],
'marketing_selling_etc': [19552502000.0,
0.24008657043304116],
'net_income': [27424309000.0, 0.11949389028724264],
'net_income_atcss': [26220721000.0, 0.14906806875410045],
'net_income_atms_interest': [1203588000.0,
-0.2826994512917915],
'net_interest_expense': [3568241000.0, 0.9829135301368052],
'operating_income': [23451091000.0, -0.003255017899346956],
'othr_revenues': [None, None],
'preferred_dividend': [0.0, None],
'rad_expenses': [14962799000.0, 0.2737968808540811],
'report_annual': 2018,
'report_date': 1546185600004,
'report_name': '2018年FY',
'report_type_code': 596001,
'revenue': [371092421000.0, 0.2776851589186339],
'sales_cost': [309929371000.0, 0.3038187579796379],
'sd': '2018-01-01',
'share_of_earnings_of_affiliate': [-480244000.0,
-1.9133899600975308],
'total_basic_earning_common_ps': [6.18,
0.10554561717352413],
'total_compre_income': [26571677000.0, 0.33800831224856964],
'total_compre_income_atcss': [25620461000.0,
0.38309049519201155],
'total_compre_income_atms': [951216000.0,
-0.28751067367758754],
'total_dlt_earnings_common_ps': [6.07, 0.16955684007707125],
'total_net_income_atcss': [26220721000.0,
0.14906806875410045],
'total_operate_expenses': [37711959000.0,
0.29130215356164646],
'total_operate_expenses_si': [-371583000.0,
-2.4229614208334866],
'total_revenue': [371092421000.0, 0.2776851589186339]},
{'ctime': 1541759982000,
'ed': '2018-09-30',
'gross_profit': [42479000000.0, 0.11966577927726087],
'income_from_co': [20571000000.0, 0.15088956025511915],
'income_from_co_before_it': [24813000000.0,
0.07648590021691974],
'income_from_co_before_tax_si': [8885000000.0,
0.3284988038277512],
'income_tax': [4242000000.0, -0.1804482225656878],
'interest_expense': [None, None],
'interest_income': [None, None],
'marketing_selling_etc': [13734000000.0,
0.17124339075558587],
'net_income': [20571000000.0, 0.15088956025511915],
'net_income_atcss': [19816000000.0, 0.18361008242742802],
'net_income_atms_interest': [755000000.0,
-0.3330388692579505],
'net_interest_expense': [2147000000.0, 0.8669565217391304],
'operating_income': [15928000000.0, -0.02652487470969319],
'othr_revenues': [None, None],
'preferred_dividend': [0.0, None],
'rad_expenses': [10670000000.0, 0.22629582806573956],
'report_annual': 2018,
'report_date': 1538236800000,
'report_name': '2018年Q9',
'report_type_code': 596007,
'revenue': [257064000000.0, 0.24513332203143542],
'sales_cost': [214585000000.0, 0.2733805692041112],
'sd': '2018-01-01',
'share_of_earnings_of_affiliate': [None, None],
'total_basic_earning_common_ps': [4.67,
0.12259615384615379],
'total_compre_income': [None, None],
'total_compre_income_atcss': [None, None],
'total_compre_income_atms': [None, None],
'total_dlt_earnings_common_ps': [4.6, 0.2137203166226912],
'total_net_income_atcss': [19816000000.0,
0.18361008242742802],
'total_operate_expenses': [26551000000.0,
0.23052324234138202],
'total_operate_expenses_si': [None, None],
'total_revenue': [257064000000.0, 0.24513332203143542]}],
'org_type': 1,
'quote_name': '日月光半导体',
'sas': '国际会计准则',
'statuses': None,
'tip': '日月光半导体财年为每年的1月1日至12月31日,最新披露财报所属2020财年。'},
'error_code': 0,
'error_description': ''}
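Once the JSON arrives, it may help to check error_code before using it and to flatten the reports into simple rows. The following is only a sketch under some assumptions: the helper names check_payload and to_rows are hypothetical, the sample dictionary is a trimmed copy of the output above, and the second number in each pair appears to be the year-over-year change (the revenue figures above are consistent with that, but the API does not say so).

```python
def check_payload(payload):
    """Return payload unchanged, or raise if the API reported an error.

    In the outputs above, error_code is 0 on success and the string
    "400016" on failure, so both forms are compared as strings.
    """
    code = payload.get("error_code", 0)
    if str(code) != "0":
        description = payload.get("error_description", "")
        raise RuntimeError(f"API error {code}: {description}")
    return payload


def to_rows(payload, fields=("revenue", "net_income")):
    """Flatten data.list into one dict per report.

    Each metric in the response is a [value, change] pair; a missing
    metric is reported as None for both parts.
    """
    rows = []
    for report in payload["data"]["list"]:
        row = {"report_name": report["report_name"], "ed": report["ed"]}
        for field in fields:
            value, change = report.get(field) or (None, None)
            row[field] = value
            row[field + "_change"] = change
        rows.append(row)
    return rows


# Trimmed copy of the response shown above (two reports, two metrics).
sample = {
    "data": {
        "list": [
            {
                "report_name": "2020年FY",
                "ed": "2020-12-31",
                "revenue": [476978710000.0, 0.15440289651985575],
                "net_income": [28651900000.0, 0.5683716810816832],
            },
            {
                "report_name": "2019年FY",
                "ed": "2019-12-31",
                "revenue": [413182184000.0, 0.11342124122766711],
                "net_income": [18268565000.0, -0.33385504808890537],
            },
        ]
    },
    "error_code": 0,
    "error_description": "",
}

rows = to_rows(check_payload(sample))
for row in rows:
    print(row)
```

With the session code from the answer, the same helpers could be applied as to_rows(check_payload(r.json())). If the cookies are missing, check_payload would then raise with the 400016 message instead of failing later on a missing 'data' key.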