This Google Apps Script code, which scrapes press release news from Yahoo Finance, suddenly stopped working today. It now fails with the following error: TypeError: Cannot read properties of undefined (reading 'streams') (line 6)
function pressReleases(code) {
  var url = 'https://finance.yahoo.com/quote/' + code + '/press-releases';
  var html = UrlFetchApp.fetch(url).getContentText().match(/root.App.main = ([\s\S\w]+?);\n/);
  if (!html || html.length == 1) return;
  var obj = JSON.parse(html[1].trim());
  var res = obj.context.dispatcher.stores.StreamStore.streams["YFINANCE:" + code + ".mega"].data.stream_items[0].title;
  return res || "No value";
}
Code in Cell (with the stock symbol in cell A6)
=pressReleases(A6)
I can still retrieve the JSON using Python, and the format of the data in the JSON is exactly the same, so I'm guessing the problem is with Google Apps Script, but I'm having no luck fixing it.
The JSON output is here: https://privatebin.net/?4064ef5520f5b445#FDiJS868e3xSsgzh3y8LsF72LsefyoZ635kqCx62ZtwH
Any help as to why it suddenly stopped working would be appreciated.
CodePudding user response:
I believe match returns an array, so perhaps you need
var html = UrlFetchApp.fetch(url).getContentText().match(/root.App.main = ([\s\S\w]+?);\n/)[index];
Oh, I missed the fact that you're using it in a conditional on the very next line, and then on the line after that you're using index 1:
var obj = JSON.parse(html[1].trim());
So the problem must be either a variation in the data or this line:
var res = obj.context.dispatcher.stores.StreamStore.streams["YFINANCE:" + code + ".mega"].data.stream_items[0].title;
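One way to find out which part of that long property path is missing is to walk it one key at a time and report the first key that cannot be read. The helper below is only a minimal sketch: `findMissingKey` and the sample object are hypothetical and made up for illustration, they are not part of the original script.

```javascript
// Hypothetical helper: walks `path` through `root` and returns the first key
// that cannot be read, or null when the whole path resolves.
function findMissingKey(root, path) {
  var current = root;
  for (var i = 0; i < path.length; i++) {
    // Only objects (including arrays) can be descended into.
    if (current === null || typeof current !== 'object' || !(path[i] in current)) {
      return path[i];
    }
    current = current[path[i]];
  }
  return null;
}

// Made-up sample data: `stores` is a string instead of an object, so
// `stores.StreamStore` is undefined and reading `.streams` from it throws.
var sample = { context: { dispatcher: { stores: 'opaque string' } } };
var missing = findMissingKey(sample, ['context', 'dispatcher', 'stores', 'StreamStore', 'streams']);
```

In the real function, the same call could be made on the parsed `obj` and the result logged with `Logger.log` before the failing line.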
CodePudding user response:
Is the script you show taken from this answer? When I looked at the HTML, it seems that, at the current stage, the data is encrypted and stored as salted base64. In this case, I would like to propose an answer based on the method from this answer.
Usage:
1. Get crypto-js.
Open https://cdnjs.cloudflare.com/ajax/libs/crypto-js/4.1.1/crypto-js.min.js in your browser, then copy and paste the script into the script editor of Google Apps Script and save it.
2. Modify script.
function pressReleases(code) {
  var url = 'https://finance.yahoo.com/quote/' + code + '/press-releases';
  var html = UrlFetchApp.fetch(url).getContentText().match(/root.App.main = ([\s\S\w]+?);\n/);
  if (!html || html.length == 1) return;
  var obj = JSON.parse(html[1].trim());
  // --- I modified the script below.
  const { _cs, _cr } = obj;
  if (!_cs || !_cr) return;
  const key = CryptoJS.algo.PBKDF2.create({ keySize: 8 }).compute(_cs, JSON.parse(_cr)).toString();
  const obj2 = JSON.parse(CryptoJS.enc.Utf8.stringify(CryptoJS.AES.decrypt(obj.context.dispatcher.stores, key)));
  var res = obj2.StreamStore.streams["YFINANCE:" + code + ".mega"].data.stream_items[0].title;
  // ---
  return res || "No value";
}
- When this script is used and code is PGEN, the value "Precigen to Present at the 41st Annual J.P. Morgan Healthcare Conference" is obtained.
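Since the page has served the stores both as a plain object (the original script) and as an encrypted string (the current situation), a small guard could decide which path to take before reading `StreamStore`. The following is only a sketch under assumptions: `resolveStores`, `fakeDecrypt`, and the injected `decrypt` callback are hypothetical names, and the callback merely stands in for the CryptoJS steps in the script above.

```javascript
// Hypothetical helper: returns the stores object whether `stores` arrives as
// a plain object or as an encrypted string.
// `decrypt` is an assumed callback that turns the encrypted string into JSON text.
function resolveStores(stores, decrypt) {
  if (stores !== null && typeof stores === 'object') {
    return stores; // already plain data, nothing to decrypt
  }
  if (typeof stores === 'string') {
    return JSON.parse(decrypt(stores));
  }
  return null; // unexpected shape; the caller decides how to report it
}

// Stand-in decryptor for demonstration only; it is NOT real decryption.
function fakeDecrypt(text) {
  return '{"StreamStore":{"streams":{}}}';
}

var fromObject = resolveStores({ StreamStore: { streams: {} } }, fakeDecrypt);
var fromString = resolveStores('encrypted-text', fakeDecrypt);
```

In the modified script, the callback would wrap the `CryptoJS.AES.decrypt` call together with the derived key, and a `null` result could be turned into a readable message for the cell.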
Note:
If you want to load crypto-js directly, you can also use the following script. But in this case, the process cost becomes higher than that of the above flow. Please be careful about this.
const cdnjs = "https://cdnjs.cloudflare.com/ajax/libs/crypto-js/4.1.1/crypto-js.min.js";
eval(UrlFetchApp.fetch(cdnjs).getContentText());
I can confirm that this method works for the current situation (December 21, 2022). However, if the specification of the data or the HTML changes in a future update on the server side, this script might stop working. Please be careful about this.