I need to parse a Google Alerts RSS feed with Google Apps Script.
I found a script that should do the job, but I can't get it working with Google's feed.
The feed looks like this:
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:idx="urn:atom-extension:indexing">
<id>tag:google.com,2005:reader/user/06807031914929345698/state/com.google/alerts/10604166159629661594</id>
<title>Google Alert – garbe industrial real estate</title>
<link href="https://www.google.com/alerts/feeds/06807031914929345698/10604166159629661594" rel="self"/>
<updated>2022-03-17T19:34:28Z</updated>
<entry>
<id>tag:google.com,2013:googlealerts/feed:10523743457612307958</id>
<title type="html"><b>Garbe Industrial</b> plant Multi-User-Immobilie in Ludwigsfelde - <b>Property</b> Magazine</title>
<link href="https://www.google.com/url?rct=j&sa=t&url=https://www.property-magazine.de/garbe-industrial-plant-multi-user-immobilie-in-ludwigsfelde-117551.html&ct=ga&cd=CAIyGWRmNjU0ZGNkMzJiZTRkOWY6ZGU6ZGU6REU&usg=AFQjCNENveXYlfrPc7pZTltgXY8lEAPe4A"/>
<published>2022-03-17T19:34:28Z</published>
<updated>2022-03-17T19:34:28Z</updated>
<content type="html">Die <b>Garbe Industrial Real Estate</b> GmbH startet ihr drittes Neubauprojekt in der Metropolregion Berlin/Brandenburg. Der Projektentwickler hat sich ...</content>
<author>
...
</feed>
I want to extract entry -> id, title, link, updated, content.
I used this script:
function ImportFeed(url, n) {
var res = UrlFetchApp.fetch(url).getContentText();
var xml = XmlService.parse(res);
//var item = xml.getRootElement().getChild("channel").getChildren("item")[n - 1].getChildren();
var item = xml.getRootElement().getChildren("entry")[n - 1].getChildren();
var values = item.reduce(function(obj, e) {
obj[e.getName()] = e.getValue();
return obj;
}, {});
return [[values.id, values.title, values.link, values.updated, values.content]];
}
I modified this part, but all I got was "TypeError: Cannot read property 'getChildren' of undefined":
//var item = xml.getRootElement().getChild("channel").getChildren("item")[n - 1].getChildren();
var item = xml.getRootElement().getChildren("entry")[n - 1].getChildren();
Any idea is welcome!
CodePudding user response:
In your situation, how about the following modified script?
Modified script:
function SAMPLE(url, n = 1) {
var res = UrlFetchApp.fetch(url).getContentText();
var root = XmlService.parse(res.replace(/&/g, "&amp;")).getRootElement();
var ns = root.getNamespace();
var entries = root.getChildren("entry", ns);
if (!entries || entries.length == 0) return "No values";
var header = ["id", "title", "link", "updated", "content"];
var values = header.map(f => f == "link" ? entries[n - 1].getChild(f, ns).getAttribute("href").getValue().trim() : entries[n - 1].getChild(f, ns).getValue().trim());
return [values];
}
When you use getChild and getChildren on this feed, please pass the namespace. The feed element declares a default namespace (xmlns="http://www.w3.org/2005/Atom"), so getChildren("entry") without a namespace returns an empty array and entries[n - 1] is undefined. I think this is the reason for your error.
From your script, I guessed that you use it as a custom function. In that case, please change the function name from ImportFeed to something else, because IMPORTFEED is a built-in function of Google Sheets. In this sample, SAMPLE is used.
If you want to change the columns, please modify header.
In this sample, the default value of n is 1, so the 1st entry is retrieved.
For example, you can put =SAMPLE("URL", 1) into a cell as a custom function, and the result is returned to the sheet.
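To see the namespace effect in isolation, here is a minimal sketch for the Apps Script editor. countEntries is a hypothetical helper name that is not part of the scripts above; it only uses XmlService.parse, getRootElement, getNamespace and getChildren. For the Atom feed in the question, the lookup without a namespace should find no entries, which would explain the undefined value in the error.

```javascript
// Hypothetical helper for illustration only; run it inside Google Apps Script,
// where the XmlService global is available.
function countEntries(xmlText) {
  var root = XmlService.parse(xmlText).getRootElement();
  // Default namespace declared on the root <feed> element.
  var ns = root.getNamespace();
  return {
    // Looks only for <entry> children that are in no namespace.
    withoutNamespace: root.getChildren("entry").length,
    // Looks for <entry> children in the root element's namespace.
    withNamespace: root.getChildren("entry", ns).length
  };
}
```

You can call it as countEntries(UrlFetchApp.fetch(url).getContentText()) and check the two counts in the log.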
Note:
If the modified script above does not resolve your issue, can you provide a sample value of res? With that, I would like to modify the script.
As additional information, when you want to write all entries to a sheet by running the script from the script editor, you can also use the following script.
function myFunction() {
  var url = "###"; // Please set URL.
  var res = UrlFetchApp.fetch(url).getContentText();
  // Escape ampersands so that XmlService.parse does not fail on them.
  var root = XmlService.parse(res.replace(/&/g, "&amp;")).getRootElement();
  var ns = root.getNamespace();
  var entries = root.getChildren("entry", ns);
  if (!entries || entries.length == 0) return "No values";
  var header = ["id", "title", "link", "updated", "content"];
  // One row per entry; "link" is read from the href attribute, the rest from element text.
  var values = entries.map(e => header.map(f => f == "link" ? e.getChild(f, ns).getAttribute("href").getValue().trim() : e.getChild(f, ns).getValue().trim()));
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Sheet1"); // Please set the sheet name.
  // Append below the last used row.
  sheet.getRange(sheet.getLastRow() + 1, 1, values.length, values[0].length).setValues(values);
}
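Both scripts pass the response through a replace on every & before parsing, presumably so that bare ampersands in link URLs do not break XmlService.parse. If the feed already contains escaped entities such as &amp;, a blanket replace would escape them a second time. As a possible alternative, here is a sketch that escapes only ampersands that do not already start an entity. escapeBareAmpersands is a hypothetical helper name, and the pattern only checks the shape of an entity (a name or a numeric reference followed by a semicolon), not whether the entity is actually defined.

```javascript
// Hypothetical helper (not from the original answer): escape only "&" characters
// that do not already start an entity such as &amp;, &#38; or &#x26;.
function escapeBareAmpersands(text) {
  // "&" NOT followed by "name;", "#digits;" or "#xHEX;".
  var bareAmpersand = /&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)/g;
  return text.replace(bareAmpersand, "&amp;");
}

// escapeBareAmpersands("a=1&b=2")   returns "a=1&amp;b=2"
// escapeBareAmpersands("a&amp;b")   returns "a&amp;b" (left unchanged)
```

In the scripts above, this would be used as XmlService.parse(escapeBareAmpersands(res)).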