I'm fetching a website, but all the special characters in the string returned by .getContentText() or .getContentText("UTF-8") come back as HTML entities (the escaped form of ’ and such). I've really run out of ideas and, to be honest, don't quite understand at which point this encoding happens. I could solve it by "manually" replacing all the occurrences, but that doesn't seem very clean. Thanks a lot for your help.
var response = UrlFetchApp.fetch("https://podtail.com/de/top-podcasts/de/");
var html = response.getContentText();
CodePudding user response:
Your sample code suggests that you are retrieving the HTML source code of a specific page. That source code itself uses HTML entities for ’ and friends, so the data will arrive in that format; the encoding is done by the site, not by .getContentText(). It is unclear why you would need to decode those HTML entities.
If you really need to decode the HTML fully in Google Apps Script, you will need a parser of fairly respectable complexity. There are some shortcuts that you can try if your app has an HTML user interface of its own, but it would probably make more sense to use a library like the one by mathiasbynens.
If you only want to replace some HTML entities with their non-encoded equivalents, you may want to just use String.replace().
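As a rough illustration of the String.replace() route, here is a minimal sketch. The helper name decodeEntities and the NAMED_ENTITIES table are hypothetical, not part of Apps Script, and the table is a small hand-picked subset that you would extend with whatever entities the page actually uses. It assumes a runtime that has String.fromCodePoint, such as the V8 runtime.

```javascript
// Hypothetical lookup table: a small, hand-picked subset of named entities.
// Extend it with the entities that actually occur in the fetched page.
var NAMED_ENTITIES = {
  amp: "&", lt: "<", gt: ">", quot: '"', apos: "'", nbsp: "\u00A0",
  lsquo: "\u2018", rsquo: "\u2019", ldquo: "\u201C", rdquo: "\u201D",
  auml: "\u00E4", ouml: "\u00F6", uuml: "\u00FC", szlig: "\u00DF"
};

// Hypothetical helper: decodes numeric entities (&#8217; and &#x2019;)
// and the named entities listed above, in a single pass over the text.
function decodeEntities(text) {
  var pattern = /&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);/g;
  return text.replace(pattern, function (match, body) {
    if (body.charAt(0) === "#") {
      var isHex = body.charAt(1) === "x" || body.charAt(1) === "X";
      var code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range code points untouched instead of throwing.
      if (isNaN(code) || code > 0x10FFFF) {
        return match;
      }
      return String.fromCodePoint(code);
    }
    // Unknown named entities are left as they are.
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : match;
  });
}
```

You would call it on the fetched string, for example decodeEntities(response.getContentText()). Because everything happens in one replace() pass, an escaped ampersand followed by entity-like text is not decoded twice. This is not a full HTML decoder; for complete coverage of named entities a dedicated library is still the safer choice.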