Home > database >  Is is possible to remove html tags by google app script in google sheet?
Is is possible to remove html tags by google app script in google sheet?

Time:06-16

I'm very new to google app script

So I was working on something like import google calendar's event to google sheet by google app script with this link

https://www.automatedstuff.com/tutorials/how-to-import-google-calendar-data-into-google-sheets-automatically/

And the following code is here

function export_gcal_to_gsheet(){

// Replace "[email protected]" with your own Google Calendar E-mail adress.
var mycal = "[email protected]";
var cal = CalendarApp.getCalendarById(mycal);

// Optional variations on getEvents
// var events = cal.getEvents(new Date("January 3, 2014 00:00:00 CST"), new Date("January 14, 2014 23:59:59 CST"));
var events = cal.getEvents(new Date("January 1, 2022 00:00:00 UTC"), new Date("December 31, 2022 23:59:59 UTC"));
  
var sheet = SpreadsheetApp.getActiveSheet();
// Uncomment this next line if you want to always clear the spreadsheet content before running - Note people could have added extra columns on the data though that would be lost
// sheet.clearContents();  

// Create a header record on the current spreadsheet in cells A1:N1 - Match the number of entries in the "header=" to the last parameter
// of the getRange entry below
var header = [["Event Title", "Event Description", "Event Start", "Event End", "Calculated Duration","Date Created", "Last Updated", "Created By"]]
var range = sheet.getRange(1,1,1,8);
range.setValues(header);

// Loop through all calendar events found and write them out starting on calulated ROW 2 (i 2)
for (var i=0;i<events.length;i  ) {
var row=i 2;
var myformula_placeholder = '';
  
// Matching the "header=" entry above, this is the detailed row entry "details=", and must match the number of entries of the GetRange entry below
// NOTE: I've had problems with the getVisibility for some older events not having a value, so I've had do add in some NULL text to make sure it does not error
var details=[[events[i].getTitle(), events[i].getDescription(),events[i].getStartTime(), events[i].getEndTime(), myformula_placeholder, events[i].getDateCreated(), events[i].getLastUpdated(), events[i].getCreators()]];
var range=sheet.getRange(row,1,1,8);
range.setValues(details);


// Writing formulas from scripts requires that you write the formulas separate from non-formulas
// Write the formula out for this specific row in column 7 to match the position of the field myformula_placeholder from above: foumula over columns F-E for time calc
var cell=sheet.getRange(row,5);
cell.setFormula('=mod((D'  row  ')-(C' row '),1)');
cell.setNumberFormat('[h]:mm');

}
}

I've tried few times before and everything was fine.

but somehow when I import description data from google calendar to google sheet, it will randomly import some data contains html tag from this(google calendar)

12345678
90ppl
apple
no child (tbc)
baby
deposit paid
local

to this(google sheet)

<u></u>12345678<br>90ppl<br>apple<br>no child (tbc)<br>baby<br>deposit paid<br>local<u></u>

So I want to figure out why would this happen?

Also is there any way to remove html tags by not using regex formula in google sheet before preceding codes was finished like literally just by code?

CodePudding user response:

Modification points:

  • In the description of the event in Calendar, the rich text is inputted as the HTML data. I think that the reason for your issue is due to this. About Also is there any way to remove HTML tags by not using regex formula in google sheet before preceding codes was finished like literally just by code?, in this answer, the regex is used in Google Apps Script instead of the built-in functions of Google Spreadsheet.
  • In your script, setValues, setFormula and set number format are used in the loop. In this case, the process cost will become high. Ref

When these points are reflected in your script, it becomes as follows.

Modified script:

function export_gcal_to_gsheet() {
  var mycal = "[email protected]";
  var cal = CalendarApp.getCalendarById(mycal);
  var events = cal.getEvents(new Date("January 1, 2022 00:00:00 UTC"), new Date("December 31, 2022 23:59:59 UTC"));
  var values = [["Event Title", "Event Description", "Event Start", "Event End", "Calculated Duration", "Date Created", "Last Updated", "Created By"]];
  for (var i = 0; i < events.length; i  ) {
    var description = events[i].getDescription().replace(/<br>|<\/li>/g, "\n").replace(/<. ?>/g, "").trim();
    values.push([events[i].getTitle(), description, events[i].getStartTime(), events[i].getEndTime(), null, events[i].getDateCreated(), events[i].getLastUpdated(), events[i].getCreators()]);
  }
  var sheet = SpreadsheetApp.getActiveSheet();
  sheet.getRange(1, 1, values.length, values[0].length).setValues(values);
  sheet.getRange(2, 5, sheet.getLastRow() - 1).setFormula('=mod((D2)-(C2),1)').setNumberFormat('[h]:mm');
}
  • Related