GitHub REST API: SHA of a file that contains special characters


Main issue

This code works, but the areReallyEqual check should not be necessary. In fact, it is only needed for files that contain special characters ("ñ", "é", "à", etc.), which is just 2 of my files; for all the others the first SHA comparison is enough.

if (internalFileTextSha !== repositoryFile.sha) {
  // The sha might not be equal due to special characters in the text ("ñ", "é", "à", etc)... doing a second check...
  const remoteFileText = this._gitHubApi.fetchGitHubGetUrl(repositoryFile.download_url).getContentText();
  const remoteFileTextSha = GitHubCrypto.getSha(remoteFileText);
  const areReallyEqual = remoteFileTextSha === internalFileTextSha;
  if (!areReallyEqual) {
    // The file has been modified, creating a new commit with the updated file...
    this._gitHubApi.commitUpdatedFile(repositoryName, repositoryFile, internalFile.source);
  } else {
    // The second check determined that both files were equal
  }
}

More details

internalFile comes from here:

  /**
   * @typedef {Object} InternalFile
   * @property {string} id - The ID of the internal file.
   * @property {string} name - The name of the internal file.
   * @property {string} type - The type of the internal file.
   * @property {string} source - The source code of the internal file.
   */

  /**
   * Gets all the internal files of a Google Apps Script file.
   * @returns {InternalFile[]} An array of objects.
   */
  static getScriptInternalFiles(file) {
    // Check that the file is a Google Apps Script file
    if (file.getMimeType() == 'application/vnd.google-apps.script') {
      // Get the script content as a string
      const fileId = file.getId();
      const params = {
        headers: {
          Authorization: 'Bearer ' + ScriptApp.getOAuthToken(),
          'Accept-Charset': 'utf-8'
        },
        followRedirects: true,
        muteHttpExceptions: true,
      };
      const url =
      const response = UrlFetchApp.fetch(url, params);
      const json = JSON.parse(response);
      return json.files;
    } else {
      throw new Error("The file is not a Google Apps Script file.");
    }
  }

...while repositoryFile comes from here:

  /**
   * @typedef {Object} RepositoryFile
   * @property {string} name - The name of the file.
   * @property {string} path - The file path in the repository.
   * @property {string} sha - The SHA hash of the file.
   * @property {Number} size - The size of the file in bytes.
   * @property {string} url - The URL of the file's contents.
   * @property {string} html_url - The URL to the file's page on the repository website.
   * @property {string} git_url - The URL to the file's git contents.
   * @property {string} download_url - The URL to download the file.
   * @property {string} type - The type of the file, usually "file".
   */

  /**
   * Lists the files in a GitHub repository.
   * @returns {RepositoryFile[]} An array of objects.
   */
  listFilesInRepository(repositoryName) {
    let repositoryFiles = [];
    try {
      const options = {
        headers: {
      const response = UrlFetchApp.fetch(`${this._buildRepositoryUrl(repositoryName)}/contents`, options);
      repositoryFiles = JSON.parse(response);
    } catch(e) {
      const errorMessage = GitHubApi._getErrorMessage(e);
      if (errorMessage === 'This repository is empty.') {
        // do nothing
      } else {
        // unknown error
        throw e;
      }
    }
    return repositoryFiles;
  }

...and the SHA calculation:

class GitHubCrypto {
  /**
   * @param {string} fileContent
   * @returns {string} SHA1 hash string
   */
  static getSha(fileContent) {
    // GitHub is computing the sum of `blob <length>\x00<contents>`, where `length` is the length in bytes of the content string and `\x00` is a single null byte.
    // For the Sha1 implementation, see: www.movable-type.co.uk/scripts/sha1.html
    const sha = Sha1.hash('blob ' + fileContent.length + '\x00' + fileContent);
    return sha;
  }
}

CodePudding user response:

I believe your goal is as follows.

  • You want to retrieve the hash value of 430f370909821443112064cd149a4bebd271bfc4 from a value of ñèàü using Google Apps Script.

In this case, how about the following sample script?

Sample script:

function myFunction() {
  const fileContent = 'ñèàü'; // This is from your comment.

  const value = 'blob ' + fileContent.length + '\x00' + fileContent;
  const bytes = Utilities.computeDigest(Utilities.DigestAlgorithm.SHA_1, value, Utilities.Charset.UTF_8);
  // computeDigest returns signed bytes, so mask each one with 0xFF before converting it to two hex digits.
  const res = bytes.map(byte => ('0' + (byte & 0xFF).toString(16)).slice(-2)).join('');
  console.log(res); // 430f370909821443112064cd149a4bebd271bfc4
}

  • When this script is run, the hash value of 430f370909821443112064cd149a4bebd271bfc4 is obtained.
  • When aaaa is directly used like const res = Utilities.computeDigest(Utilities.DigestAlgorithm.SHA_1, "aaaa", Utilities.Charset.UTF_8).map(byte => ('0' + (byte & 0xFF).toString(16)).slice(-2)).join(''), 70c881d4a26984ddce795f6f71817c9cf4480e79 is obtained.
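
As a side note on the & 0xFF step: Utilities.computeDigest returns signed bytes (-128 to 127), so each byte has to be masked before it is converted to hex. A minimal sketch of that conversion as a standalone helper follows; the name signedBytesToHex is hypothetical and not part of any library.

```javascript
// Hypothetical helper: converts an array of signed bytes (-128..127),
// such as the one returned by Utilities.computeDigest, to a lowercase hex string.
function signedBytesToHex(bytes) {
  return bytes
    // (byte & 0xFF) maps negative values to their unsigned equivalent (e.g. -1 -> 255);
    // prefixing '0' and keeping the last two characters pads single-digit values.
    .map(byte => ('0' + (byte & 0xFF).toString(16)).slice(-2))
    .join('');
}
```

Without the mask, a negative byte such as -1 would be rendered as "-1" instead of "ff", which would corrupt the resulting hash string.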


CodePudding user response:

I have finally found the problem... the problem was in the "bytes length":

GitHub is computing the sum of blob <length>\x00<contents>, where length is the length in bytes of the content string and \x00 is a single null byte.

When there are no special characters, lengthInBytes == inputStr.length, but when there are special characters this is no longer true, because each of them takes more than one byte in UTF-8:

// For the Sha1 implementation, see: www.movable-type.co.uk/scripts/sha1.html

function simpleShaTest1() {
  const fileContent = 'ñèàü';
  const expectedSha = '56c9357fcf2589619880e1978deb8365454ece11';
  const sha = Sha1.hash('blob ' + lengthInUtf8Bytes(fileContent) + '\x00' + fileContent);
  const areEqual = (sha === expectedSha); // true
}

function simpleShaTest2() {
  const fileContent = 'aaa';
  const expectedSha = '7c4a013e52c76442ab80ee5572399a30373600a2';
  const sha = Sha1.hash('blob ' + lengthInUtf8Bytes(fileContent) + '\x00' + fileContent);
  const areEqual = (sha === expectedSha); // true
}

function lengthInUtf8Bytes(str) {
  // Matches only the 10.. bytes that are non-initial characters in a multi-byte sequence.
  var m = encodeURIComponent(str).match(/%[89ABab]/g);
  // `m` is `null` when there are no multi-byte characters, so no extra bytes are added in that case.
  return str.length + (m ? m.length : 0);
}
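
One caveat, as far as I can tell: the encodeURIComponent trick counts one byte too many for characters outside the Basic Multilingual Plane (emoji, for example), because such a character has length 2 in JavaScript and three continuation bytes, which gives 5 instead of 4. A possible alternative, sketched under the assumption that the content is encoded as UTF-8, iterates by code point instead. The names utf8ByteLength and gitBlobHeader are hypothetical helpers, not part of the original code.

```javascript
// Hypothetical helper: counts the bytes a string occupies when encoded as UTF-8.
function utf8ByteLength(str) {
  let bytes = 0;
  // for...of iterates by code point, so a surrogate pair is visited once.
  for (const ch of str) {
    const codePoint = ch.codePointAt(0);
    if (codePoint <= 0x7f) bytes += 1;        // ASCII
    else if (codePoint <= 0x7ff) bytes += 2;  // e.g. "ñ", "é", "à"
    else if (codePoint <= 0xffff) bytes += 3; // rest of the Basic Multilingual Plane
    else bytes += 4;                          // e.g. emoji
  }
  return bytes;
}

// Hypothetical helper: builds the `blob <length>\x00` prefix that is hashed before the contents.
function gitBlobHeader(fileContent) {
  return 'blob ' + utf8ByteLength(fileContent) + '\x00';
}
```

With this, 'ñèàü' gives a byte length of 8 while its .length is 4, which is exactly the mismatch that broke the original SHA comparison.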