GitHub REST API: SHA of a file that contains special characters

Time:01-13

Main issue

This code works, but the areReallyEqual check should not be necessary. In fact, it is not needed for files that don't contain special characters ("ñ", "é", "à", etc.), which is all of my files except two.

if (internalFileTextSha !== repositoryFile.sha) {
  // The sha might not be equal due to special characters in the text ("ñ", "é", "à", etc)... doing a second check...
  const remoteFileText = this._gitHubApi.fetchGitHubGetUrl(repositoryFile.download_url).getContentText(); 
  const remoteFileTextSha = GitHubCrypto.getSha(remoteFileText);
  const areReallyEqual = remoteFileTextSha === internalFileTextSha;
  if(!areReallyEqual) {
    // The file has been modified, creating a new commit with the updated file...
    this._gitHubApi.commitUpdatedFile(repositoryName, repositoryFile, internalFile.source);
  } else {
    // The second check determined that both files were equal
  }         
}

More details

internalFile comes from here:

  /**
  * @typedef {Object} InternalFile
  * @property {string} id - The ID of the internal file.
  * @property {string} name - The name of the internal file.
  * @property {string} type - The type of the internal file.
  * @property {string} source - The source code of the internal file.
  */

  /**
   * Gets all the internal files of a Google Apps Script file.
   *
   * @returns {InternalFile[]} An array of objects.
   */
  static getScriptInternalFiles(file) {
    // Check that the file is a Google Apps Script file
    if (file.getMimeType() == 'application/vnd.google-apps.script') {
      // Get the script content as a string
      const fileId = file.getId();
      const params = {
        headers: { 
          Authorization: 'Bearer ' + ScriptApp.getOAuthToken(),
          'Accept-Charset': 'utf-8'
        },
        followRedirects: true,
        muteHttpExceptions: true,
      };
      const url =
        'https://script.google.com/feeds/download/export?id=' +
        fileId +
        '&format=json';
      const response = UrlFetchApp.fetch(url, params);
      const json = JSON.parse(response);
      return json.files;  
    } else {
      throw new Error("The file is not a Google Apps Script file.");
    }
  }

...while repositoryFile comes from here:

  /**
  * @typedef {Object} RepositoryFile
  * @property {string} name - The name of the file.
  * @property {string} path - The file path in the repository.
  * @property {string} sha - The SHA hash of the file.
  * @property {Number} size - The size of the file in bytes.
  * @property {string} url - The URL of the file's contents.
  * @property {string} html_url - The URL to the file's page on the repository website.
  * @property {string} git_url - The URL to the file's git contents.
  * @property {string} download_url - The URL to download the file.
  * @property {string} type - The type of the file, usually "file".
  */

  /**
   * Lists the files at the root of a GitHub repository.
   *
   * @param {string} repositoryName - The name of the repository.
   * @returns {RepositoryFile[]} An array of objects.
   */
  listFilesInRepository(repositoryName) {
    let repositoryFiles = [];
    try {
      const options = {
        headers: {
          ...this._authorizationHeader,
        },
      };      
      const response = UrlFetchApp.fetch(`${this._buildRepositoryUrl(repositoryName)}/contents`, options);
      repositoryFiles = JSON.parse(response);
    } catch(e) {
      const errorMessage = GitHubApi._getErrorMessage(e);
      if(errorMessage === 'This repository is empty.') {
        // do nothing
      } else {
        // unknown error
        throw e;
      }
    }
    return repositoryFiles;
  }

...and the SHA calculation:

class GitHubCrypto {
  /**
   * @param {string} fileContent
   * @returns {string} SHA1 hash string
   */
  static getSha(fileContent) {
    // GitHub is computing the sum of `blob <length>\x00<contents>`, where `length` is the length in bytes of the content string and `\x00` is a single null byte.
    // For the Sha1 implementation, see: www.movable-type.co.uk/scripts/sha1.html
    const sha = Sha1.hash('blob ' + fileContent.length + '\x00' + fileContent);
    return sha;
  }
}

CodePudding user response:

I believe your goal is as follows.

  • You want to retrieve the hash value of 430f370909821443112064cd149a4bebd271bfc4 from a value of ñèàü using Google Apps Script.

In this case, how about the following sample script?

Sample script:

function myFunction() {
  const fileContent = 'ñèàü'; // This is from your comment.

  const value = 'blob ' + fileContent.length + '\x00' + fileContent;
  const bytes = Utilities.computeDigest(Utilities.DigestAlgorithm.SHA_1, value, Utilities.Charset.UTF_8);
  const res = bytes.map(byte => ('0' + (byte & 0xFF).toString(16)).slice(-2)).join('');
  console.log(res); // 430f370909821443112064cd149a4bebd271bfc4
}
  • When this script is run, the hash value of 430f370909821443112064cd149a4bebd271bfc4 is obtained.
  • When aaaa is used directly, as in const res = Utilities.computeDigest(Utilities.DigestAlgorithm.SHA_1, "aaaa", Utilities.Charset.UTF_8).map(byte => ('0' + (byte & 0xFF).toString(16)).slice(-2)).join(''), the result is 70c881d4a26984ddce795f6f71817c9cf4480e79.

CodePudding user response:

I have finally found the problem: it was the length in bytes. As the comment in my own code already says:

GitHub is computing the sum of blob <length>\x00<contents>, where length is the length in bytes of the content string and \x00 is a single null byte.
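
For what it's worth, this formula can be checked outside of Apps Script. The following is a minimal Node.js sketch, not the code from the question: it relies on Node's built-in crypto module and Buffer, neither of which exists in Apps Script, and gitBlobSha is a name made up for this example. It hashes the header followed by the UTF-8 bytes of the content, so the length in the header is a byte count by construction.

```javascript
// Node.js only: `crypto` and `Buffer` are not available in Google Apps Script.
const crypto = require('crypto');

/**
 * Computes the Git blob SHA-1 of a string, treating it as UTF-8 text.
 * (Hypothetical helper for illustration.)
 * @param {string} content
 * @returns {string} 40-character hex digest
 */
function gitBlobSha(content) {
  const body = Buffer.from(content, 'utf8');
  // body.length is the number of bytes, not the number of characters.
  const header = Buffer.from('blob ' + body.length + '\x00', 'utf8');
  return crypto.createHash('sha1').update(header).update(body).digest('hex');
}

console.log(gitBlobSha(''));        // e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (Git's empty blob)
console.log(gitBlobSha('hello\n')); // ce013625030ba8dba906f756967f9e9ca394464a
console.log(gitBlobSha('ñèàü'));    // header is "blob 8", since each character takes 2 bytes
```

Comparing its output with repositoryFile.sha for one of the files that contains "ñ" or "é" should be a quick way to confirm that only the length in the header was off.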

When there are no special characters, lengthInBytes == inputStr.length, but when there are special characters this is no longer true:

// For the Sha1 implementation, see: www.movable-type.co.uk/scripts/sha1.html

function simpleShaTest1() {
  const fileContent = 'ñèàü';
  const expectedSha = '56c9357fcf2589619880e1978deb8365454ece11';
  const sha = Sha1.hash('blob ' + lengthInUtf8Bytes(fileContent) + '\x00' + fileContent);
  const areEqual = (sha === expectedSha); // true
}

function simpleShaTest2() {
  const fileContent = 'aaa';
  const expectedSha = '7c4a013e52c76442ab80ee5572399a30373600a2';
  const sha = Sha1.hash('blob ' + lengthInUtf8Bytes(fileContent) + '\x00' + fileContent);
  const areEqual = (sha === expectedSha); // true
}

function lengthInUtf8Bytes(str) {
  // Matches only the 10.. bytes that are non-initial characters in a multi-byte sequence.
  var m = encodeURIComponent(str).match(/%[89ABab]/g);
  // `m` is `null` when there are no multi-byte characters, so fall back to 0.
  // Note: characters outside the BMP (e.g. emoji) are counted one byte too high,
  // because `str.length` already counts their surrogate pair as 2.
  return str.length + (m ? m.length : 0);
}
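
One caveat with the regex-based count above: for characters outside the Basic Multilingual Plane (emoji, for instance), str.length is already 2 because of the surrogate pair, and three continuation bytes get added on top, so the result is 5 instead of 4. If that matters for your files, a variant that counts every byte directly avoids it. This is only a sketch, and utf8ByteLength and gitBlobPreimage are hypothetical names, not something from the code above. It also assumes, like the rest of this answer, that the SHA-1 implementation hashes the UTF-8 encoding of the resulting string.

```javascript
/**
 * Counts the UTF-8 bytes of a string.
 * encodeURIComponent leaves unreserved ASCII characters as they are and
 * emits one %XX escape per UTF-8 byte for everything else, so collapsing
 * each escape to a single character yields the byte count.
 * Throws a URIError if the string contains a lone surrogate.
 */
function utf8ByteLength(str) {
  return encodeURIComponent(str).replace(/%[0-9A-Fa-f]{2}/g, 'x').length;
}

/** Builds the string that gets hashed for a blob: header, null byte, content. */
function gitBlobPreimage(content) {
  return 'blob ' + utf8ByteLength(content) + '\x00' + content;
}

const sample = 'ñèàü';
console.log(sample.length);                          // 4 (UTF-16 code units)
console.log(utf8ByteLength(sample));                 // 8 (2 bytes per character)
console.log(utf8ByteLength('\u{1F600}'));            // 4 (one emoji, 4 bytes)
console.log(JSON.stringify(gitBlobPreimage('aaa'))); // "blob 3\u0000aaa"
```

With this, the call in the tests would become Sha1.hash(gitBlobPreimage(fileContent)).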