Google Apps Script to obtain the index if any of the words on a string match another string?

Time:02-23

I have a long list of roles, read from a sheet range and stored as strings in an array. For example, the array looks something like this:

arr1 = ["football manager","hockey coach", "fb player","fb coach","footballer"];

I also have another array that holds a small list of tags:

arr2 = ["football","fb", "footballer","hockey","rugby"];

I am trying to match the roles of the first array to the tags on the second one.

I have been trying to do this by looping through and obtaining the index of the matched row:

for(let i in arr1){
arr2.findIndex(s => s.indexOf(arr1[i]) >= 0);
}

But this only works for "footballer", as it is an exact match. I need all of the partial matches to be classified as well.

CodePudding user response:

Use the following function to find the indexes of the tags (from the arr2 array) that match values from arr1.

Follow the code comments for a detailed explanation.

function matchTagIndexes()
{
  // TODO replace with your values
  const arr1 = ["football manager","hockey coach", "fb player","fb coach","footballer"];

  // TODO replace with your tags
  const arr2 = ["football","fb", "footballer","hockey","rugby"];

  // create a regex object for every tag
  // each regex matches the `tag` only when it is surrounded by word boundaries (\b)
  // see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions/Cheatsheet#boundary-type_assertions
  const arr2Regexes = arr2.map(tag => new RegExp(`\\b${tag}\\b`, 'i'));

  // loop over the arr1 values as val
  arr1.forEach(val => 
    // test val against each arr2 regex
    arr2Regexes.forEach((regex, i) => 
      // on a match, log the value from arr1, the matched tag name and the tag's index in arr2
      val.match(regex) && console.log(`"${val}" matches tag "${arr2[i]}" which has index ${i}`)
    )
  );
}

Result:

time status message
8:46:35 PM Notice Execution started
8:46:35 PM Info "football manager" matches tag "football" which has index 0
8:46:35 PM Info "hockey coach" matches tag "hockey" which has index 3
8:46:35 PM Info "fb player" matches tag "fb" which has index 1
8:46:35 PM Info "fb coach" matches tag "fb" which has index 1
8:46:35 PM Info "footballer" matches tag "footballer" which has index 2
8:46:36 PM Notice Execution completed
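
The attempt in the question finds only the exact match because `s.indexOf(arr1[i])` checks whether the tag contains the whole role, not whether the role contains the tag. If the indexes are needed as values rather than log lines, the same word-boundary idea can be wrapped in a function that returns them. The following is only a sketch: `escapeRegExp`, `findTagIndexes`, `exampleRoles` and `exampleTags` are hypothetical names that do not appear in the answer above, and the escaping step is an added precaution for tags that might contain regex metacharacters.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a regex,
// so that a tag such as "u.s." is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: for every role, return the indexes of all tags
// that occur in it as whole words (case-insensitive).
function findTagIndexes(roles, tags) {
  const regexes = tags.map(tag => new RegExp(`\\b${escapeRegExp(tag)}\\b`, 'i'));
  return roles.map(role =>
    regexes.reduce((hits, regex, i) => (regex.test(role) ? hits.concat(i) : hits), [])
  );
}

const exampleRoles = ["football manager", "hockey coach", "fb player", "fb coach", "footballer"];
const exampleTags = ["football", "fb", "footballer", "hockey", "rugby"];

// One array of tag indexes per role, in the same order as exampleRoles
const tagIndexes = findTagIndexes(exampleRoles, exampleTags);
console.log(JSON.stringify(tagIndexes)); // [[0],[3],[1],[1],[2]]
```

Note that `\b` only marks a boundary between a word character and a non-word character, so tags that begin or end with punctuation may need a different boundary check.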


CodePudding user response:

I suspect there could be several tags for each of the texts (arr1). Here is a solution that collects the array of tag indexes for every text:

var texts = ['football manager','hockey coach', 'fb player','fb coach','footballer', 'none'];
var tags = ['football','fb', 'footballer','hockey','rugby', 'coach'];

// get all tags for all the texts
var list = [];
for (let tag of tags) {
    var mask = RegExp('\\b' + tag + '\\b', 'i');
    for (let text of texts) {
        if (text.match(mask))
            list.push( {'text': text, 'tag': tag, 'tag_index': tags.indexOf(tag)} );
    }
}
console.log(list);

// group tags for the same texts
var text_and_tags = {};
for (let element of list) {
    // create the array on first sight of a text, then append the tag index
    if (!text_and_tags[element.text]) text_and_tags[element.text] = [];
    text_and_tags[element.text].push(element.tag_index);
}
console.log(text_and_tags);

It will get you the object text_and_tags as follows:

{
  'football manager': [ 0 ],
  'fb player': [ 1 ],
  'fb coach': [ 1, 5 ],
  'footballer': [ 2 ],
  'hockey coach': [ 3, 5 ]
}
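
Texts without any matching tag (such as 'none') do not appear in that object at all. If an entry is wanted for every text, the two loops can be merged into a single pass that starts each text with an empty array. This is a sketch under that assumption, not part of the original answer; `groupTagIndexes`, `sampleTexts` and `sampleTags` are hypothetical names.

```javascript
const sampleTexts = ['football manager', 'hockey coach', 'fb player', 'fb coach', 'footballer', 'none'];
const sampleTags = ['football', 'fb', 'footballer', 'hockey', 'rugby', 'coach'];

// Hypothetical helper: map every text to the indexes of the tags it contains
// as whole words; texts with no matching tag get an empty array.
function groupTagIndexes(texts, tags) {
  const masks = tags.map(tag => new RegExp('\\b' + tag + '\\b', 'i'));
  const result = {};
  for (const text of texts) {
    result[text] = [];
    masks.forEach((mask, i) => {
      if (mask.test(text)) result[text].push(i);
    });
  }
  return result;
}

const grouped = groupTagIndexes(sampleTexts, sampleTags);
console.log(grouped);
```

With the sample data, `grouped['fb coach']` is `[1, 5]` and `grouped['none']` is an empty array.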