Look for the similarity of words in sentences in JavaScript

Time:07-27

I am trying to find the similarity between two sentences in JavaScript, for example:

let mastersentences = "Gambir, Kecamatan Gambir, Kota Jakarta Pusat, Daerah Khusus Ibukota Jakarta";
let keywordsentences = "DKI JAKARTA";
let result = (mastersentences === keywordsentences) // desired: true, because 'jakarta' matches (a strict === comparison actually yields false)

The result of comparing 'mastersentences' and 'keywordsentences' should be "jakarta". I have tried several JavaScript methods but still can't get that result, for example:

includes()
indexOf()
lastIndexOf()
localeCompare()
match()
search()

CodePudding user response:

Since all the previous answers got downvoted, I will take my chance.

My answer uses .match() with a regular expression built from the keyword sentence, which is first stripped of punctuation and special characters (there may be others I did not think of; add them as needed).

The output is a boolean: true if at least one keyword was found, false if no match was found.

const mastersentences = "Gambir, Kecamatan Gambir, Kota Jakarta Pusat, Daerah Khusus Ibukota Jakarta";
const keywordsentences = "DKI JAKARTA";
const noMatchSentence = "Hello world!"

function compare(master, sentence) {
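  // Strip punctuation, split on whitespace runs, and drop empty strings
  // (an empty alternative in the pattern would match at every word boundary).
  const words = sentence.replace(/[,\.:;\!\?'"\(\)\{\}~\|]/g, "").split(/\s+/).filter(Boolean)
  if (words.length === 0) return false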
  const re = new RegExp(`\\b(${words.join('|')})\\b`, 'gi')
  return master.match(re) !== null
}

console.log(compare(mastersentences, keywordsentences)) // true
console.log(compare(mastersentences, noMatchSentence))  // false
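
One caveat when building a regular expression from arbitrary input: a keyword containing characters such as `+`, `(` or `.` is interpreted as regex syntax and may throw or match the wrong text. The following is only a sketch of one way to guard against that. The names `escapeRegExp` and `compareEscaped` are hypothetical and not part of the answer above, and this variant escapes the keywords instead of stripping punctuation from them.

```javascript
// Hypothetical helper (not from the answer above): escapes every character
// that has a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Variant of compare() that escapes each keyword before building the pattern.
function compareEscaped(master, sentence) {
  const words = sentence.split(/\s+/).filter(Boolean).map(escapeRegExp);
  if (words.length === 0) return false;
  // Lookarounds are used instead of \b so that keywords starting or ending
  // with a non-word character (for example "C++") can still match as a whole.
  const re = new RegExp(`(?<!\\w)(${words.join("|")})(?!\\w)`, "i");
  return re.test(master);
}

console.log(compareEscaped("Kota Jakarta Pusat", "DKI JAKARTA")); // true
console.log(compareEscaped("I like C++ a lot", "c++"));           // true
```

Without the escaping step, a keyword such as "(usd" would make the RegExp constructor throw a SyntaxError.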

CodePudding user response:

From the above comment ...

Split the keywordsentences string at every whitespace sequence into an array of words to search for in mastersentences. Then check whether at least one of these words (via some) is included (via includes) in mastersentences.

In addition to the suggested lower-case search via includes and some, one could also combine some with a case-insensitive, RegExp-based test.

const mastersentence = "Gambir, Kecamatan Gambir, Kota Jakarta Pusat, Daerah Khusus Ibukota Jakarta";
const keywords = "DKI JAKARTA";

const lowerCaseMaster = mastersentence.toLowerCase();

console.log(
  keywords
    // - lower case `keywords` first.
    .toLowerCase()
    // - then split the lower cased string
    //   value at every whitespace sequence.
    .split(/\s+/g)
    // - then look for at least a single occurrence 
    //   of a lower cased keyword item ...
    .some(keyword =>
      // ... within the lower cased master via `includes`.
      lowerCaseMaster.includes(keyword)
    )
);
console.log(
  keywords
    .split(/\s+/g)
    .some(keyword =>
      // case insensitive regex based search.
      RegExp(`\\b${ keyword }\\b`, 'i').test(mastersentence)
    )
)

CodePudding user response:

You can split each sentence on \W (any character that is not a letter, digit, or underscore) and use the some method to check whether there are any matches, after converting each word to lower case so that the comparison is case-insensitive.

let mastersentences = "Gambir, Kecamatan Gambir, Kota Jakarta Pusat, Daerah Khusus Ibukota Jakarta";
let keywordsentences = "DKI JAKARTA";
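// Split the lower-cased keywords into an array of words.
const keywordsentencesArray = keywordsentences.toLowerCase().split(/\W+/)

// True if at least one word of the master sentence is one of the keywords.
const result = mastersentences.toLowerCase().split(/\W+/).some(w => keywordsentencesArray.includes(w))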

console.log(result)
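
The question also mentions that the expected result is the matching word itself ("jakarta"), not just a boolean. A minimal sketch of how the same split-on-\W idea could return the shared words, assuming a hypothetical helper named `commonWords` (not part of the answer above):

```javascript
// Hypothetical helper: returns the distinct lower-cased words that occur
// in both strings, in the order they appear in `keywords`.
function commonWords(master, keywords) {
  // Lower-case, split on runs of non-word characters, drop empty strings.
  const toWords = text => text.toLowerCase().split(/\W+/).filter(Boolean);
  const masterWords = new Set(toWords(master));
  return [...new Set(toWords(keywords))].filter(word => masterWords.has(word));
}

console.log(commonWords(
  "Gambir, Kecamatan Gambir, Kota Jakarta Pusat, Daerah Khusus Ibukota Jakarta",
  "DKI JAKARTA"
)); // [ 'jakarta' ]
```

Note that \W treats only ASCII letters, digits, and the underscore as word characters, so words containing accented or other non-ASCII letters would be split apart by this approach.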

CodePudding user response:

I assume you want the match to yield true if any keyword is found in the master sentence.

This is done by splitting the keywords string into an array, so that we can iterate over all keywords. We then try to find any keyword in the sentence; in my case the search is case-insensitive, meaning it does not matter whether a keyword is capitalized or written in all capital letters.

If any keyword is found in the master sentence, we return the boolean value true, and false otherwise.

// Our sentence to match the Keywords against
let mastersentences = "Gambir, Kecamatan Gambir, Kota Jakarta Pusat, Daerah Khusus Ibukota Jakarta";

// Our Keywords to match in a sentence
let keywordsentences = "DKI JAKARTA";


function matchKeyword(sentence, keywords){
  // It is easier to match the keywords against a sentence if they are stored in an array
  if(typeof keywords === "string") keywords = keywords.split(" ");
  
  // We transform the sentence to all lowercase, so we can match keywords character-wise
  // If you don't want the search to be case-insensitive, remove the '.toLowerCase()' call
  // here and the one applied to the keywords further down
  sentence = sentence.toLowerCase();
  
  // We check for every keyword whether it occurs anywhere in the sentence.
  // Array.prototype.find() returns the first keyword that exists in the sentence
  // With the double exclamation mark we cast the result to a boolean value, so we get true if any
  // keyword was found and false if none was found in the sentence
  return !!keywords.find(keyword => sentence.indexOf(keyword.toLowerCase()) >= 0);
}


// We try matching the Keywords against the sentence
console.log(matchKeyword(mastersentences, keywordsentences))
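
Keep in mind that a substring search via indexOf also matches inside longer words. The sketch below contrasts the two behaviours; the function names `containsSubstring` and `containsWord` are hypothetical and only serve as an illustration.

```javascript
// Substring check: true whenever the keyword appears anywhere,
// even inside a longer word.
function containsSubstring(sentence, keyword) {
  return sentence.toLowerCase().indexOf(keyword.toLowerCase()) >= 0;
}

// Whole-word check: compares the keyword against the sentence split into words.
function containsWord(sentence, keyword) {
  return sentence.toLowerCase().split(/\W+/).includes(keyword.toLowerCase());
}

const sample = "Kota Jakarta Pusat";
console.log(containsSubstring(sample, "art")); // true ("art" is inside "Jakarta")
console.log(containsWord(sample, "art"));      // false
console.log(containsWord(sample, "JAKARTA"));  // true
```

Which behaviour is appropriate depends on whether partial matches should count as a hit.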

CodePudding user response:

Using filter and some:

let master = "Gambir, Kecamatan Gambir, Kota Jakarta Pusat, Daerah Khusus Ibukota Jakarta".split(/,\s+/);
let keywords = "DKI JAKARTA".split(/\s+/);

console.log(
  master.filter(sentence => sentence.split(/\s+/)
    .some(word => keywords.includes(word.toUpperCase()))
  )
  .length>0
)
