I received a challenge as part of an interview test. The first requirement is to build a search engine that sources its data from a txt file; every time the user enters a word, it returns the matching results.
The second requirement is:
Given a single word x, update the search corpus with x. The new word x should immediately be queryable.
The third requirement is:
Given a single word y, remove the most similar word to y in the corpus from further search results.
I have never created a search engine before.
How can I create it with Node.js, and what do the second and third requirements mean?
Thanks!
CodePudding user response:
Look into Elasticsearch; it is a good fit for this kind of free-text search. It uses Lucene underneath, which powers these searches with a structure called an inverted index (read up on this structure to see how it makes free-text queries efficient).
It supports free-text queries and fuzzy matching (point 3 in your requirements).
Load the data into Elasticsearch and write endpoints in your Node.js application layer that can:
- query elasticsearch for text match on text fields
- perform fuzzy search
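To make the inverted index idea concrete, here is a minimal in-memory sketch in plain JavaScript. It is not how Elasticsearch or Lucene is implemented; it only shows the token-to-document mapping they build on. The function names (tokenize, buildIndex, search) and the sample lines are hypothetical, and each line of the txt file is assumed to be one document.

```javascript
// Minimal in-memory inverted index: maps each token to the set of line
// numbers that contain it. Illustration only, not Elasticsearch or Lucene.

function tokenize(text) {
  // Lowercase and split on anything that is not an ASCII letter or digit.
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

function buildIndex(lines) {
  const index = new Map();
  lines.forEach((line, lineNo) => {
    for (const token of tokenize(line)) {
      if (!index.has(token)) index.set(token, new Set());
      index.get(token).add(lineNo);
    }
  });
  return index;
}

function search(index, lines, word) {
  // Look up the first token of the query and return the lines that contain it.
  const [token] = tokenize(word);
  const hits = token ? index.get(token) : undefined;
  return hits ? [...hits].map((lineNo) => lines[lineNo]) : [];
}

// Hypothetical sample data standing in for the contents of the txt file
// (in Node.js the lines could come from fs.readFileSync(path, "utf8").split("\n")).
const sampleLines = ["The quick brown fox", "A lazy dog", "Quick thinking"];
const sampleIndex = buildIndex(sampleLines);
console.log(search(sampleIndex, sampleLines, "quick"));
```

A lookup is a single Map access instead of a scan over every line, which is why this structure is used for free-text queries.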
CodePudding user response:
There are lots of examples on this that you can easily read up on: https://www.google.com/search?q=build a search engine with node.js
2 = Update the search index (the data that can be queried) with the text x, so that x can be found by the very next query.
3 = Find the word in your search index that is closest to the text y and remove that word from the index, so that it no longer appears in later queries.
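As a rough sketch of points 2 and 3, assuming the corpus is a plain set of words and that "most similar" means smallest Levenshtein (edit) distance; the test may intend a different similarity measure. The WordCorpus class, its method names, and the substring-based search are all hypothetical choices made for illustration.

```javascript
// Classic dynamic-programming Levenshtein (edit) distance between two strings.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical in-memory corpus; words are stored lowercased in a Set.
class WordCorpus {
  constructor(words = []) {
    this.words = new Set(words.map((w) => w.toLowerCase()));
  }

  // Assumption: a "result" is any stored word containing the query as a substring.
  search(query) {
    const q = query.toLowerCase();
    return [...this.words].filter((w) => w.includes(q));
  }

  // Requirement 2: the new word is queryable as soon as this returns.
  add(word) {
    this.words.add(word.toLowerCase());
  }

  // Requirement 3: delete the stored word with the smallest edit distance to y.
  // Ties go to the word that was inserted first. Returns the removed word or null.
  removeMostSimilar(y) {
    const target = y.toLowerCase();
    let best = null;
    let bestDistance = Infinity;
    for (const w of this.words) {
      const d = levenshtein(target, w);
      if (d < bestDistance) {
        best = w;
        bestDistance = d;
      }
    }
    if (best !== null) this.words.delete(best);
    return best;
  }
}

const corpus = new WordCorpus(["apple", "banana", "grape"]);
corpus.add("orange");
console.log(corpus.search("orange"));
const removed = corpus.removeMostSimilar("appel");
console.log(removed, corpus.search("apple"));
```

Scanning every word is fine for a small txt file; for a large corpus a dedicated structure (or the fuzzy search built into a search engine) would be a better fit.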
Note: Corpus definition -> https://en.wikipedia.org/wiki/Text_corpus