Compare two string values, one of them being a tesseract output, the other a .txt file

Time:03-27

I have a program that uses Tesseract to read the text in a screenshot taken from the computer. I also have a text file containing "F1 car Bahrain".

try
{
    var path = @"C:\source\repos\TEst1\packages\Tesseract.4.1.1";
    string LOG_PATH = "C:\\Desktop\\start.txt";

    var sourceFilePath = @"C:\source\repos\Scrren\Scrren\bin\Debug\TestImage.png";
    using (var engine = new TesseractEngine(path, "eng"))
    {
        using (var img = Pix.LoadFromFile(sourceFilePath))
        {
            using (var page = engine.Process(img))
            {
                var results = page.GetText();

                string WordsFrom = File.ReadAllText(LOG_PATH);
                string WordsFromList = WordsFrom.ToLower();

                string ScreenResult = results.ToLower();
                string Match = ScreenResult;

                bool C = Match.Contains(WordsFromList);
                if (C)
                {
                    Console.WriteLine("Match");
                }
                else
                {
                    Console.WriteLine("No Match");
                }    
            }
        }
    }
}
catch (Exception e)
{
    // Print the exception so failures are not silently swallowed.
    Console.WriteLine(e);
    Thread.Sleep(1500);
}

The lowercased Tesseract output is:

"1 day ago cce sc ume f1 bahrain grand prix ~ start time, how ake nos video nea cea a] 8 reasons 2021 will go down in f1"

Obviously Tesseract isn't perfect, so some of it is gibberish, but the words f1 and bahrain are both in there, so I don't understand why bool C is never true. I am completely stumped and would greatly appreciate any help.

Printing the string "WordsFromList" to the console shows that it correctly contains both f1 and bahrain as well.

Answer:

See the comments in the code below:

using System;
using System.Linq; // Needed for the All() and Any() extension methods.

string searchFor = "F1 car Bahrain";
string searchIn = "1 day ago cce sc ume f1 bahrain grand prix ~ start time, how ake nos video nea cea a] 8 reasons 2021 will go down in f1";

// Returns false because there is no exact match for string "F1 car Bahrain" in the searchIn string.
Console.WriteLine($"Does {searchIn} contain {searchFor} => {searchIn.Contains(searchFor)}");

var words = searchFor.Split(' '); // Result is a string[] with 3 words ("F1", "car", "Bahrain").

// Note: the Contains(string, StringComparison) overload requires .NET Core 2.1 or later.
// On .NET Framework, use searchIn.IndexOf(word, StringComparison.InvariantCultureIgnoreCase) >= 0 instead.

// Returns false because 'car' is not in the input string. The 'All()' extension method only returns true if every word is matched.
Console.WriteLine($"Does {searchIn} contain {searchFor} => {words.All(word => searchIn.Contains(word, StringComparison.InvariantCultureIgnoreCase))}");
// Returns true because 'F1' and 'bahrain' are found in the input string. The 'Any()' extension method returns true if at least one word matches.
Console.WriteLine($"Does {searchIn} contain {searchFor} => {words.Any(word => searchIn.Contains(word, StringComparison.InvariantCultureIgnoreCase))}");

Console.ReadKey();
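
To apply the same idea to the code in the question, the word-by-word check could be wrapped in a small helper. The following is only a sketch: the KeywordMatcher class and its method names are hypothetical and not part of Tesseract or the original post. It assumes the text file may end with a newline (so it splits on all common whitespace characters, not just spaces) and uses IndexOf so it also compiles on .NET Framework.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not from the original post): checks the words of a
// phrase against OCR text one by one, ignoring case.
public static class KeywordMatcher
{
    private static readonly char[] Whitespace = { ' ', '\t', '\r', '\n' };

    // Split on spaces, tabs and line breaks so that a trailing newline read
    // from the .txt file does not become part of the last word.
    public static string[] SplitWords(string phrase) =>
        phrase.Split(Whitespace, StringSplitOptions.RemoveEmptyEntries);

    // Case-insensitive substring test that works on .NET Framework and .NET Core.
    private static bool ContainsWord(string text, string word) =>
        text.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0;

    // True only if every word of the phrase occurs somewhere in the text.
    public static bool ContainsAll(string text, string phrase) =>
        SplitWords(phrase).All(word => ContainsWord(text, word));

    // True if at least one word of the phrase occurs somewhere in the text.
    public static bool ContainsAny(string text, string phrase) =>
        SplitWords(phrase).Any(word => ContainsWord(text, word));
}
```

In the question's code, `bool C = Match.Contains(WordsFromList);` could then become `bool C = KeywordMatcher.ContainsAny(results, File.ReadAllText(LOG_PATH));`. With the sample output above, ContainsAny is true while ContainsAll is false, because "car" never appears in the OCR text. Note that this is a substring test, so a word such as "car" would also match inside "card".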