C# - Tesseract OCR: scan multiple languages at once

Time:11-28

Any idea how to scan multiple languages at once with Tesseract?

TesseractEngine engine = new TesseractEngine("./tessdata", "eng", EngineMode.Default);

Usually, for one language, passing its abbreviation is enough. But what if I want to scan an image that contains multiple languages? By the way, I use the Tesseract package by Charles Weld. Thanks.

CodePudding user response:

According to here, the syntax is supported, so you just need to join the language codes with a "+" sign, like the following:

TesseractEngine engine = new TesseractEngine("./tessdata", "jpn+eng", EngineMode.Default); // jpn+eng for Japanese and English

Also, according to here:

The output can be different based on the order of languages, so -l eng+hin can give different result than -l hin+eng.

From what I can see, the language you specify first has better accuracy.
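
To see how much the order matters for your own images, you can run the same image through both orderings and compare the results. The following is only a minimal sketch: it assumes the Tesseract NuGet package (Charles Weld's wrapper) is installed, that ./tessdata contains jpn.traineddata and eng.traineddata, and that ./sample.png is a placeholder for your own image. The Recognize helper is a made-up name for illustration.

```csharp
using System;
using Tesseract; // NuGet package "Tesseract" (Charles Weld's wrapper)

class MultiLanguageOcr
{
    // Hypothetical helper: runs OCR on one image with the given language
    // string and returns the recognized text plus the mean confidence.
    static (string Text, float Confidence) Recognize(
        string tessdataPath, string languages, string imagePath)
    {
        using (var engine = new TesseractEngine(tessdataPath, languages, EngineMode.Default))
        using (var image = Pix.LoadFromFile(imagePath))
        using (var page = engine.Process(image))
        {
            return (page.GetText(), page.GetMeanConfidence());
        }
    }

    static void Main()
    {
        // Assumption: this folder holds jpn.traineddata and eng.traineddata.
        const string tessdataPath = "./tessdata";
        // Assumption: placeholder path, replace with your own image.
        const string imagePath = "./sample.png";

        // Same languages, both orderings, joined with "+".
        foreach (var languages in new[] { "jpn+eng", "eng+jpn" })
        {
            var (text, confidence) = Recognize(tessdataPath, languages, imagePath);
            Console.WriteLine($"--- {languages} (mean confidence {confidence:F2}) ---");
            Console.WriteLine(text);
        }
    }
}
```

If one ordering gives clearly better text for your documents, put that language first. A new engine is created per language string here to keep the sketch simple; in real code you would normally create the engine once and reuse it.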
