The problem now: for an image generated from a single Chinese character, generating the box file with this CMD command, "tesseract chi_sim.[Song typeface].jpg chi_sim.[Song typeface] -l chi_sim batch.nochop makebox", reports "Empty page".
I don't know the reason. I searched a lot online and found nothing specific. Looking at the contents of a box file generated in other cases, each line is essentially "character X Y width height".
So instead of generating the box file, I created a .box file by hand and wrote the content myself, which got around the problem. The next step is the important one: using the CMD command "tesseract chi_sim.[Song typeface].tif chi_sim.[Song typeface] nobatch box.train" to generate the tr file. This step also needs the image generated from the Chinese character, so the earlier problem is back: generating the tr file reports "Empty page" as well.
Sometimes generating the tr file also reports an error.
I then wanted to write the tr file by hand, as I did with the box file, but I don't understand the contents of a tr file, so I can't write it manually. I haven't done OCR before and I'm stuck here. Has anyone run into this problem? I keep suspecting the generated image itself, because an image of a single Chinese character is only 1-2 KB. I really don't understand it and would be very grateful for any help.
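For anyone trying the same hand-written box file workaround, here is a minimal Java sketch of it. It assumes the usual Tesseract box layout of one line per glyph, "symbol left bottom right top page", with pixel coordinates measured from the bottom-left corner of the image. The class name, the file name sample.box, the 48x48 image size, and the assumption that the glyph fills the whole image are all hypothetical and only for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class SingleCharBox {
    // One box-file line: <symbol> <left> <bottom> <right> <top> <page>.
    // Coordinates are pixels with the origin at the bottom-left corner.
    static String boxLine(String symbol, int left, int bottom, int right, int top, int page) {
        return symbol + " " + left + " " + bottom + " " + right + " " + top + " " + page;
    }

    // Assumption: the single glyph fills the image, so the box spans all of it.
    static String wholeImageBox(String symbol, int width, int height) {
        return boxLine(symbol, 0, 0, width, height, 0);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical file name and image size.
        Path box = Paths.get("sample.box");
        String line = wholeImageBox("宋", 48, 48);
        // Box files are plain UTF-8 text, one glyph per line.
        Files.write(box, (line + "\n").getBytes(StandardCharsets.UTF_8));
        System.out.println(line);
    }
}
```

A tighter box around the actual strokes would probably train better than a whole-image box; the whole-image version is just the simplest thing that can be written without measuring the glyph.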
CodePudding user response:
Your sample images should have a white background and black text.
CodePudding user response:
OP, are you still there? I have the same problem with the RMB character: I want to train it on its own, and I always get "Empty page".
CodePudding user response:
Come on! It has nothing to do with a white background and black text; that doesn't solve anything. Either way, a single letter or a single digit can't be recognized.
CodePudding user response:
Solved. Generate the box file with the command: tesseract xx.tif xx --psm 10 batch.nochop makebox. The key is the psm parameter (page segmentation mode); 10 treats the image as a single character.
For recognition from Java code: instance.setPageSegMode(TessPageSegMode.PSM_SINGLE_CHAR); (for the details, see the TessPageSegMode interface in the source)
Hope this helps anyone who reads this later.
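As a rough sketch of the command above, the following Java snippet assembles the same argument list so it could be launched with ProcessBuilder. The class and method names are hypothetical, and "xx.tif" / "xx" are just the placeholders from the command. One assumption worth flagging: Tesseract 4 and later spell the option "--psm", while older 3.x releases used a single dash, "-psm", so the sketch takes a switch for that. If the option is not recognized by the installed version, the number after it may end up being treated as a config file name.

```java
import java.util.ArrayList;
import java.util.List;

final class MakeBoxCommand {
    // Builds the tesseract argument list for box generation.
    // legacyFlag selects the single-dash "-psm" spelling used by older 3.x releases.
    static List<String> build(String image, String outputBase, int psm, boolean legacyFlag) {
        List<String> cmd = new ArrayList<>();
        cmd.add("tesseract");
        cmd.add(image);
        cmd.add(outputBase);
        cmd.add(legacyFlag ? "-psm" : "--psm");
        cmd.add(Integer.toString(psm));
        cmd.add("batch.nochop");
        cmd.add("makebox");
        return cmd;
    }

    public static void main(String[] args) {
        // Page segmentation mode 10 means "treat the image as a single character".
        List<String> cmd = build("xx.tif", "xx", 10, false);
        System.out.println(String.join(" ", cmd));
        // To actually run it (requires tesseract on PATH):
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Running `tesseract --version` first shows which spelling of the option applies to the installed release.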
CodePudding user response:
Why do I get the error "read_params_file: Can't open 10"? I could cry.