Help with preparing an image for Tesseract OCR
Posted: Wed Aug 28, 2024 4:16 pm
I don't know how to improve the quality of an image so that Tesseract OCR can read it reliably. Any advice?
loulou2522 wrote: Wed Aug 28, 2024 4:16 pm
"I don't know how to improve the quality of an image in order to prepare an efficient Tesseract OCR ?"
Try a different image size; Tesseract prefers high-resolution scans. If your input is a screenshot, it is probably too low-res.
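To make the "try a different size" advice concrete, here is a minimal sketch that upscales an image with PureBasic's built-in image library before handing it to Tesseract. The file names "input.png" and "input_2x.png" and the 2x factor are placeholders I made up, not values from this thread. Upscaling cannot restore detail that was never captured, so treat it as something to experiment with rather than a guaranteed fix.

```purebasic
; Sketch: upscale a low-resolution PNG before OCR.
; Assumptions: "input.png" exists in the current directory; the 2x
; factor and both file names are hypothetical, adjust them to your data.
UsePNGImageDecoder()
UsePNGImageEncoder()

#Scale = 2

If LoadImage(0, "input.png")
  ; #PB_Image_Smooth resizes with interpolation (the default mode)
  ResizeImage(0, ImageWidth(0) * #Scale, ImageHeight(0) * #Scale, #PB_Image_Smooth)
  If SaveImage(0, "input_2x.png", #PB_ImagePlugin_PNG) = 0
    Debug "Could not save input_2x.png"
  EndIf
  FreeImage(0)
Else
  Debug "Could not load input.png"
EndIf
```

The resized file can then be passed to tesseract.exe in place of the original image. If the source is a PDF, rendering it at a higher resolution in the first place is usually preferable to resizing afterwards.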
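Code:
; Render pages firstpage..lastpage of the PDF to grayscale PNGs at 500 dpi; output names start with "bil" (firstpage, lastpage and file are strings set elsewhere)
RunProgram("cmd.exe", "/C " + Chr(34) + "pdftopng -f " + firstpage + " -l " + lastpage + " -gray -r 500 " + file + " bil" + Chr(34), "", #PB_Program_Wait | #PB_Program_Hide)
Code:
; OCR one page image into a searchable PDF (essai.pdf); Tesseract variables such as preserve_interword_spaces must be set with -c
RunProgram("cmd.exe", "/C " + Chr(34) + "tesseract.exe BIL-000003.png essai -l fra --psm 12 -c preserve_interword_spaces=1 --dpi 500 pdf" + Chr(34), "", #PB_Program_Wait | #PB_Program_Hide)
Code:
; Extract the text layer of essai.pdf as UTF-8, keeping the table layout
RunProgram("cmd.exe", "/C " + Chr(34) + "pdftotext -f 1 -l 1 -marginl 60 -enc UTF-8 -nopgbrk -table essai.pdf bilanactifscan.txt" + Chr(34), "", #PB_Program_Wait | #PB_Program_Hide)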