help with preparing an image for Tesseract OCR
-
loulou2522
- Enthusiast

- Posts: 553
- Joined: Tue Oct 14, 2014 12:09 pm
help with preparing an image for Tesseract OCR
I don't know how to improve the quality of an image so that Tesseract OCR gives good results. What preparation is recommended?
-
DarkDragon
- Addict

- Posts: 2347
- Joined: Mon Jun 02, 2003 9:16 am
- Location: Germany
Re: help with preparing an image for Tesseract OCR
loulou2522 wrote: Wed Aug 28, 2024 4:16 pm
I don't know how to improve the quality of an image so that Tesseract OCR gives good results.

Try a different size; Tesseract prefers high-resolution scans. If your input is a screenshot, it is usually too low-resolution.
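If the input really is a low-resolution image, enlarging it before OCR is one thing to try. A minimal PureBasic sketch, assuming a file named input.png and a scale factor of 2 (both are placeholders chosen for illustration, not values from this thread):

```purebasic
; Sketch only: enlarge a PNG before handing it to Tesseract.
; "input.png", "input_scaled.png" and #Scale are assumed placeholders.
UsePNGImageDecoder()
UsePNGImageEncoder()

#Scale = 2

If LoadImage(0, "input.png")
  ; Smooth resizing interpolates pixels, which usually keeps glyph edges cleaner
  ResizeImage(0, ImageWidth(0) * #Scale, ImageHeight(0) * #Scale, #PB_Image_Smooth)
  If SaveImage(0, "input_scaled.png", #PB_ImagePlugin_PNG)
    Debug "Saved input_scaled.png"
  Else
    Debug "Could not save the scaled image"
  EndIf
  FreeImage(0)
Else
  Debug "Could not load input.png"
EndIf
```

The enlarged file would then be passed to tesseract in place of the original. Upscaling cannot restore detail that was never captured, so rescanning or re-rendering at a higher resolution is preferable when that is possible.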
bye,
Daniel
-
loulou2522
Re: help with preparing an image for Tesseract OCR
Actually, no: my input comes from a PDF. I first convert it to PNG with pdftopng and then submit the resulting image to Tesseract. The whole process has three phases:

First phase (render the PDF pages to grayscale PNG at 500 dpi):
Code:
RunProgram("cmd.exe", "/C " + Chr(34) + "pdftopng -f " + firstpage + " -l " + lastpage + " -gray -r 500 " + file + " bil" + Chr(34), "", #PB_Program_Wait | #PB_Program_Hide)

Second phase (OCR the PNG into a searchable PDF):
Code:
; Tesseract config variables are set with "-c name=value"; options go before the output format ("pdf")
RunProgram("cmd.exe", "/C " + Chr(34) + "tesseract.exe BIL-000003.png essai -l fra --psm 12 --dpi 500 -c preserve_interword_spaces=1 pdf" + Chr(34), "", #PB_Program_Wait | #PB_Program_Hide)

Third phase (extract the text from the searchable PDF):
Code:
RunProgram("cmd.exe", "/C " + Chr(34) + "pdftotext -f 1 -l 1 -marginl 60 -enc UTF-8 -nopgbrk -table essai.pdf bilanactifscan.txt" + Chr(34), "", #PB_Program_Wait | #PB_Program_Hide)
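The three phases can also be run without going through cmd.exe, which avoids the nested quoting and makes it possible to check each tool's exit code. A minimal sketch, assuming the three executables are on the PATH. RunTool, the page numbers and input.pdf are hypothetical names and values used only for illustration:

```purebasic
; Hypothetical helper: starts a tool, waits for it to finish and returns
; its exit code (-1 if the program could not be started).
Procedure.i RunTool(Exe.s, Args.s)
  Protected prog.i, exitCode.i = -1
  prog = RunProgram(Exe, Args, "", #PB_Program_Open | #PB_Program_Hide)
  If prog
    WaitProgram(prog)
    exitCode = ProgramExitCode(prog)
    CloseProgram(prog)
  EndIf
  ProcedureReturn exitCode
EndProcedure

; Assumed inputs, for illustration only
Define firstpage.s = "3", lastpage.s = "3"
Define file.s = "input.pdf"

; Stop at the first phase that fails, so a missing PNG or PDF is noticed early
If RunTool("pdftopng.exe", "-f " + firstpage + " -l " + lastpage + " -gray -r 500 " + Chr(34) + file + Chr(34) + " bil") <> 0
  Debug "pdftopng failed"
ElseIf RunTool("tesseract.exe", "bil-000003.png essai -l fra --psm 12 --dpi 500 -c preserve_interword_spaces=1 pdf") <> 0
  Debug "tesseract failed"
ElseIf RunTool("pdftotext.exe", "-f 1 -l 1 -marginl 60 -enc UTF-8 -nopgbrk -table essai.pdf bilanactifscan.txt") <> 0
  Debug "pdftotext failed"
Else
  Debug "All three phases finished"
EndIf
```

Checking the exit code after each phase shows which tool is responsible when the final text file is empty or missing.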