Tesseract — text recognition engine

Lunasole
Addict
Posts: 1091
Joined: Mon Oct 26, 2015 2:55 am
Location: UA

Tesseract — text recognition engine

Post by Lunasole »

Hi.
Tesseract is a powerful open-source text recognition engine. It uses neural networks and is already trained for many fonts, etc.

It is used, for example, in the OpenCV framework [presented here on the forum by JHPJHP]. But there are cases when you don't want to drag the whole bloated OpenCV along; then you can take the whole bloated Tesseract instead :3 (it requires at least ~20 MB of trained data for the neural networks... nothing to be done about that).

The following bindings come from code I used in a project, so they are far from complete: they cover maybe 10% of the API and just do "what is needed". That should be enough for everyone ^_^

Binaries are included as well (Windows x86 .dll plus the .lib for it), along with the other required files and a simple hello-world example.
Have fun; for more details, check the readme file.

You can download it here: http://geocities.ws/lunasole/
"W̷i̷s̷h̷i̷n̷g o̷n a s̷t̷a̷r"
AAT
Enthusiast
Posts: 259
Joined: Sun Jun 15, 2008 3:13 am
Location: Russia

Re: Tesseract — text recognition engine

Post by AAT »

Hi, Lunasole!
Great example, thanks!

One question: why are the words glued together after recognition?
For example, the recognition result for
Image
is 4539087213453.

Re: Tesseract — text recognition engine

Post by Lunasole »

AAT wrote: One question: why are the words glued together after recognition?
For example, the recognition result for
Image
is 4539087213453.
That should depend on the page segmentation mode; it can also happen because the space character is filtered out in the config file.

So first try changing the first parameter of the TesseractInit procedure to some other #PSM_ constant.
Also edit the list of allowed characters in the config file, or just disable it (by setting the last parameter of the TessBaseAPIInit1 call to 0, or by leaving the config file empty).
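
As a rough illustration of both suggestions, here is a minimal sketch of an init routine that loads no config file at all and takes the page segmentation mode as a parameter. It is not the code from the download: the import library name "libtesseract.lib" is a placeholder, the declarations are hand-written from Tesseract's capi.h, and the procedure name TesseractInitNoFilter is made up for this example. It uses TessBaseAPIInit3, which takes no config list.

```purebasic
; Sketch only: declarations hand-written from Tesseract's capi.h.
; "libtesseract.lib" is a placeholder for the import library you actually link against.
ImportC "libtesseract.lib"
  TessBaseAPICreate.i()
  TessBaseAPIInit3.l(*handle, datapath.p-ascii, language.p-ascii) ; returns 0 on success
  TessBaseAPISetPageSegMode(*handle, mode.l)
  TessBaseAPIDelete(*handle)
EndImport

; Values follow the order of the TessPageSegMode enum
#PSM_AUTO        = 3
#PSM_SINGLE_LINE = 7

; Hypothetical helper: init without any config file, so no character filter is loaded
Procedure.i TesseractInitNoFilter(DataPath$, PageSegMode = #PSM_AUTO)
  Protected *hAPI = TessBaseAPICreate()
  If *hAPI = 0
    ProcedureReturn 0
  EndIf
  If TessBaseAPIInit3(*hAPI, DataPath$, "eng") <> 0
    TessBaseAPIDelete(*hAPI)
    ProcedureReturn 0
  EndIf
  TessBaseAPISetPageSegMode(*hAPI, PageSegMode)
  ProcedureReturn *hAPI
EndProcedure
```

Which #PSM_ value separates the words best depends on the picture, so it is worth trying a few of them.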

There are also engine functions that let you select areas of the image and recognize them separately, but I didn't import them (and there are more functions for working with words, etc.).


I used this stuff to recognize numbers as a single word only (one per file), so it was configured for that, and later I just uploaded the code "as is", removing only the project-specific stuff like image preprocessing and additional filtering. So in any case you need to configure such basic things depending on what you need.
"W̷i̷s̷h̷i̷n̷g o̷n a s̷t̷a̷r"
Kwai chang caine
Always Here
Posts: 5494
Joined: Sun Nov 05, 2006 11:42 pm
Location: Lyon - France

Re: Tesseract — text recognition engine

Post by Kwai chang caine »

Thanks Lunasole for sharing 8)
That works great :D

I tried another picture and, strangely, I always get only digits in return. Is that normal? :shock:

Image
The text on random_test.png is: 00111111211115112101501121'011 11'5 511011112 001111515551102 0511512 0011151112 0215
012511011 02 51125 11120 '2

1.5 5011111011 15 01115 51111012 21150102 2510215112 50021 5 11112 5921102 02
012511011 02 5112 11120, 11112 5921102 02 00111111111110511011, 1111 11102021105111,

210... 1725111112 5011111011 1115012 111515 0111110115 00111215 1125 011211

1311015 00111111211115112101501121'011 115 055 02 001111515551102 2105510211102
0110921 '2

1111211121 1290192 02 50111110115 2102 521111025 0111 0100052111 02 01221 025 51125
11120 511110121112111219151111121112111, 02 0111 251 11112 211021121112 5112111511112 51111
01001211125 0'51921115 210011115155511025.

13111101110'11111,12 5011115112 11151151021 211 051110111121 511112 52111102 11112011002 0111 5
511112 111011 51121111011 051121511011'112511115111121111125 51111012 0'1111115511011. 02
52111102 129101102 0215 01115 02 10 11111110115 02 5112511112111210122 2111251
01500111012 211 22 151191125.

00111111211105 111510112 '2

1.5 02111510112 25151111012, 05115 1111 0121111211211105, 12110222110115111512

5111 1111111110201100211 21 11150111122410115.
The happiness is a road...
Not a destination
Justin
Addict
Posts: 948
Joined: Sat Apr 26, 2003 2:49 pm

Re: Tesseract — text recognition engine

Post by Justin »

Hi Lunasole and others, I'm working on a C header converter which is part of another project. It still needs some work on the type conversion, but it works pretty well for structs, imports and enums.
I tested it with Tesseract's capi.h and this was the result; maybe it is useful for you. Some types will need to be checked, but it also outputs the C source, so that will be easier.
I'll post the converter when it is finished.

Code: Select all

;File generated by PB Header Converter version: 1.0
;2016-07-25 12:07:53

;-> FUNCS
ImportC ""
	;TESS_API void  TESS_CALL TessDeleteText(char* text);
	;- TessDeleteText
	;NOTE: text is a buffer allocated by Tesseract, so it is passed as a raw pointer, not as a PB string
	TessDeleteText.i(text.i)

	;TESS_API void  TESS_CALL TessDeleteTextArray(char** arr);
	;- TessDeleteTextArray
	TessDeleteTextArray.i(arr.i)

	;TESS_API void  TESS_CALL TessDeleteIntArray(int* arr);
	;- TessDeleteIntArray
	TessDeleteIntArray.i(arr.i)

	;TESS_API void  TESS_CALL TessDeleteBlockList(BLOCK_LIST* block_list);
	;- TessDeleteBlockList
	TessDeleteBlockList.i(block_list.i)

	;TESS_API void  TESS_CALL TessBaseAPIDelete(TessBaseAPI* handle);
	;- TessBaseAPIDelete
	TessBaseAPIDelete.i(handle.i)

	;TESS_API void  TESS_CALL TessBaseAPISetOutputName(TessBaseAPI* handle, const char* name);
	;- TessBaseAPISetOutputName
	TessBaseAPISetOutputName.i(handle.i, name.p-ascii)

	;TESS_API BOOL  TESS_CALL TessBaseAPISetVariable(TessBaseAPI* handle, const char* name, const char* value);
	;- TessBaseAPISetVariable
	TessBaseAPISetVariable.i(handle.i, name.p-ascii, value.p-ascii)

	;TESS_API BOOL  TESS_CALL TessBaseAPISetDebugVariable(TessBaseAPI* handle, const char* name, const char* value);
	;- TessBaseAPISetDebugVariable
	TessBaseAPISetDebugVariable.i(handle.i, name.p-ascii, value.p-ascii)

	;TESS_API BOOL  TESS_CALL TessBaseAPIGetDoubleVariable(const TessBaseAPI* handle, const char* name, double* value);
	;- TessBaseAPIGetDoubleVariable
	TessBaseAPIGetDoubleVariable.i(handle.i, name.p-ascii, value.i)

	;TESS_API const char*
	;               TESS_CALL TessBaseAPIGetStringVariable(const TessBaseAPI* handle, const char* name);
	;- TessBaseAPIGetStringVariable
	TessBaseAPIGetStringVariable.i(handle.i, name.p-ascii)

	;TESS_API BOOL  TESS_CALL TessBaseAPIPrintVariablesToFile(const TessBaseAPI* handle, const char* filename);
	;- TessBaseAPIPrintVariablesToFile
	TessBaseAPIPrintVariablesToFile.i(handle.i, filename.p-ascii)

	;TESS_API BOOL  TESS_CALL TessBaseAPIGetVariableAsString(TessBaseAPI* handle, const char* name, STRING* val);
	;- TessBaseAPIGetVariableAsString
	TessBaseAPIGetVariableAsString.i(handle.i, name.p-ascii, val.i)

	;TESS_API int   TESS_CALL TessBaseAPIInit(TessBaseAPI* handle, const char* datapath, const char* language,
	;                                         TessOcrEngineMode mode, char** configs, int configs_size,
	;                                         const STRING* vars_vec, size_t vars_vec_size,
	;                                         const STRING* vars_values, size_t vars_values_size, BOOL set_only_init_params);
	;- TessBaseAPIInit
	TessBaseAPIInit.i(handle.i, datapath.p-ascii, language.p-ascii, mode.i, configs.i, configs_size.i, vars_vec.i, vars_vec_size.i, vars_values.i, vars_values_size.i, set_only_init_params.i)

	;TESS_API int   TESS_CALL TessBaseAPIInit1(TessBaseAPI* handle, const char* datapath, const char* language, TessOcrEngineMode oem,
	;                                          char** configs, int configs_size);
	;- TessBaseAPIInit1
	TessBaseAPIInit1.i(handle.i, datapath.p-ascii, language.p-ascii, oem.i, configs.i, configs_size.i)

	;TESS_API int   TESS_CALL TessBaseAPIInit2(TessBaseAPI* handle, const char* datapath, const char* language, TessOcrEngineMode oem);
	;- TessBaseAPIInit2
	TessBaseAPIInit2.i(handle.i, datapath.p-ascii, language.p-ascii, oem.i)

	;TESS_API int   TESS_CALL TessBaseAPIInit3(TessBaseAPI* handle, const char* datapath, const char* language);
	;- TessBaseAPIInit3
	TessBaseAPIInit3.i(handle.i, datapath.p-ascii, language.p-ascii)

	;TESS_API const char*
	;               TESS_CALL TessBaseAPIGetInitLanguagesAsString(const TessBaseAPI* handle);
	;- TessBaseAPIGetInitLanguagesAsString
	TessBaseAPIGetInitLanguagesAsString.i(handle.i)

	;TESS_API char**
	;               TESS_CALL TessBaseAPIGetLoadedLanguagesAsVector(const TessBaseAPI* handle);
	;- TessBaseAPIGetLoadedLanguagesAsVector
	TessBaseAPIGetLoadedLanguagesAsVector.i(handle.i)

	;TESS_API char**
	;               TESS_CALL TessBaseAPIGetAvailableLanguagesAsVector(const TessBaseAPI* handle);
	;- TessBaseAPIGetAvailableLanguagesAsVector
	TessBaseAPIGetAvailableLanguagesAsVector.i(handle.i)

	;TESS_API int   TESS_CALL TessBaseAPIInitLangMod(TessBaseAPI* handle, const char* datapath, const char* language);
	;- TessBaseAPIInitLangMod
	TessBaseAPIInitLangMod.i(handle.i, datapath.p-ascii, language.p-ascii)

	;TESS_API void  TESS_CALL TessBaseAPIInitForAnalysePage(TessBaseAPI* handle);
	;- TessBaseAPIInitForAnalysePage
	TessBaseAPIInitForAnalysePage.i(handle.i)

	;TESS_API void  TESS_CALL TessBaseAPIReadConfigFile(TessBaseAPI* handle, const char* filename);
	;- TessBaseAPIReadConfigFile
	TessBaseAPIReadConfigFile.i(handle.i, filename.p-ascii)

	;TESS_API void  TESS_CALL TessBaseAPIReadDebugConfigFile(TessBaseAPI* handle, const char* filename);
	;- TessBaseAPIReadDebugConfigFile
	TessBaseAPIReadDebugConfigFile.i(handle.i, filename.p-ascii)

	;TESS_API void  TESS_CALL TessBaseAPISetPageSegMode(TessBaseAPI* handle, TessPageSegMode mode);
	;- TessBaseAPISetPageSegMode
	TessBaseAPISetPageSegMode.i(handle.i, mode.i)

	;TESS_API TessPageSegMode
	;               TESS_CALL TessBaseAPIGetPageSegMode(const TessBaseAPI* handle);
	;- TessBaseAPIGetPageSegMode
	TessBaseAPIGetPageSegMode.i(handle.i)

	;TESS_API char* TESS_CALL TessBaseAPIRect(TessBaseAPI* handle, const unsigned char* imagedata,
	;                                         int bytes_per_pixel, int bytes_per_line,
	;                                         int left, int top, int width, int height);
	;- TessBaseAPIRect
	;NOTE: imagedata is a pointer to raw pixel data, not a string
	TessBaseAPIRect.i(handle.i, imagedata.i, bytes_per_pixel.i, bytes_per_line.i, left.i, top.i, width.i, height.i)

	;TESS_API void  TESS_CALL TessBaseAPIClearAdaptiveClassifier(TessBaseAPI* handle);
	;- TessBaseAPIClearAdaptiveClassifier
	TessBaseAPIClearAdaptiveClassifier.i(handle.i)

	;TESS_API void  TESS_CALL TessBaseAPISetImage(TessBaseAPI* handle, const unsigned char* imagedata, int width, int height,
	;                                             int bytes_per_pixel, int bytes_per_line);
	;- TessBaseAPISetImage
	;NOTE: imagedata is a pointer to raw pixel data, not a string
	TessBaseAPISetImage.i(handle.i, imagedata.i, width.i, height.i, bytes_per_pixel.i, bytes_per_line.i)

	;TESS_API void  TESS_CALL TessBaseAPISetImage2(TessBaseAPI* handle, const PIX* pix);
	;- TessBaseAPISetImage2
	TessBaseAPISetImage2.i(handle.i, pix.i)

	;TESS_API void TESS_CALL TessBaseAPISetSourceResolution(TessBaseAPI* handle, int ppi);
	;- TessBaseAPISetSourceResolution
	TessBaseAPISetSourceResolution.i(handle.i, ppi.i)

	;TESS_API void  TESS_CALL TessBaseAPISetRectangle(TessBaseAPI* handle, int left, int top, int width, int height);
	;- TessBaseAPISetRectangle
	TessBaseAPISetRectangle.i(handle.i, left.i, top.i, width.i, height.i)

	;TESS_API void  TESS_CALL TessBaseAPISetThresholder(TessBaseAPI* handle, TessImageThresholder* thresholder);
	;- TessBaseAPISetThresholder
	TessBaseAPISetThresholder.i(handle.i, thresholder.i)

	;TESS_API BOXA* TESS_CALL TessBaseAPIGetConnectedComponents(TessBaseAPI* handle, PIXA** cc);
	;- TessBaseAPIGetConnectedComponents
	TessBaseAPIGetConnectedComponents.i(handle.i, cc.i)

	;TESS_API int   TESS_CALL TessBaseAPIGetThresholdedImageScaleFactor(const TessBaseAPI* handle);
	;- TessBaseAPIGetThresholdedImageScaleFactor
	TessBaseAPIGetThresholdedImageScaleFactor.i(handle.i)

	;TESS_API void  TESS_CALL TessBaseAPIDumpPGM(TessBaseAPI* handle, const char* filename);
	;- TessBaseAPIDumpPGM
	TessBaseAPIDumpPGM.i(handle.i, filename.p-ascii)

	;TESS_API TessPageIterator*
	;               TESS_CALL TessBaseAPIAnalyseLayout(TessBaseAPI* handle);
	;- TessBaseAPIAnalyseLayout
	TessBaseAPIAnalyseLayout.i(handle.i)

	;TESS_API int   TESS_CALL TessBaseAPIRecognize(TessBaseAPI* handle, ETEXT_DESC* monitor);
	;- TessBaseAPIRecognize
	TessBaseAPIRecognize.i(handle.i, monitor.i)

	;TESS_API int   TESS_CALL TessBaseAPIRecognizeForChopTest(TessBaseAPI* handle, ETEXT_DESC* monitor);
	;- TessBaseAPIRecognizeForChopTest
	TessBaseAPIRecognizeForChopTest.i(handle.i, monitor.i)

	;TESS_API char* TESS_CALL TessBaseAPIProcessPages(TessBaseAPI* handle, const char* filename, const char* retry_config,
	;                                                 int timeout_millisec);
	;- TessBaseAPIProcessPages
	TessBaseAPIProcessPages.i(handle.i, filename.p-ascii, retry_config.p-ascii, timeout_millisec.i)

	;TESS_API char* TESS_CALL TessBaseAPIProcessPage(TessBaseAPI* handle, PIX* pix, int page_index, const char* filename,
	;                                                const char* retry_config, int timeout_millisec);
	;- TessBaseAPIProcessPage
	TessBaseAPIProcessPage.i(handle.i, pix.i, page_index.i, filename.p-ascii, retry_config.p-ascii, timeout_millisec.i)

	;TESS_API TessResultIterator*
	;               TESS_CALL TessBaseAPIGetIterator(TessBaseAPI* handle);
	;- TessBaseAPIGetIterator
	TessBaseAPIGetIterator.i(handle.i)

	;TESS_API TessMutableIterator*
	;               TESS_CALL TessBaseAPIGetMutableIterator(TessBaseAPI* handle);
	;- TessBaseAPIGetMutableIterator
	TessBaseAPIGetMutableIterator.i(handle.i)

	;TESS_API char* TESS_CALL TessBaseAPIGetUTF8Text(TessBaseAPI* handle);
	;- TessBaseAPIGetUTF8Text
	TessBaseAPIGetUTF8Text.i(handle.i)

	;TESS_API char* TESS_CALL TessBaseAPIGetHOCRText(TessBaseAPI* handle, int page_number);
	;- TessBaseAPIGetHOCRText
	TessBaseAPIGetHOCRText.i(handle.i, page_number.i)

	;TESS_API char* TESS_CALL TessBaseAPIGetBoxText(TessBaseAPI* handle, int page_number);
	;- TessBaseAPIGetBoxText
	TessBaseAPIGetBoxText.i(handle.i, page_number.i)

	;TESS_API char* TESS_CALL TessBaseAPIGetUNLVText(TessBaseAPI* handle);
	;- TessBaseAPIGetUNLVText
	TessBaseAPIGetUNLVText.i(handle.i)

	;TESS_API int   TESS_CALL TessBaseAPIMeanTextConf(TessBaseAPI* handle);
	;- TessBaseAPIMeanTextConf
	TessBaseAPIMeanTextConf.i(handle.i)

	;TESS_API int*  TESS_CALL TessBaseAPIAllWordConfidences(TessBaseAPI* handle);
	;- TessBaseAPIAllWordConfidences
	TessBaseAPIAllWordConfidences.i(handle.i)

	;TESS_API BOOL  TESS_CALL TessBaseAPIAdaptToWordStr(TessBaseAPI* handle, TessPageSegMode mode, const char* wordstr);
	;- TessBaseAPIAdaptToWordStr
	TessBaseAPIAdaptToWordStr.i(handle.i, mode.i, wordstr.p-ascii)

	;TESS_API void  TESS_CALL TessBaseAPIClear(TessBaseAPI* handle);
	;- TessBaseAPIClear
	TessBaseAPIClear.i(handle.i)

	;TESS_API void  TESS_CALL TessBaseAPIEnd(TessBaseAPI* handle);
	;- TessBaseAPIEnd
	TessBaseAPIEnd.i(handle.i)

	;TESS_API int   TESS_CALL TessBaseAPIIsValidWord(TessBaseAPI* handle, const char *word);
	;- TessBaseAPIIsValidWord
	TessBaseAPIIsValidWord.i(handle.i, word.p-ascii)

	;TESS_API BOOL  TESS_CALL TessBaseAPIGetTextDirection(TessBaseAPI* handle, int* out_offset, float* out_slope);
	;- TessBaseAPIGetTextDirection
	TessBaseAPIGetTextDirection.i(handle.i, out_offset.i, out_slope.i)

	;TESS_API void  TESS_CALL TessBaseAPISetDictFunc(TessBaseAPI* handle, TessDictFunc f);
	;- TessBaseAPISetDictFunc
	TessBaseAPISetDictFunc.i(handle.i, f.i)

	;TESS_API void  TESS_CALL TessBaseAPISetProbabilityInContextFunc(TessBaseAPI* handle, TessProbabilityInContextFunc f);
	;- TessBaseAPISetProbabilityInContextFunc
	TessBaseAPISetProbabilityInContextFunc.i(handle.i, f.i)

	;TESS_API void  TESS_CALL TessBaseAPISetFillLatticeFunc(TessBaseAPI* handle, TessFillLatticeFunc f);
	;- TessBaseAPISetFillLatticeFunc
	TessBaseAPISetFillLatticeFunc.i(handle.i, f.i)

	;TESS_API BOOL  TESS_CALL TessBaseAPIDetectOS(TessBaseAPI* handle, OSResults* results);
	;- TessBaseAPIDetectOS
	TessBaseAPIDetectOS.i(handle.i, results.i)

	;TESS_API void  TESS_CALL TessBaseAPIGetFeaturesForBlob(TessBaseAPI* handle, TBLOB* blob, const DENORM* denorm, INT_FEATURE_ARRAY int_features,
	;                                                       int* num_features, int* FeatureOutlineIndex);
	;- TessBaseAPIGetFeaturesForBlob
	TessBaseAPIGetFeaturesForBlob.i(handle.i, blob.i, denorm.i, int_features.i, num_features.i, FeatureOutlineIndex.i)

	;TESS_API ROW*  TESS_CALL TessFindRowForBox(BLOCK_LIST* blocks, int left, int top, int right, int bottom);
	;- TessFindRowForBox
	TessFindRowForBox.i(blocks.i, left.i, top.i, right.i, bottom.i)

	;TESS_API void  TESS_CALL TessBaseAPIRunAdaptiveClassifier(TessBaseAPI* handle, TBLOB* blob, const DENORM* denorm, int num_max_matches,
	;                                                          int* unichar_ids, float* ratings, int* num_matches_returned);
	;- TessBaseAPIRunAdaptiveClassifier
	TessBaseAPIRunAdaptiveClassifier.i(handle.i, blob.i, denorm.i, num_max_matches.i, unichar_ids.i, ratings.i, num_matches_returned.i)

	;TESS_API const char*
	;               TESS_CALL TessBaseAPIGetUnichar(TessBaseAPI* handle, int unichar_id);
	;- TessBaseAPIGetUnichar
	TessBaseAPIGetUnichar.i(handle.i, unichar_id.i)

	;TESS_API const TessDawg*
	;               TESS_CALL TessBaseAPIGetDawg(const TessBaseAPI* handle, int i);
	;- TessBaseAPIGetDawg
	TessBaseAPIGetDawg.i(handle.i, i.i)

	;TESS_API int   TESS_CALL TessBaseAPINumDawgs(const TessBaseAPI* handle);
	;- TessBaseAPINumDawgs
	TessBaseAPINumDawgs.i(handle.i)

	;TESS_API ROW*  TESS_CALL TessMakeTessOCRRow(float baseline, float xheight, float descender, float ascender);
	;- TessMakeTessOCRRow
	TessMakeTessOCRRow.i(baseline.f, xheight.f, descender.f, ascender.f)

	;TESS_API TBLOB*
	;               TESS_CALL TessMakeTBLOB(Pix *pix);
	;- TessMakeTBLOB
	TessMakeTBLOB.i(pix.i)

	;TESS_API void  TESS_CALL TessNormalizeTBLOB(TBLOB *tblob, ROW *row, BOOL numeric_mode, DENORM *denorm);
	;- TessNormalizeTBLOB
	TessNormalizeTBLOB.i(tblob.i, row.i, numeric_mode.i, denorm.i)

	;TESS_API TessOcrEngineMode
	;               TESS_CALL TessBaseAPIOem(const TessBaseAPI* handle);
	;- TessBaseAPIOem
	TessBaseAPIOem.i(handle.i)

	;TESS_API void  TESS_CALL TessBaseAPIInitTruthCallback(TessBaseAPI* handle, TessTruthCallback *cb);
	;- TessBaseAPIInitTruthCallback
	TessBaseAPIInitTruthCallback.i(handle.i, cb.i)

	;TESS_API TessCubeRecoContext*
	;               TESS_CALL TessBaseAPIGetCubeRecoContext(const TessBaseAPI* handle);
	;- TessBaseAPIGetCubeRecoContext
	TessBaseAPIGetCubeRecoContext.i(handle.i)

	;TESS_API void  TESS_CALL TessBaseAPISetMinOrientationMargin(TessBaseAPI* handle, double margin);
	;- TessBaseAPISetMinOrientationMargin
	;NOTE: margin is a C double, so it is declared as .d
	TessBaseAPISetMinOrientationMargin.i(handle.i, margin.d)

	;TESS_API void  TESS_CALL TessBaseGetBlockTextOrientations(TessBaseAPI* handle, int** block_orientation, bool** vertical_writing);
	;- TessBaseGetBlockTextOrientations
	TessBaseGetBlockTextOrientations.i(handle.i, block_orientation.i, vertical_writing.i)

	;TESS_API BLOCK_LIST*
	;               TESS_CALL TessBaseAPIFindLinesCreateBlockList(TessBaseAPI* handle);
	;- TessBaseAPIFindLinesCreateBlockList
	TessBaseAPIFindLinesCreateBlockList.i(handle.i)

	;TESS_API void  TESS_CALL TessPageIteratorDelete(TessPageIterator* handle);
	;- TessPageIteratorDelete
	TessPageIteratorDelete.i(handle.i)

	;TESS_API TessPageIterator*
	;               TESS_CALL TessPageIteratorCopy(const TessPageIterator* handle);
	;- TessPageIteratorCopy
	TessPageIteratorCopy.i(handle.i)

	;TESS_API void  TESS_CALL TessPageIteratorBegin(TessPageIterator* handle);
	;- TessPageIteratorBegin
	TessPageIteratorBegin.i(handle.i)

	;TESS_API BOOL  TESS_CALL TessPageIteratorNext(TessPageIterator* handle, TessPageIteratorLevel level);
	;- TessPageIteratorNext
	TessPageIteratorNext.i(handle.i, level.i)

	;TESS_API BOOL  TESS_CALL TessPageIteratorIsAtBeginningOf(const TessPageIterator* handle, TessPageIteratorLevel level);
	;- TessPageIteratorIsAtBeginningOf
	TessPageIteratorIsAtBeginningOf.i(handle.i, level.i)

	;TESS_API BOOL  TESS_CALL TessPageIteratorIsAtFinalElement(const TessPageIterator* handle, TessPageIteratorLevel level,
	;                                                          TessPageIteratorLevel element);
	;- TessPageIteratorIsAtFinalElement
	TessPageIteratorIsAtFinalElement.i(handle.i, level.i, element.i)

	;TESS_API BOOL  TESS_CALL TessPageIteratorBoundingBox(const TessPageIterator* handle, TessPageIteratorLevel level,
	;                                                     int* left, int* top, int* right, int* bottom);
	;- TessPageIteratorBoundingBox
	TessPageIteratorBoundingBox.i(handle.i, level.i, left.i, top.i, right.i, bottom.i)

	;TESS_API TessPolyBlockType
	;               TESS_CALL TessPageIteratorBlockType(const TessPageIterator* handle);
	;- TessPageIteratorBlockType
	TessPageIteratorBlockType.i(handle.i)

	;TESS_API PIX*  TESS_CALL TessPageIteratorGetBinaryImage(const TessPageIterator* handle, TessPageIteratorLevel level);
	;- TessPageIteratorGetBinaryImage
	TessPageIteratorGetBinaryImage.i(handle.i, level.i)

	;TESS_API PIX*  TESS_CALL TessPageIteratorGetImage(const TessPageIterator* handle, TessPageIteratorLevel level, int padding,
	;                                                  int* left, int* top);
	;- TessPageIteratorGetImage
	TessPageIteratorGetImage.i(handle.i, level.i, padding.i, left.i, top.i)

	;TESS_API BOOL  TESS_CALL TessPageIteratorBaseline(const TessPageIterator* handle, TessPageIteratorLevel level,
	;                                                  int* x1, int* y1, int* x2, int* y2);
	;- TessPageIteratorBaseline
	TessPageIteratorBaseline.i(handle.i, level.i, x1.i, y1.i, x2.i, y2.i)

	;TESS_API void  TESS_CALL TessPageIteratorOrientation(TessPageIterator* handle, TessOrientation *orientation,
	;                                                     TessWritingDirection *writing_direction, TessTextlineOrder *textline_order,
	;                                                     float *deskew_angle);
	;- TessPageIteratorOrientation
	TessPageIteratorOrientation.i(handle.i, orientation.i, writing_direction.i, textline_order.i, deskew_angle.i)

	;TESS_API void  TESS_CALL TessResultIteratorDelete(TessResultIterator* handle);
	;- TessResultIteratorDelete
	TessResultIteratorDelete.i(handle.i)

	;TESS_API TessResultIterator*
	;               TESS_CALL TessResultIteratorCopy(const TessResultIterator* handle);
	;- TessResultIteratorCopy
	TessResultIteratorCopy.i(handle.i)

	;TESS_API TessPageIterator*
	;               TESS_CALL TessResultIteratorGetPageIterator(TessResultIterator* handle);
	;- TessResultIteratorGetPageIterator
	TessResultIteratorGetPageIterator.i(handle.i)

	;TESS_API const TessPageIterator*
	;               TESS_CALL TessResultIteratorGetPageIteratorConst(const TessResultIterator* handle);
	;- TessResultIteratorGetPageIteratorConst
	TessResultIteratorGetPageIteratorConst.i(handle.i)

	;TESS_API char* TESS_CALL TessResultIteratorGetUTF8Text(const TessResultIterator* handle, TessPageIteratorLevel level);
	;- TessResultIteratorGetUTF8Text
	TessResultIteratorGetUTF8Text.i(handle.i, level.i)

	;TESS_API float TESS_CALL TessResultIteratorConfidence(const TessResultIterator* handle, TessPageIteratorLevel level);
	;- TessResultIteratorConfidence
	TessResultIteratorConfidence.f(handle.i, level.i)

	;TESS_API const char*
	;               TESS_CALL TessResultIteratorWordFontAttributes(const TessResultIterator* handle, BOOL* is_bold, BOOL* is_italic,
	;                                                              BOOL* is_underlined, BOOL* is_monospace, BOOL* is_serif,
	;                                                              BOOL* is_smallcaps, int* pointsize, int* font_id);
	;- TessResultIteratorWordFontAttributes
	TessResultIteratorWordFontAttributes.i(handle.i, is_bold.i, is_italic.i, is_underlined.i, is_monospace.i, is_serif.i, is_smallcaps.i, pointsize.i, font_id.i)

	;TESS_API BOOL  TESS_CALL TessResultIteratorWordIsFromDictionary(const TessResultIterator* handle);
	;- TessResultIteratorWordIsFromDictionary
	TessResultIteratorWordIsFromDictionary.i(handle.i)

	;TESS_API BOOL  TESS_CALL TessResultIteratorWordIsNumeric(const TessResultIterator* handle);
	;- TessResultIteratorWordIsNumeric
	TessResultIteratorWordIsNumeric.i(handle.i)

	;TESS_API BOOL  TESS_CALL TessResultIteratorSymbolIsSuperscript(const TessResultIterator* handle);
	;- TessResultIteratorSymbolIsSuperscript
	TessResultIteratorSymbolIsSuperscript.i(handle.i)

	;TESS_API BOOL  TESS_CALL TessResultIteratorSymbolIsSubscript(const TessResultIterator* handle);
	;- TessResultIteratorSymbolIsSubscript
	TessResultIteratorSymbolIsSubscript.i(handle.i)

	;TESS_API BOOL  TESS_CALL TessResultIteratorSymbolIsDropcap(const TessResultIterator* handle);
	;- TessResultIteratorSymbolIsDropcap
	TessResultIteratorSymbolIsDropcap.i(handle.i)
EndImport
;->

;-> ENUMS
;typedef enum TessOcrEngineMode     { OEM_TESSERACT_ONLY, OEM_CUBE_ONLY, OEM_TESSERACT_CUBE_COMBINED, OEM_DEFAULT } TessOcrEngineMode;
;- TessOcrEngineMode
Enumeration
	#OEM_TESSERACT_ONLY
	#OEM_CUBE_ONLY
	#OEM_TESSERACT_CUBE_COMBINED
	#OEM_DEFAULT
EndEnumeration

;typedef enum TessPageSegMode       { PSM_OSD_ONLY, PSM_AUTO_OSD, PSM_AUTO_ONLY, PSM_AUTO, PSM_SINGLE_COLUMN, PSM_SINGLE_BLOCK_VERT_TEXT,
;                                     PSM_SINGLE_BLOCK, PSM_SINGLE_LINE, PSM_SINGLE_WORD, PSM_CIRCLE_WORD, PSM_SINGLE_CHAR, PSM_COUNT } TessPageSegMode;
;- TessPageSegMode
Enumeration
	#PSM_OSD_ONLY
	#PSM_AUTO_OSD
	#PSM_AUTO_ONLY
	#PSM_AUTO
	#PSM_SINGLE_COLUMN
	#PSM_SINGLE_BLOCK_VERT_TEXT
	#PSM_SINGLE_BLOCK
	#PSM_SINGLE_LINE
	#PSM_SINGLE_WORD
	#PSM_CIRCLE_WORD
	#PSM_SINGLE_CHAR
	#PSM_COUNT
EndEnumeration

;typedef enum TessPageIteratorLevel { RIL_BLOCK, RIL_PARA, RIL_TEXTLINE, RIL_WORD, RIL_SYMBOL} TessPageIteratorLevel;
;- TessPageIteratorLevel
Enumeration
	#RIL_BLOCK
	#RIL_PARA
	#RIL_TEXTLINE
	#RIL_WORD
	#RIL_SYMBOL
EndEnumeration

;typedef enum TessPolyBlockType     { PT_UNKNOWN, PT_FLOWING_TEXT, PT_HEADING_TEXT, PT_PULLOUT_TEXT, PT_TABLE, PT_VERTICAL_TEXT,
;                                     PT_CAPTION_TEXT, PT_FLOWING_IMAGE, PT_HEADING_IMAGE, PT_PULLOUT_IMAGE, PT_HORZ_LINE, PT_VERT_LINE,
;                                     PT_NOISE, PT_COUNT } TessPolyBlockType;
;- TessPolyBlockType
Enumeration
	#PT_UNKNOWN
	#PT_FLOWING_TEXT
	#PT_HEADING_TEXT
	#PT_PULLOUT_TEXT
	#PT_TABLE
	#PT_VERTICAL_TEXT
	#PT_CAPTION_TEXT
	#PT_FLOWING_IMAGE
	#PT_HEADING_IMAGE
	#PT_PULLOUT_IMAGE
	#PT_HORZ_LINE
	#PT_VERT_LINE
	#PT_NOISE
	#PT_COUNT
EndEnumeration

;typedef enum TessOrientation       { ORIENTATION_PAGE_UP, ORIENTATION_PAGE_RIGHT, ORIENTATION_PAGE_DOWN, ORIENTATION_PAGE_LEFT } TessOrientation;
;- TessOrientation
Enumeration
	#ORIENTATION_PAGE_UP
	#ORIENTATION_PAGE_RIGHT
	#ORIENTATION_PAGE_DOWN
	#ORIENTATION_PAGE_LEFT
EndEnumeration

;typedef enum TessWritingDirection  { WRITING_DIRECTION_LEFT_TO_RIGHT, WRITING_DIRECTION_RIGHT_TO_LEFT, WRITING_DIRECTION_TOP_TO_BOTTOM } TessWritingDirection;
;- TessWritingDirection
Enumeration
	#WRITING_DIRECTION_LEFT_TO_RIGHT
	#WRITING_DIRECTION_RIGHT_TO_LEFT
	#WRITING_DIRECTION_TOP_TO_BOTTOM
EndEnumeration

;typedef enum TessTextlineOrder     { TEXTLINE_ORDER_LEFT_TO_RIGHT, TEXTLINE_ORDER_RIGHT_TO_LEFT, TEXTLINE_ORDER_TOP_TO_BOTTOM } TessTextlineOrder;
;- TessTextlineOrder
Enumeration
	#TEXTLINE_ORDER_LEFT_TO_RIGHT
	#TEXTLINE_ORDER_RIGHT_TO_LEFT
	#TEXTLINE_ORDER_TOP_TO_BOTTOM
EndEnumeration
;->

;-> SUMMARY
;FUNCS: 96
;STRUCTS: 0
;ENUMS: 7
;->
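
A note on the imports above that return char* (declared with a .i return type): the result is a pointer to a UTF-8 buffer owned by Tesseract, so it has to be read with PeekS and then released with TessDeleteText. A minimal sketch of that pattern follows; the library name "libtesseract.lib" is a placeholder, the two declarations are hand-written for this example rather than taken from the converter output, and the procedure name GetRecognizedText is made up.

```purebasic
; Sketch only: "libtesseract.lib" is a placeholder for the import library you link against.
ImportC "libtesseract.lib"
  TessBaseAPIGetUTF8Text.i(*handle) ; returns char*, to be freed with TessDeleteText
  TessDeleteText(*text)
EndImport

; Hypothetical helper: copy the recognized text into a PB string and free the C buffer.
; Assumes *hAPI is an initialised handle that already has an image set.
Procedure.s GetRecognizedText(*hAPI)
  Protected Result$
  Protected *utf8 = TessBaseAPIGetUTF8Text(*hAPI)
  If *utf8
    Result$ = PeekS(*utf8, -1, #PB_UTF8) ; convert the UTF-8 buffer to a PB string
    TessDeleteText(*utf8)                ; release the buffer allocated by Tesseract
  EndIf
  ProcedureReturn Result$
EndProcedure
```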

Re: Tesseract — text recognition engine

Post by Lunasole »

Kwai chang caine wrote: I tried another picture and, strangely, I always get only digits in return. Is that normal? :shock:
No, that's because of the specified config file, as I said ^^
It is configured to recognize numbers only (tessdata\configs\char_filter.ini).
Leave that file empty to recognize all characters.
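
For reference, a digits-only filter in a Tesseract config file usually takes the form below (one "variable value" pair per line). The exact contents of char_filter.ini are not quoted in this thread, so take this as an assumption about what such a file looks like rather than a copy of it:

```text
tessedit_char_whitelist 0123456789
```

With a line like this active, every recognized glyph is forced onto one of the listed characters, which would explain the all-digit output above.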

@Justin thanks, maybe someone will need that. I'm not currently planning to extend the posted stuff.
"W̷i̷s̷h̷i̷n̷g o̷n a s̷t̷a̷r"

Re: Tesseract — text recognition engine

Post by AAT »

Hi!
Kwai chang caine, try the following:
1. Download the French language data file https://github.com/tesseract-ocr/tessda ... raineddata
and put it into \tessdata
2. Replace the TesseractInit procedure with:

Code: Select all

	Procedure TesseractInit(PageSegMode, OEM)
		Protected hApi = TessBaseAPICreate()
		Protected CFG$ = "";ToAscii(@"char_filter.ini")
		Protected CFG = @CFG$ ; pointer to a pointer (we passing array of strings with file names of tesseract configs to use)
		
		If Not hAPI
			Debug "Failed to create Tesseract!"
			ProcedureReturn 0
		EndIf
		
		; init tesseract (empty config name, so no character filter is loaded)
		TessBaseAPISetPageSegMode(hApi, PageSegMode) ; I'm not sure whether the page seg mode should be set before or after init :) so it is set twice ...
;		If TessBaseAPIInit1(hAPI, GetPathPart(ProgramFilename()) + "\tessdata", "eng", OEM, @CFG, 1)
		If TessBaseAPIInit1(hAPI, GetPathPart(ProgramFilename()) + "\tessdata", "fra", OEM, @CFG, 1)		
			Debug "Failed to init Tesseract!"
			TessBaseAPIEnd(hAPI)
			ProcedureReturn 0
		EndIf
		TessBaseAPISetPageSegMode(hApi, PageSegMode) ; 2nd
		ProcedureReturn hAPI
	EndProcedure
3. Replace the Main procedure with:

Code: Select all

	Procedure Main()
		; init tesseract
;	  Protected hAPI = TesseractInit(#PSM_RAW_LINE, #OEM_TESSERACT_ONLY)
	  Protected hAPI = TesseractInit(#PSM_AUTO, #OEM_TESSERACT_ONLY)
		Debug RecognizeText(hAPI, "texte1.png")
		
		; cleanup on quit
		If hAPI
			TessBaseAPIEnd(hAPI)
		EndIf
	EndProcedure
and you will get:
Comment faire lorsque l‘on n'a aucune connaissance dans le domaine dela
création de sites web ?

La solution la plus simple et rapide est de faire appel a une agence de
création de site web, une agence de communication, un indépendant,

etc... C‘est une solution viable mais qui vous coûtera tres cher!

Alors comment faire lorsque l‘on n'a pas de connaissance et pas (peu) de
budget ?

Internet regorge de solutions et de services qui proposent de creer des sites
web simplement et gratuitement, ce qui est une excellente alternative aux
problèmes d'argents et connaissances.

Aujourd'hui, je souhaite m'attarder en particulier sur le service Webnode qui a
attiré mon attention par le fait qu’il est vraiment tres simple d'utilisation. Ce
service regroupe deja plus de 10 millions de sites internet crée et il est
disponible en 22 langues.

Comment ça marche ?

La démarche est simple, dans un premier temps, rendez—vousjuste

sur http://www.webnode1r et inscrivez—vous
Lunasole, thanks one more time!
Last edited by AAT on Mon Jul 25, 2016 1:10 pm, edited 2 times in total.

Re: Tesseract — text recognition engine

Post by Kwai chang caine »

Thanks Lunasole and AAT for your explanations 8)

That works very well now.
Thanks a lot for this useful code, Lunasole 8)
The happiness is a road...
Not a destination
marcos.exe
User
Posts: 21
Joined: Fri Jan 17, 2020 8:20 pm

Re: Tesseract — text recognition engine

Post by marcos.exe »

Hello!

I'm blind, and I need to detect the text on the screen and the coordinates of that text.
It would also be good if, in addition to the coordinates, it gave the height and width of the text.
Why?
Some texts are detectable by my screen reader, but a limitation prevents it from placing the cursor over them.
These texts, often present in some programs, are clickable. However, when I give the command for the cursor to move over one so it can be clicked, I simply get no response.
I know this post is old, which is why the link to this library is no longer working.
I have Tesseract properly installed on my PC with Windows 7 64-bit.
Does anyone know how I could do this?

Thanks for any help!
When our generation/OS updates, we either update ourselves, or we are removed.
But we are never fully uninstalled.
AAT
Enthusiast
Posts: 259
Joined: Sun Jun 15, 2008 3:13 am
Location: Russia

Re: Tesseract — text recognition engine

Post by AAT »

Hi marcos.exe
Here is an example of how to get the following in Tesseract:
- iterator level
- x, y, width, height of text block
- confidence

You can download the archive 00_Tess_Confidence.zip with the example and all needed libs: https://disk.yandex.ru/d/TmuWRJZ4kkxPug
Test pictures are in 00_Tess_Confidence\binaries\images
SpaceBar - switch between page iterator levels

All libs are 32-bit.

Tested 6 years ago on Windows XP 32-bit, and tested now on Windows 11 64-bit with 32-bit PureBasic.

P.S. Please don't send me PMs; post all your questions here.

Code: Select all

; OpenCV + Tesseract
; How to get confidence
; AAT, 2017

IncludeFile "includes/cv_functions.pbi"
IncludeFile "includes/tesseract.pbi"

Global IteratorLevel.l=0
Global  *hAPI, *image.IplImage, *imgcopy.IplImage, exitCV.b, lpPrevWndFunc

#CV_WINDOW_NAME = "OpenCV + Tesseract OCR"
#CV_DESCRIPTION = "Confidence of OCR" + Chr(10) + 
                  "- SPACEBAR: Switch between page iterator level." 

ProcedureC WindowCallback(hWnd, Msg, wParam, lParam)
  If Msg = #WM_DESTROY
      exitCV = #True
  EndIf
  ProcedureReturn CallWindowProc_(lpPrevWndFunc, hWnd, Msg, wParam, lParam)
EndProcedure

ProcedureC GetConfidence(ILevel)
  Protected *gray.IplImage, *bin.IplImage, *boxes.BOXA, *box.box
  
  cvReleaseImage(@*imgcopy) ; release the previous annotated copy (global *imgcopy)
  *imgcopy = cvCloneImage(*image)
  *gray = cvCreateImage(*imgcopy\width, *imgcopy\height, #IPL_DEPTH_8U, 1)
  *bin = cvCreateImage(*imgcopy\width, *imgcopy\height, #IPL_DEPTH_8U, 1)
  If *imgcopy\nChannels = 3
    cvCvtColor(*imgcopy, *gray, #CV_BGR2GRAY, 1)
  Else
    cvReleaseImage(@*gray) ; free the empty buffer created above before replacing it
    *gray = cvCloneImage(*imgcopy)
  EndIf  
  threshold.d = cvThreshold(*gray, *bin, 10, 255, #CV_THRESH_OTSU)    
  TessBaseAPISetImage(*hApi, *bin\imageData, *bin\width, *bin\height, 1, *bin\widthStep)
  *boxes = TessBaseAPIGetComponentImages(*hAPI, ILevel, 1, #Null, #Null)
  For i = 0 To *boxes\n - 1
    *box = boxaGetBox(*boxes, i, #L_CLONE); 
    TessBaseAPISetRectangle(*hAPI, *box\x, *box\y, *box\w, *box\h)
    ocrResult$ = RTrim(PeekS(TessBaseAPIGetUTF8Text(*hAPI), -1, #PB_UTF8),Chr(10))
    conf.f = TessBaseAPIMeanTextConf(*hAPI)
    
    Select ILevel
      Case #RIL_BLOCK 
        ILevel$ = "RIL_BLOCK:  Block of text/image/separator line"
      Case #RIL_PARA  
        ILevel$ = "RIL_PARA :  Paragraph within a block"
      Case #RIL_TEXTLINE   
        ILevel$ = "RIL_TEXTLINE :  Line within a paragraph"        
      Case #RIL_WORD  
        ILevel$ = "RIL_WORD :  Word within a textline"        
      Case #RIL_SYMBOL 
        ILevel$ = "RIL_SYMBOL :  Symbol/character within a word"        
    EndSelect                   
    Debug(#LFCR$+"_____Iterator Level="+ILevel$)
    Debug ocrResult$ + " : CONFIDENCE=" + Str(conf) + " : box_x="+Str(*box\x)+ ", box_width="+Str(*box\w)+" ; "+" : box_y="+Str(*box\y)+ ", box_height="+Str(*box\h)
    
    Select conf
      Case 0 To 64
        cvRectangleR(*imgcopy, *box\x - 1, *box\y - 1, *box\w + 2, *box\h + 2, 0, 0, 255, 0, 1, #CV_AA, #Null)
      Case 65 To 77
        cvRectangleR(*imgcopy, *box\x - 1, *box\y - 1, *box\w + 2, *box\h + 2, 0, 255, 255, 0, 1, #CV_AA, #Null)
      Case 78 To 100
        cvRectangleR(*imgcopy, *box\x - 1, *box\y - 1, *box\w + 2, *box\h + 2, 0, 255, 0, 0, 1, #CV_AA, #Null)
      Default
        cvRectangleR(*imgcopy, *box\x - 1, *box\y - 1, *box\w + 2, *box\h + 2, 255, 0, 0, 0, 1, #CV_AA, #Null)
    EndSelect  
  Next  
  cvReleaseImage(@*gray)
  cvReleaseImage(@*bin)   
EndProcedure


ImageFile.s = OpenFileRequester("Choose an image file", "", "All Images (*.*)|*.bmp;*.dib;*.jpeg;*.jpg;*.jpe;*.png;*.tiff;*.tif", position)
*image.IplImage = cvLoadImage(ImageFile, #CV_LOAD_IMAGE_ANYDEPTH | #CV_LOAD_IMAGE_ANYCOLOR)  

*hAPI = TesseractInit(#PSM_AUTO, #OEM_TESSERACT_ONLY, GetPathPart(ProgramFilename())+"tessdata", "eng", "")
GetConfidence(IteratorLevel)

cvNamedWindow(#CV_WINDOW_NAME, #CV_WINDOW_AUTOSIZE)
window_handle = cvGetWindowHandle(#CV_WINDOW_NAME)
*window_name = cvGetWindowName(window_handle)
lpPrevWndFunc = SetWindowLongPtr_(window_handle, #GWL_WNDPROC, @WindowCallback())
hWnd = GetParent_(window_handle)
opencv = LoadImage_(GetModuleHandle_(0), @"icons/opencv.ico", #IMAGE_ICON, 35, 32, #LR_LOADFROMFILE)
SendMessage_(hWnd, #WM_SETICON, 0, opencv)
wStyle = GetWindowLongPtr_(hWnd, #GWL_STYLE)
SetWindowLongPtr_(hWnd, #GWL_STYLE, wStyle & ~(#WS_MAXIMIZEBOX | #WS_MINIMIZEBOX | #WS_SIZEBOX))	
cvMoveWindow(#CV_WINDOW_NAME, 20, 20)
ToolTip(window_handle, #CV_DESCRIPTION)

cvShowImage(#CV_WINDOW_NAME, *imgcopy)
Repeat  
  keyPressed = cvWaitKey(10)  
  
  If  keyPressed = 32 
    IteratorLevel + 1
    If IteratorLevel > #RIL_SYMBOL
      IteratorLevel = #RIL_BLOCK
    EndIf   
    
    GetConfidence(IteratorLevel)
    cvShowImage(#CV_WINDOW_NAME, *imgcopy)
  EndIf      
          
Until keyPressed = 27 Or exitCV

cvReleaseImage(@*image)
cvReleaseImage(@*imgcopy)
If *hAPI
	TessBaseAPIEnd(*hAPI)
EndIf	

End
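
For the cursor use case asked about above, the box values printed by the example (box_x, box_y, box_width, box_height) can be turned into a point to click. The following is only a minimal sketch, assuming the recognized image is a screenshot whose top-left pixel sits at a known screen position. OcrBox, ScreenPoint, BoxCenter and the origin values are hypothetical names for illustration and are not part of the Tesseract or OpenCV includes; SetCursorPos_ is the standard WinAPI call.

```purebasic
; Sketch only: OcrBox, ScreenPoint and BoxCenter are hypothetical helpers,
; not part of tesseract.pbi or cv_functions.pbi.
EnableExplicit

Structure OcrBox
  x.l
  y.l
  w.l
  h.l
EndStructure

Structure ScreenPoint
  x.l
  y.l
EndStructure

; Computes the screen position of the middle of a box found in a screenshot.
; originX/originY: screen position of the screenshot's top-left pixel (assumed known).
Procedure BoxCenter(*b.OcrBox, originX.l, originY.l, *pt.ScreenPoint)
  *pt\x = originX + *b\x + (*b\w >> 1)
  *pt\y = originY + *b\y + (*b\h >> 1)
EndProcedure

Define b.OcrBox, pt.ScreenPoint
b\x = 120 : b\y = 48 : b\w = 90 : b\h = 20   ; example values, not real OCR output
BoxCenter(@b, 0, 0, @pt)
Debug "Click point: " + Str(pt\x) + ", " + Str(pt\y)

CompilerIf #PB_Compiler_OS = #PB_OS_Windows
  SetCursorPos_(pt\x, pt\y)   ; WinAPI: move the mouse cursor to that point
CompilerEndIf
```

In a real program the box fields would come from *box\x, *box\y, *box\w and *box\h inside the loop of GetConfidence(), and the origin would be wherever the screenshot was captured on the desktop.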
loulou2522
Enthusiast
Posts: 542
Joined: Tue Oct 14, 2014 12:09 pm

Re: Tesseract — text recognition engine

Post by loulou2522 »

Hi all,
This link is broken: https://disk.yandex.ru/d/TmuWRJZ4kkxPug Can someone give me a working link to download the files?
marcos.exe
User
Posts: 21
Joined: Fri Jan 17, 2020 8:20 pm

Re: Tesseract — text recognition engine

Post by marcos.exe »

Hello!
I'm sorry, but I only just saw your message.
I immediately went to download it, but I couldn't!
It's saying the file has been removed!
Could you please repost it?
I'd be grateful!
When our generation/OS updates, we either update ourselves, or we are removed.
But we are never fully uninstalled.
acreis
Enthusiast
Posts: 204
Joined: Fri Jun 01, 2012 12:20 am

Re: Tesseract — text recognition engine

Post by acreis »

Me too
Thanks
AAT
Enthusiast
Enthusiast
Posts: 259
Joined: Sun Jun 15, 2008 3:13 am
Location: Russia

Re: Tesseract — text recognition engine

Post by AAT »

Hi.
I updated the archive link. You can download the archive 00_Tess_Confidence.zip with the example and all needed libs:
https://disk.yandex.ru/d/xMM0eLId0ud4TA

Test pictures are in 00_Tess_Confidence\binaries\images
SpaceBar - switch between page iterator levels and watch the debug output.

Check compiler options:
https://disk.yandex.ru/i/mPBLAeC0MRbILw
loulou2522
Enthusiast
Posts: 542
Joined: Tue Oct 14, 2014 12:09 pm

Re: Tesseract — text recognition engine

Post by loulou2522 »

Thanks AAT for the download.
Can you help me process an image that contains a table? I don't know how to do it.
Have a nice day