Anyone able to test my barcode-ocr?

Kukulkan
Addict
Addict
Posts: 1396
Joined: Mon Jun 06, 2005 2:35 pm
Location: germany

Post by Kukulkan »

Hi,

I wrote a DLL in PB to recognize and read barcodes in a given bitmap (BMP, JPG, PNG, TIF).

I'm looking for testers for this DLL. I need to know:

- Does it work (were any barcodes found)?
- What resolution (DPI) and color depth did the picture have?
- Was the barcode rotated?


I think 300 DPI and black-and-white will be the minimum for recognition.

Currently, the DLL can recognize three types of barcodes:
- Code39
- Code128 incl. checksum-calculation
- EAN13 incl. checksum-calculation

The barcodes have to be horizontal in the picture (or rotated by 180°). They may be tilted by about ±5° to 10°.
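
For testers who want to cross-check EAN13 results by hand: the last digit of an EAN-13 code is a check digit computed from the first twelve digits with alternating weights 1 and 3. Below is a minimal PureBasic sketch of that standard calculation. The procedure name EAN13CheckDigit is made up for illustration; it is not part of INBarcodeOCR.dll, and this is not necessarily how the DLL does it internally.

```purebasic
; Sketch: compute the EAN-13 check digit for the first 12 digits.
; EAN13CheckDigit is a hypothetical helper, not a function of INBarcodeOCR.dll.
Procedure.l EAN13CheckDigit(Digits.s)
  Protected i.l, Sum.l, Digit.l
  For i = 1 To 12
    Digit = Val(Mid(Digits, i, 1))
    If i % 2 = 0
      Sum + Digit * 3   ; even positions (2, 4, ...) have weight 3
    Else
      Sum + Digit       ; odd positions (1, 3, ...) have weight 1
    EndIf
  Next
  ProcedureReturn (10 - (Sum % 10)) % 10
EndProcedure

Debug EAN13CheckDigit("123456789012")  ; 8
```

If the digit returned here differs from the 13th digit reported by the DLL, the barcode was most likely misread.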

Here is the download-link:
http://www.x-beliebig.info/Download/INBarcodeOCR.zip

Here is some quick documentation:

Code:

; INBarcodeOCR documentation
;
; The DLL provides the following functions:
;
; FindBarcodesFile(Filename)
; --------------------------------------------------------------------
; Retrieves the barcodes in a given image-file (BMP, JPG, PNG, TIF)
; Returns the number of barcodes found
; Returns -1 for an error (file not found)
; Returns -2 if not registered

; GetBarcodesClipboard()
; --------------------------------------------------------------------
; Retrieves the barcodes in a given image in clipboard
; Returns the number of barcodes found
; Returns -1 for an error (no image in clipboard)
; Returns -2 if not registered

; GetBarcodesResult()
; --------------------------------------------------------------------
; Returns the result of the barcode OCR. Returns the following string:
; X TAB Y TAB Width TAB Height TAB Codetype TAB Code CR
; X TAB Y TAB Width TAB Height TAB Codetype TAB Code CR
; TAB = ASCII 9
; CR  = ASCII 13

; GetBarcodeVersionInfo()
; --------------------------------------------------------------------
; Returns the version of this DLL

; RegisterBarcodeDLL(Password)
; --------------------------------------------------------------------
; Registers this DLL using a password you got after purchase.
; Returns 0 for success
; Returns -1 for failure

Here is a PB4 example call (use the debugger to see the output):

Code:

; Test for INBarcodeOCR.dll

Filename.s = "c:\Bild.bmp"

If OpenLibrary(0, "INBarcodeOCR.dll")
  ; show version information
  Debug "Version: " + PeekS(CallFunction(0, "GetBarcodeVersionInfo"))
 
  ; register the dll
  If CallFunction(0, "RegisterBarcodeDLL", "beta") = 0
 
    ; start OCR and find all barcodes
    Ret.l = CallFunction(0, "FindBarcodesFile", Filename.s)
   
    If Ret.l > 0
      ; show the result
      Result.s = PeekS(CallFunction(0, "GetBarcodesResult"))
      Debug "Barcode result: " + Result.s
    EndIf
   
    If Ret.l = 0
      Debug "No barcodes found"
    EndIf
   
    If Ret.l < 0
      Debug "Error " + Str(Ret.l)
    EndIf
 
  Else

    Debug "Wrong password" ; registration failed
 
  EndIf
  CloseLibrary(0)
EndIf

If anyone needs a VB6 example, just ask.

The password for this DLL is "beta".

I look forward to getting some feedback.

Kukulkan
r_hyde
Enthusiast
Enthusiast
Posts: 155
Joined: Wed Jul 05, 2006 12:40 am

Post by r_hyde »

This is a great start! I did some really quick testing and have found some things that raise questions. But first, some of the information you're looking for:

1. I'm testing strictly on 1-bit TIFFs at 300DPI
2. It is successfully finding barcodes, though not always reading them correctly (more on that below). My test images are all very consistent as to barcode location and size, though. I'd like to test with more varied size and placement.
3. My test images are straight from a production scanning pipeline, and so have already been adjusted for rotation and cropping. The result is that all the barcodes are nearly perfectly horizontal with very little (<2°) rotation.

Now to my questions/observations:

- It seems to work with extended characters on an inconsistent basis, so I assume Code39 Extended support is intended. Is this correct?

- If Code39 Extended is supported, then it seems inconsistent. If I create a perfect image of a barcode using Photoshop and a Code39 font, I can coax your dll to read the alpha character set pretty well. But it doesn't recognize the 'special' characters (+, -, %, /) allowed in the specification. Or at least, it doesn't report them correctly - it outputs a seemingly arbitrary substitute in the special character's position. On my test images which are not as perfect as the Photoshop example, every single barcode is read incorrectly. They are recognized as Code39, but the output string is garbage. For instance, if my barcode reads "ADJ-BTCH" the output might be "P27Z8M651" (as it appears in one of my tests). Except for the "P", it is the same length as the expected string, which is interesting because in some of my Photoshop-image tests I noticed that the start/stop character (*) sometimes appears in the output as "P". I can provide any of my test images for you to examine, upon request.

- On my test images, the barcode bounds are roughly X:1400 Y:75 Width:950 Height:150; your dll is detecting the X, Y, and width correctly, but reporting a height of more than 3 times the actual height of the true barcode boundary. Might this be interfering with proper decoding?

That is all that I have so far; I will be doing more testing as opportunity provides. As you have built in a registration method to your dll, I presume you will be making this a commercially available product at some point. If so, may I inquire as to what the price point will be? If the beta goes well and I can begin to see consistent, quality results in testing then I would probably be interested in a license if the price was right.

Anyway, as I said, this is a great and promising start, and I look forward to your response to my feedback!

Post by r_hyde »

One more thought - is it possible to query only the barcode strings instead of getting the full position/type/value string for each barcode? Though it's not difficult to just parse out the needed parts, it would be better IMO if there were separate functions to query these attributes. For instance:

GetBarcodeString(barcode.l) ;for returning the decoded string
GetBarcodeType(barcode.l) ;for returning the barcode type (Code39, etc.)
GetBarcodeMetrics(barcode.l) ;for returning the X,Y,Width,Height
(or even separate GetBarcodeX, GetBarcodeY, GetBarcodeWidth, GetBarcodeHeight functions)

where the barcode.l parameter is the index to the desired barcode (0 for 1st barcode, 1 for 2nd, etc.). It's just a suggestion; the way it works now is fine, with just a little extra work on my part to extract the needed info.
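
Until something like this exists in the DLL, index-based accessors can be layered on top of the documented GetBarcodesResult() string. This is only a sketch under assumptions: the procedures below are hypothetical wrappers written in PureBasic, not DLL exports, and they take the result string as an extra parameter. The sample data is made up to match the documented format (fields separated by TAB, entries ended by CR).

```purebasic
; Sketch: index-based accessors on top of the GetBarcodesResult() string.
; All procedures here are hypothetical wrappers, not functions exported by the DLL.
; Index is 0-based (0 for the 1st barcode, 1 for the 2nd, ...).
Procedure.s BarcodeField(Result.s, Index.l, Field.l)
  Protected Entry.s = StringField(Result, Index + 1, Chr(13))  ; one entry per CR
  ProcedureReturn StringField(Entry, Field, Chr(9))            ; fields split by TAB
EndProcedure

Procedure.s GetBarcodeType(Result.s, Index.l)
  ProcedureReturn BarcodeField(Result, Index, 5)
EndProcedure

Procedure.s GetBarcodeString(Result.s, Index.l)
  ProcedureReturn BarcodeField(Result, Index, 6)
EndProcedure

Procedure.l GetBarcodeX(Result.s, Index.l)
  ProcedureReturn Val(BarcodeField(Result, Index, 1))
EndProcedure

; Made-up sample entry in the documented format
Sample.s = "10" + Chr(9) + "20" + Chr(9) + "300" + Chr(9) + "60" + Chr(9) + "CODE39" + Chr(9) + "ABC" + Chr(13)
Debug GetBarcodeString(Sample, 0)  ; ABC
Debug GetBarcodeType(Sample, 0)    ; CODE39
Debug GetBarcodeX(Sample, 0)       ; 10
```

GetBarcodeY, GetBarcodeWidth and GetBarcodeHeight would follow the same pattern with fields 2, 3 and 4.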

Post by Kukulkan »

Hi r_hyde,

Thank you very much for your tests and suggestions. I will try to answer from top to bottom... :wink:
r_hyde wrote: It seems to work with extended characters on an inconsistent basis, so I assume Code39 Extended support is intended. Is this correct?

I support only standard Code39 (0-9, A-Z, SPACE, -, ., $, /, +, %). Currently I don't support the extended version. Code39 also allows an optional checksum, but at the moment Code39 is the least well supported code in my DLL :( I don't know how to tell whether a checksum is encoded or not, and I currently have no examples of extended Code39. Maybe you can send me some 300 DPI examples with Code39 and Code39 Extended barcodes? I also need to know what is encoded in them.
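
For reference, the optional Code39 check character is normally computed modulo 43 over the character values of the standard set. As far as I know the symbol itself carries no flag saying whether a check character is present, so that has to be agreed on by the application. Here is a minimal sketch of the modulo 43 calculation in PureBasic; Code39CheckChar is a hypothetical helper name, not a function of the DLL.

```purebasic
; Sketch: optional Code39 check character (modulo 43).
; Code39CheckChar is a hypothetical helper, not a function of INBarcodeOCR.dll.
Procedure.s Code39CheckChar(Text.s)
  ; Standard Code39 characters in value order: 0-9 = 0..9, A-Z = 10..35,
  ; then - . SPACE $ / + % = 36..42
  Protected Charset.s = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%"
  Protected i.l, Position.l, Sum.l
  For i = 1 To Len(Text)
    Position = FindString(Charset, Mid(Text, i, 1), 1)
    If Position = 0
      ProcedureReturn ""   ; character is not part of standard Code39
    EndIf
    Sum + Position - 1     ; character values start at 0
  Next
  ProcedureReturn Mid(Charset, (Sum % 43) + 1, 1)
EndProcedure

Debug Code39CheckChar("TEST39")  ; Q
```

An application that knows its barcodes carry a check character can compare the last decoded character against this value and strip it afterwards.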

r_hyde wrote: On my test images, the barcode bounds are roughly X:1400 Y:75 Width:950 Height:150; your dll is detecting the X, Y, and width correctly, but reporting a height of more than 3 times the actual height of the true barcode boundary. Might this be interfering with proper decoding?

I don't care about the height of a barcode. I search for a line containing a barcode, then check the left and right boundaries (is there enough white space for the quiet zone?). The reported height is always 20% of the width. So if your barcode is very short, the rectangle may be too big; if your barcode is very tall (like on some bottles), the rectangle covers only part of the whole barcode. This area is then scanned from top to bottom in steps of 10%. The first time I can read a correct barcode, I use that result and stop processing this barcode.

Thus, the coordinates are only meant to tell the developer roughly where on the page the barcode is located. They are not an exact rectangle around the barcode.
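
To make the scan strategy described above more concrete, here is a rough PureBasic sketch that only lists the scanline positions: a search area whose height is 20% of the barcode width, sampled every 10% from top to bottom. The procedure name and the exact rounding are assumptions for illustration; the DLL's real internals are not shown in this thread.

```purebasic
; Sketch of the scan positions only. ListScanlines is a hypothetical name
; and the rounding is assumed; this is not the DLL's actual code.
Procedure ListScanlines(Y.l, Width.l)
  Protected Height.l = Width * 20 / 100   ; search area height: 20% of the width
  Protected StepNo.l
  For StepNo = 0 To 10
    ; one candidate scanline every 10% of the search area, top to bottom;
    ; a real decoder would stop at the first line that decodes correctly
    Debug "Scanline at y = " + Str(Y + Height * StepNo / 10)
  Next
EndProcedure

ListScanlines(75, 950)   ; Y and width taken from r_hyde's example image
```

This also shows why the reported height can exceed the real barcode height: it depends only on the width.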

r_hyde wrote: If so, may I inquire as to what the price point will be?

Yes, it will be a commercial product. I don't know the prices yet, but it will be cheap compared to other products. I am thinking of about 20 EUR per runtime licence and maybe 600 EUR for a runtime-free licence, but these numbers are not fixed yet.

r_hyde wrote: It's just a suggestion; the way it works now is fine, with just a little extra work on my part to extract the needed info.

Here is a little example of how easy it is:

Code:

; show the result
Result.s = PeekS(CallFunction(0, "GetBarcodesResult"))
EntryID.l = 1
Repeat
  Entry.s = StringField(Result.s, EntryID.l, Chr(13))
  If Entry.s <> ""
    CodeType.s = StringField(Entry.s, 5, Chr(9))
    CodeString.s = StringField(Entry.s, 6, Chr(9))
    Debug "Code: " + CodeString.s + "  Type: " + CodeType.s
  EndIf
  EntryID.l = EntryID.l + 1
Until Entry = ""

I will add another function for recognizing a given barcode bitmap. So if you have a small bitmap containing only a barcode (nothing else around it), you can call this function. It will be in the next beta release, after testing with some more images and codes.

I will send you a private message with my e-mail address. Maybe you can send me some images from your production system?

Regards,

Kukulkan

Post by r_hyde »

Thanks for the answers!

Looks like I had my facts about Code39 wrong - you are already supporting what I thought was the extended specification. The actual extended specification supposedly allows all 128 ASCII characters, and I'm pretty sure I have no need for that. So disregard any mention of Code39 Extended :wink:

I didn't suspect that the height of the search box was affecting the reading of the barcodes, but I had to ask because of some of the strange results I'm seeing, where a barcode is found correctly but not decoded correctly. I think you'll see what I mean when I send you some examples.

Thanks for the early info on pricing. I can wait patiently for 'real' figures until purchasing becomes an option 8)

Having a special function for reading an isolated barcode image would be neat, because in many of my use cases I know precisely the region which will contain the barcode. That should save processing time, I guess.

I will send some example images and output to your email soon. I am working with potentially sensitive customer documents, so I will need to go through and sanitize them before I can provide them to you. Look for something in your mailbox in the next couple of days.

Post by Kukulkan »

Hi r_hyde,
r_hyde wrote: Having a special function for reading an isolated barcode image would be neat, because in many of my use cases I know precisely the region which will contain the barcode. That should save processing time, I guess.

Yes, this will speed things up a little. In this case the barcode has to be completely isolated from the rest of the page. But if you have a fixed document format, this option may be great.

r_hyde wrote: I will send some example images and output to your email soon. I am working with potentially sensitive customer documents, so I will need to go through and sanitize them before I can provide them to you. Look for something in your mailbox in the next couple of days.

I can't wait... :wink: Give me work!

Kukulkan

Post by Kukulkan »

Thanks to Roger, I have been able to enhance and debug the library. I updated the zip file in the first post:

New:
- fixed Code39 bugs
- added optional Code39 checksum calculation
- better position-recognition
- methods to decode single barcode-bitmaps
- documentation as pdf

For information about the functions and how to use them from VB6 and PureBasic, please have a look at the documentation inside the zip file.

It would be very nice if someone can test it with Code128 or EAN13...

Good testers will get a free licence of the final version...

Kukulkan
ABBKlaus
Addict
Addict
Posts: 1143
Joined: Sat Apr 10, 2004 1:20 pm
Location: Germany

Post by ABBKlaus »

The test was successful 8)

I hope I deserve a free license for that :wink:

The test page contains the following codes:

1. EAN8 = 5512345
2. EAN13 = 123456789012
3. UPC-A = 34109876534
4. UPC-E = 123456
5. Code128 = EP12345678A1234
6. Code39 = TEST39 (+Checksum=Q)
7. Code39Full = test39Full
8. Code93 = TEST93
9. Code2of5 = 12345670
10. 2Digit.Supplement = 12
11. 5Digit.Supplement = 99999
12. EAN13 with 5Digit.Supplement (ISBN) = 978389011528
The ISBN is DR.DOS 5.0 from Data-Becker
13. Codabar = (StartA) 12345 (EndD)

Code:

; Test for INBarcodeOCR.dll 

Filename.s = "INBarcodeOCR_Test.bmp" 

If OpenLibrary(0, "INBarcodeOCR.dll") 
  ; show version information 
  Debug "Version: " + PeekS(CallFunction(0, "GetBarcodeVersionInfo")) 
  
  ; register the dll 
  If CallFunction(0, "RegisterBarcodeDLL", "beta") = 0 
  
    ; start OCR and find all barcodes 
    Ret.l = CallFunction(0, "FindBarcodesFile", Filename.s) 
    
    If Ret.l > 0 
      ; show the result 
      Result.s = PeekS(CallFunction(0, "GetBarcodesResult")) 
      Repeat
        EntryID + 1
        Entry.s = StringField(Result.s, EntryID.l, Chr(13)) 
        If Entry.s <> "" 
          X.s = StringField(Entry.s, 1, Chr(9))
          Y.s = StringField(Entry.s, 2, Chr(9))
          W.s = StringField(Entry.s, 3, Chr(9))
          H.s = StringField(Entry.s, 4, Chr(9))
          CodeType.s = StringField(Entry.s, 5, Chr(9))
          CodeString.s = StringField(Entry.s, 6, Chr(9))
          Debug "---------------------------------------------------------"
          Debug Str(EntryID)+".Code : " + CodeType.s
          Debug "X : "+ X
          Debug "Y : "+ Y
          Debug "W : "+ W
          Debug "H : "+ H
          Debug "Text : " + CodeString.s
        EndIf 
      Until Entry = ""      
    EndIf 
    
    If Ret.l = 0 
      Debug "No barcodes found" 
    EndIf 
    
    If Ret.l < 0 
      Debug "Error " + Str(Ret.l) 
    EndIf 
  
  Else 

    Debug "Wrong password" ; registration failed
  
  EndIf 
  CloseLibrary(0) 
EndIf 

Result:
Version: V0.8.35 BETA
---------------------------------------------------------
1.Code : EAN13
X : 558
Y : 455
W : 576
H : 107
Text : 1234567 890128
---------------------------------------------------------
2.Code : EAN13
X : 558
Y : 575
W : 576
H : 107
Text : 1234567 890128
---------------------------------------------------------
3.Code : EAN13
X : 558
Y : 695
W : 576
H : 107
Text : 0341098 765342
---------------------------------------------------------
4.Code : EAN13
X : 558
Y : 815
W : 576
H : 107
Text : 0341098 765342
---------------------------------------------------------
5.Code : CODE128
X : 558
Y : 1175
W : 956
H : 183
Text : EP12345678A1234
---------------------------------------------------------
6.Code : CODE39
X : 558
Y : 1415
W : 686
H : 129
Text : TEST39Q
---------------------------------------------------------
7.Code : EAN13
X : 558
Y : 2825
W : 856
H : 163
Text : 9783890 115283

The barcodes:
http://www.purebasicpower.de/downloads/ ... R_Test.bmp
http://www.purebasicpower.de/downloads/ ... R_Test.jpg
Regards Klaus

Post by r_hyde »

Great sheet for reference, Klaus!

To test the library well, there needs to be some real-world chaos in the image(s) (skew, speckling/noise, stray handwritten marks through the barcodes, etc.). Otherwise all you're doing is testing the best-case scenario.

Given my occupation in a high-volume scanning bureau, I have been trying to leverage my access to the large variety of barcode sizes and shapes out there in the real world, and giving feedback so that Kukulkan can be sure his library is robust and production-ready. My challenge, though, is that most of my company's customers seem to use code39 almost exclusively :(

So what I'm thinking of doing next, is printing out several copies of Klaus's excellent ref sheet, stacking them up and abusing them in the way stacks of paper get abused through typical mailroom handling, and scanning them at a range of DPI. It's amazing how much effect paper quality & handling have on the final scanned images. This should be a very rigorous test for the dll, I think!

Now I'll just need to carve out some more spare time...

Post by Kukulkan »

Hi Roger, Klaus,

Thank you very much for the tests. :D
ABBKlaus wrote: I hope I deserve a free license for that

Yes, I will provide both of you with a free licence once the library is completed. I am also thinking of a special offer for Roger, because he has done a really great job testing, generating test images, and writing a great test tool... 8)

I improved the library so that it no longer reports the same barcode twice. The updated zip file (see first post) includes Roger's test tool. I enhanced it to process BMP, TIF, JPG and PNG images. Thanks again to Roger for this great tool.

Kukulkan

Post by r_hyde »

I'm flattered, honestly! My thanks go to you for the opportunity to test what may be, for me, a very timely piece of software. I would not turn down a free license (woot!), but you can be certain that I would offer you a donation in return for your generosity 8)

I'm really excited about the improvements I've seen already, in just the short time I've been testing the software! I haven't had a chance to really go over the documentation you included, but I hope to do it soon. In the meantime, I have continued with the plan I outlined in my last post: I have a batch of around 130 images consisting of barcodes of varying sizes, lengths and encodings, and I am scanning them at different resolutions so that I can run some more controlled tests.

Initially, things are looking great: the only barcodes that aren't reading are the ones I least expect to get results from, due to poor overall image quality. It's definitely an acceptable margin from what I've seen so far.

In this next phase I would like to start looking at improving the speed of detection*, as I have roughly benchmarked it at 1.5 seconds per letter-size (TIFF) image at 200DPI, increasing to almost 3 seconds per image at 300DPI on my 2.6GHz workstation at the office. Those numbers add up when the ultimate intent is to process tens of thousands of images per day! I sometimes wish I weren't forced to work with TIFF--it's remarkably slower in PB than other image formats.


*I mean by using clever image slicing techniques to feed smaller "chunks" to INBarcodeOCR.dll for processing; I don't mean to imply that your library is slow. Although if you happen to speed it up, I certainly won't complain;)

Post by r_hyde »

In the documentation, it says (under features) that the dll can read barcodes rotated by 180°. After doing some testing, I have noticed that it does read them if they're upside-down, but not correctly. Is this a feature you have planned but have not yet implemented? I ask because upside-down forms are an occasional reality for us (my company), and it would be quite a benefit if we could still detect and read the barcodes on them. I got excited when I read that it was a feature!

Post by Kukulkan »

Hi Roger,
r_hyde wrote: In the documentation, it says (under features) that the dll can read barcodes rotated by 180°. After doing some testing, I have noticed that it does read them if they're upside-down, but not correctly. Is this a feature you have planned but have not yet implemented?

Uh, I found a little error in the Code39 test algorithm. That caused the wrong recognition (rotated Code128 and EAN13 worked fine). I have now found another solution that is also much faster, so you should notice increased performance and correctly detected rotated Code39 codes. :D

(ZIP file updated!)

Kukulkan

Post by Kukulkan »

Another update (yes, I'm actually doing nothing else...).

New version:

- better barcode detection
- now recognizes EAN8 and UPC-A, too

Download using URL in the first post of this thread!

Regards,

Kukulkan

Post by ABBKlaus »

Hi,

It is getting better every day :wink:

Results for the new version are:
Version: V0.8.48 BETA
---------------------------------------------------------
1.Code : EAN8
X : 367
Y : 155
W : 336
H : 59
Text : 5512 3457
---------------------------------------------------------
2.Code : EAN13
X : 365
Y : 305
W : 426
H : 77
Text : 1234567 890128
---------------------------------------------------------
3.Code : EAN13
X : 365
Y : 455
W : 426
H : 77
Text : 0341098 765342
---------------------------------------------------------
4.Code : CODE128
X : 365
Y : 785
W : 686
H : 129
Text : EP12345678A1234
---------------------------------------------------------
5.Code : CODE39
X : 375
Y : 935
W : 496
H : 91
Text : TEST39Q
---------------------------------------------------------
6.Code : EAN13
X : 365
Y : 1865
W : 394
H : 71
Text : 9783890 115283

No. 3 is UPC-A :!:
A UPC-A bar code is an EAN-13 bar code with the first EAN-13 number system digit set to "0".
UPC-A info: http://www.barcodeisland.com/upca.phtml
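
Given that relationship, a caller can map an EAN13 result back to UPC-A by itself. The following is a small PureBasic sketch, assuming the DLL keeps reporting such codes as EAN13 with a space in the text as in the results above; EAN13ToUPCA is a hypothetical helper name, not a DLL function.

```purebasic
; Sketch: map an EAN13 result to UPC-A when the leading digit is 0.
; EAN13ToUPCA is a hypothetical helper, not part of INBarcodeOCR.dll.
Procedure.s EAN13ToUPCA(Text.s)
  Protected Digits.s = RemoveString(Text, " ")  ; the reported text contains a space
  If Len(Digits) = 13 And Left(Digits, 1) = "0"
    ProcedureReturn Mid(Digits, 2, 12)          ; drop the leading 0
  EndIf
  ProcedureReturn ""                            ; cannot be written as UPC-A
EndProcedure

Debug EAN13ToUPCA("0341098 765342")  ; 341098765342
Debug EAN13ToUPCA("9783890 115283")  ; empty string
```

Note that this cannot tell whether the printed symbol really was UPC-A or an EAN-13 that happens to start with 0; the encoded bars are the same either way.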

I uploaded another test sheet.

Results for the new test sheet:
Version: V0.8.48 BETA
---------------------------------------------------------
1.Code : CODE128
X : 241
Y : 665
W : 562
H : 104
Text : OCR-1-180
---------------------------------------------------------
2.Code : CODE128
X : 797
Y : 815
W : 566
H : 105
Text : OCR-1-000
---------------------------------------------------------
3.Code : EAN13
X : 379
Y : 1565
W : 422
H : 76
Text : 9783890 111803
---------------------------------------------------------
4.Code : EAN13
X : 797
Y : 1745
W : 426
H : 77
Text : 9783890 110004
Regards Klaus