Easy spell-checking with PB4 and Aspell

freak
PureBasic Team
Posts: 5940
Joined: Fri Apr 25, 2003 5:21 pm
Location: Germany

Post by freak »

Here is an example of using the new Process library together with the GNU Aspell
program to add some spell checking functionality to your program.

The win32 version of Aspell as well as some language packs can be downloaded here:
http://aspell.net/win32/

My example uses Aspell in "pipe" mode. There is also a C API to access a DLL or static lib,
but I did not get this to run right away, and this method is much simpler to do IMHO.
The example shows the basic stuff required to check the spelling. Aspell provides
much more stuff like adding words to a personal/global dictionary and so on.
Just look at the docs at http://aspell.sourceforge.net/ for more information.
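
For anyone who wants the reply format in isolation, here is a minimal sketch of decoding a single pipe-mode reply line. It assumes only the "*", "#" and "&" line formats described in the comments of the example. AspellReply and ParseAspellReply() are names made up for this illustration; they are not part of PureBasic or Aspell.

```purebasic
; Sketch: decode one reply line from Aspell's pipe mode ("aspell -a").
; AspellReply and ParseAspellReply() are hypothetical names for this illustration.

Structure AspellReply
  Kind.s         ; first character of the reply: "*", "#" or "&"
  Word.s         ; the misspelled word ("" for "*")
  Offset.l       ; offset of the word as reported by Aspell
  Suggestions.s  ; comma-separated suggestions ("" if none)
EndStructure

Procedure ParseAspellReply(Reply$, *Result.AspellReply)
  *Result\Kind        = Left(Reply$, 1)
  *Result\Word        = ""
  *Result\Offset      = 0
  *Result\Suggestions = ""

  If *Result\Kind = "#"
    ; "# oldword wordoffset"
    *Result\Word   = StringField(Reply$, 2, " ")
    *Result\Offset = Val(StringField(Reply$, 3, " "))

  ElseIf *Result\Kind = "&"
    ; "& oldword suggestioncount wordoffset: suggestion1, suggestion2, ..."
    Head$ = StringField(Reply$, 1, ":")
    *Result\Word        = StringField(Head$, 2, " ")
    *Result\Offset      = Val(StringField(Head$, 4, " "))
    *Result\Suggestions = Trim(Mid(Reply$, Len(Head$) + 2, Len(Reply$)))
  EndIf
EndProcedure

Parsed.AspellReply
ParseAspellReply("& helo 3 1: hello, help, halo", @Parsed)
Debug Parsed\Word         ; helo
Debug Parsed\Offset       ; 1
Debug Parsed\Suggestions  ; hello, help, halo
```

Keeping the parsing in one procedure like this means the "Check Document" and "Suggest words" branches could share it instead of each calling StringField() on the raw line.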

Enjoy...

Code:

; Example of easy spell-checking using the GNU Aspell spellchecker.
;
; The win32 port of Aspell as well as a number of ready to use language packs
; can be found here: http://aspell.net/win32/
;
; Download the package, install it and run this code.
;
; Aspell works quite simply:
; After starting the program with the "-a" option, it outputs one line of Version
; information, then waits for input. To check a line, simply write it to the
; Aspell program with WriteProgramStringN() (note the 'N' to write a newline!)
; It is best to prepend a "^" to the text to make sure it is not interpreted as a command.
;
; Aspell will return this (each on a single line):
; only "*" for each correct word in the input (this can be disabled, see the aspell manual)
; "# oldword wordoffset" for a wrong word with no suggestions (offset is the index of the wordstart in the input)
; "& oldword suggestioncount wordoffset: suggestion1, suggestion2, ..." for a word with possible suggestions
; empty line to indicate that the input was fully processed.
;
; That's the basics, quite simple. Of course there are more commands and options.
; See the Aspell manual for that (http://aspell.sourceforge.net/), especially this page:
; http://aspell.sourceforge.net/man-html/Through-A-Pipe.html
;
; OK, let's get started with the code. First, just a few helper functions
; for coloring in the EditorGadget:
; ------------------------------------------------------------------------------------

; Selects Text inside an EditorGadget
; Line numbers range from 0 to CountGadgetItems(#Gadget)-1
; Char numbers range from 1 to the length of a line
; Set Line numbers to -1 to indicate the last line, and Char
; numbers to -1 to indicate the end of a line
; selecting from 0,1 to -1, -1 selects all.
Procedure Editor_Select(Gadget, LineStart.l, CharStart.l, LineEnd.l, CharEnd.l)   
  sel.CHARRANGE
  sel\cpMin = SendMessage_(GadgetID(Gadget), #EM_LINEINDEX, LineStart, 0) + CharStart - 1
 
  If LineEnd = -1
    LineEnd = SendMessage_(GadgetID(Gadget), #EM_GETLINECOUNT, 0, 0)-1
  EndIf
  sel\cpMax = SendMessage_(GadgetID(Gadget), #EM_LINEINDEX, LineEnd, 0)
 
  If CharEnd = -1
    sel\cpMax + SendMessage_(GadgetID(Gadget), #EM_LINELENGTH, sel\cpMax, 0)
  Else
    sel\cpMax + CharEnd - 1
  EndIf
  SendMessage_(GadgetID(Gadget), #EM_EXSETSEL, 0, @sel)
EndProcedure

; Set the Text color for the Selection
; in RGB format
Procedure Editor_Color(Gadget, Color.l)
  format.CHARFORMAT
  format\cbSize = SizeOf(CHARFORMAT)
  format\dwMask = #CFM_COLOR
  format\crTextColor = Color
  SendMessage_(GadgetID(Gadget), #EM_SETCHARFORMAT, #SCF_SELECTION, @format)
EndProcedure

; get line with the cursor
Procedure Editor_CursorLine(Gadget)
  SendMessage_(GadgetID(Gadget), #EM_GETSEL, @selStart, 0)
  ProcedureReturn SendMessage_(GadgetID(Gadget), #EM_LINEFROMCHAR, selStart, 0)
EndProcedure

; get the position of the cursor in the line
Procedure Editor_CursorChar(Gadget)
  SendMessage_(GadgetID(Gadget), #EM_GETSEL, @selStart, 0)
  lineStart = SendMessage_(GadgetID(Gadget), #EM_LINEINDEX, SendMessage_(GadgetID(Gadget), #EM_LINEFROMCHAR, selStart, 0), 0)
  ProcedureReturn selStart - lineStart + 1
EndProcedure

; ------------------------------------------------------------------------------------

#Window = 0

#Editor  = 0
#Check   = 1
#Suggest = 2
#LangText= 3
#Lang    = 4
#LangSet = 5


; Request the Aspell executable (located in "bin" in the installation directory)
;
AspellExe$ = OpenFileRequester("Select Aspell.exe:", "Aspell.exe", "Executable Files (*.exe)|*.exe|All Files (*.*)|*.*", 0)
;AspellExe$ = "C:\Program Files\Aspell\bin\aspell.exe"

If AspellExe$ = ""
  End
EndIf

; Start the program. The Read and Write flags are required to communicate with it.
; We also read the error output, which is useful for the language change below,
; for example. "-a" starts the pipe mode.
;
Aspell = RunProgram(AspellExe$, "-a", "", #PB_Program_Hide|#PB_Program_Open|#PB_Program_Write|#PB_Program_Read|#PB_Program_Error)
If Aspell = 0
  MessageRequester("Error", "Cannot execute Aspell.exe")
  End
EndIf

; read the version string.
;
Version$ = ReadProgramString(Aspell)
If Version$ = ""
  ;
  ; if no version is given, it was probably an error, so read the error output
  ;
  MessageRequester("Error:", ReadProgramError(Aspell))  
  CloseProgram(Aspell)
  End
EndIf

; Aspell selects the language based on the locale setting.
; The "$$l" command prints one line with the chosen language as the result
; (see below on how to set a different language).
;
WriteProgramStringN(Aspell, "$$l")   ; write the command (the newline is important!)
Language$ = ReadProgramString(Aspell); read the result (will read one line of input)


; Open up a nice gui...
;
If OpenWindow(#Window, 0, 0, 450, 350, #PB_Window_SystemMenu|#PB_Window_ScreenCentered, "Spell correction made easy...")
  CreateGadgetList(WindowID(#Window))
  
  ButtonGadget(#Check, 5, 5, 120, 20, "Check Document")
  ButtonGadget(#Suggest, 130, 5, 120, 20, "Suggest words")
  TextGadget(#LangText, 255, 7, 90, 20, "Language:", #PB_Text_Right)  
  StringGadget(#Lang, 350, 5, 40, 20, Language$)
  ButtonGadget(#LangSet, 395, 5, 50, 20, "Set")
  EditorGadget(#Editor, 5, 30, 445, 310)  
  
  AddGadgetItem(#Editor, -1, "Version string: "+Version$)
  AddGadgetItem(#Editor, -1, "Type something and then press the 'Check Document' button.")
  
  Repeat
    Event = WaitWindowEvent()
    
    If Event = #PB_Event_Gadget
      
      Select EventGadget()
              
        Case #LangSet
          ;
          ; to select a new language, Aspell must be restarted with the --lang=<code>
          ; option. <code> should be the language code like "en", "de", "fr", ...
          ;
          ; We first start the new program to see if all worked fine, so the old
          ; one can still be used if something went wrong
          ;
          NewLanguage$ = Trim(GetGadgetText(#Lang))          
          NewAspell = RunProgram(AspellExe$, "-a --lang="+NewLanguage$, "", #PB_Program_Hide|#PB_Program_Open|#PB_Program_Write|#PB_Program_Read|#PB_Program_Error)
          If NewAspell = 0          
            ; cannot run the exe at all
            ;
            MessageRequester("Error", "Cannot start Aspell.exe with new language!")            
            SetGadgetText(#Lang, Language$)
          Else
            Version$ = ReadProgramString(NewAspell)                       
            If Version$ = ""
              ; again, no version line probably means an error
              ;
              MessageRequester("Error:", ReadProgramError(NewAspell))
              SetGadgetText(#Lang, Language$)
              CloseProgram(NewAspell)
            Else
              ; success. we now end the old Aspell with CloseProgram()
              ; NOTE: CloseProgram() does not kill the program, but since it closes
              ; its input, Aspell will end itself.
              ;
              MessageRequester("Information:", "New Language set.")
              CloseProgram(Aspell)
              Aspell = NewAspell
            EndIf
          EndIf          
          
      
        Case #Check
          ; Now there is a general check for errors:
          ;
          ; First disable redrawing and remove all old colors
          ;
          SendMessage_(GadgetID(#Editor), #WM_SETREDRAW, 0, 0)
          Editor_Select(#Editor, 0, 1, -1, -1)
          Editor_Color(#Editor, $000000)
          
          ; Now go line by line and write it to Aspell
          lines = CountGadgetItems(#Editor)
          For i = 0 To lines - 1
            Text$ = GetGadgetItemText(#Editor, i, 0)
            WriteProgramStringN(Aspell, "^"+Text$)
            
            ; Read the output until the empty line is reached.
            ; Possible starts of result lines
            ; * - correct word (ignored)
            ; # - wrong word
            ; & - wrong word with suggestions
            ;
            ; We are not interested in suggestions here, so we just get
            ; offset and length of the word in both cases to mark the word red
            ;
            Repeat
              Result$ = ReadProgramString(Aspell)
              If Left(Result$, 1) = "#" 
                Offset = Val(StringField(Result$, 3, " "))
                Length = Len(StringField(Result$, 2, " "))
                Editor_Select(#Editor, i, Offset, i, Offset+Length)
                Editor_Color(#Editor, $0000FF)
              
              ElseIf Left(Result$, 1) = "&" 
                Result$ = StringField(Result$, 1, ":") ; cut all the suggestions
                Offset = Val(StringField(Result$, 4, " ")) ; offset within the current line
                Length = Len(StringField(Result$, 2, " "))
                Editor_Select(#Editor, i, Offset, i, Offset+Length)
                Editor_Color(#Editor, $0000FF)
              EndIf 
              
            Until Result$ = ""            
          Next i
          
          ; reset selection and redraw the gadget
          ;
          Editor_Select(#Editor, 0, 1, 0, 1)
          SendMessage_(GadgetID(#Editor), #WM_SETREDRAW, 1, 0)
          InvalidateRect_(GadgetID(#Editor), 0, 0)
        
        Case #Suggest
          ; Here we are looking for suggestions on a word under the cursor.
          ;
          ; First get the cursor position
          ;
          Line   = Editor_CursorLine(#Editor)
          Cursor = Editor_CursorChar(#Editor)
          Text$  = GetGadgetItemText(#Editor, Line, 0)
          
          ; isolate the word from the line
          ;
          WordStart = Cursor
          WordEnd   = Cursor
          While WordStart > 0 And FindString("ABCDEFGHIJKLMNOPQRSTUVWXYZ", UCase(Mid(Text$, WordStart, 1)), 1) <> 0
            WordStart - 1
          Wend 
          While WordEnd < Len(Text$) And FindString("ABCDEFGHIJKLMNOPQRSTUVWXYZ", UCase(Mid(Text$, WordEnd, 1)), 1) <> 0
            WordEnd + 1
          Wend          
          
          Word$ = Mid(Text$, WordStart+1, WordEnd-WordStart)
                    
          If Word$ <> ""
          
            ; Now once again, pass it to Aspell
            ; 
            WriteProgramStringN(Aspell, "^"+Word$)
            
            ; Always use a loop until an empty line is returned, to be sure all
            ; processing is done.
            ;            
            Repeat
              Result$ = ReadProgramString(Aspell)
              If Left(Result$, 1) = "*"  ; correct word
                MessageRequester("Suggestions:", "This word is correct.")
                
              ElseIf Left(Result$, 1) = "#" ; wrong word
                MessageRequester("Suggestions:", "There are no suggestions for this word.")
              
              ElseIf Left(Result$, 1) = "&" ; suggestions. they start after the ":"
                Suggestions$ = StringField(Result$, 2, ":")
                MessageRequester("Suggestions:", Suggestions$)
                
              EndIf 
              
            Until Result$ = ""               
          
          EndIf
      
      EndSelect
    
    EndIf    
  
  Until Event = #PB_Event_CloseWindow   
EndIf  

; Close Aspell before ending
;
CloseProgram(Aspell)
End
Intrigued
Enthusiast
Posts: 501
Joined: Thu Jun 02, 2005 3:55 am
Location: U.S.A.

Post by Intrigued »

I would like to have a spell-checker .dll, so that my end users do not have to do this step from the example:
"Download the package, install it and run this code."
Ideas?

(TIA and thanks for this example!)

Post by freak »

Well, I can't get the DLL version to load any language properly. Dunno exactly where the problem is.
If I get it solved I will post some code, but I will probably not spend too much time on this in the near future.
rsts
Addict
Posts: 2736
Joined: Wed Aug 24, 2005 8:39 am
Location: Southwest OH - USA

Post by rsts »

Had occasion to use this in an application today.

Setup and use was a breeze thanks to the wonderful explanation and example.

Thanks (a little belated) for another nice one Freak :D

cheers
gnozal
PureBasic Expert
Posts: 4229
Joined: Sat Apr 26, 2003 8:27 am
Location: Strasbourg / France

Post by gnozal »

Here is some code that seems to work with the Aspell DLL (adapted from FB code).

Code:

;
; Aspell (http://aspell.net) - include file
;
AspelDLL_Filename.s = "c:\Program Files\Aspell\bin\aspell-15.dll"
;
PrototypeC proto_new_aspell_config()
PrototypeC proto_aspell_config_replace(*Aspell, ConfigKey.s, ConfigValue.s)
PrototypeC proto_new_aspell_speller(*Aspell)
PrototypeC proto_aspell_error_number(PossibleError.l)
PrototypeC proto_aspell_error_message(PossibleError.l)
PrototypeC proto_to_aspell_speller(PossibleError.l)
PrototypeC proto_aspell_speller_check(*SpellChecker, *InputWord, InputWordLen.l)
PrototypeC proto_aspell_speller_suggest(*SpellChecker, *InputWord, InputWordLen.l)
PrototypeC proto_aspell_word_list_elements(*Suggestions)
PrototypeC proto_aspell_string_enumeration_next(*Elements)
PrototypeC proto_delete_aspell_string_enumeration(*Elements)
PrototypeC proto_delete_aspell_speller(*SpellChecker)
;
AspellDLL = OpenLibrary(#PB_Any, AspelDLL_Filename)
If AspellDLL
  ;
  ;
  ;
  new_aspell_config.proto_new_aspell_config = GetFunction(AspellDLL, "new_aspell_config")
  If new_aspell_config
    aspell_config_replace.proto_aspell_config_replace = GetFunction(AspellDLL, "aspell_config_replace")
    new_aspell_speller.proto_new_aspell_speller = GetFunction(AspellDLL, "new_aspell_speller")
    aspell_error_number.proto_aspell_error_number = GetFunction(AspellDLL, "aspell_error_number")
    aspell_error_message.proto_aspell_error_message = GetFunction(AspellDLL, "aspell_error_message")
    to_aspell_speller.proto_to_aspell_speller = GetFunction(AspellDLL, "to_aspell_speller")
    aspell_speller_check.proto_aspell_speller_check = GetFunction(AspellDLL, "aspell_speller_check")
    aspell_speller_suggest.proto_aspell_speller_suggest = GetFunction(AspellDLL, "aspell_speller_suggest")
    aspell_word_list_elements.proto_aspell_word_list_elements = GetFunction(AspellDLL, "aspell_word_list_elements")
    aspell_string_enumeration_next.proto_aspell_string_enumeration_next = GetFunction(AspellDLL, "aspell_string_enumeration_next")
    delete_aspell_string_enumeration.proto_delete_aspell_string_enumeration = GetFunction(AspellDLL, "delete_aspell_string_enumeration")
    delete_aspell_speller.proto_delete_aspell_speller = GetFunction(AspellDLL, "delete_aspell_speller")
    ;
    *Aspell = new_aspell_config()
    aspell_config_replace(*Aspell, "lang", "en_US") ; set US english dictionary
    PossibleError = new_aspell_speller(*Aspell)
    If aspell_error_number(PossibleError)
      Debug PeekS(aspell_error_message(PossibleError)) ; the function returns a pointer to a C string
    Else
      *SpellChecker = to_aspell_speller(PossibleError)
      If *SpellChecker
        ;
        ;
        ;
        TestSentence.s = "We have put a lot of efort into its realization to produce a fast, relliable and system friendly langage"
        ;
        TestSentence = Trim(TestSentence)
        While FindString(TestSentence, "  ", 1)
          TestSentence = ReplaceString(TestSentence, "  ", " ")
        Wend
        TestSentence = RemoveString(TestSentence, ".")
        TestSentence = RemoveString(TestSentence, ",")
        TestSentence = RemoveString(TestSentence, ";")
        TestSentence = RemoveString(TestSentence, ":")
        WordTotal = CountString(TestSentence, " ") + 1
        ;
        For WordCount = 1 To WordTotal
          TestWord.s = StringField(TestSentence, WordCount, " ")
          ;
          Correct = aspell_speller_check(*SpellChecker, @TestWord, Len(TestWord))
          If Correct
            Debug TestWord + " is right"
          Else
            Debug TestWord + " is false"
            *Suggestions = aspell_speller_suggest(*SpellChecker, @TestWord, Len(TestWord))
            If *Suggestions
              *Elements = aspell_word_list_elements(*Suggestions)
              If *Elements
                Repeat
                  *Word = aspell_string_enumeration_next(*Elements)
                  If *Word 
                    Debug "Suggestion : " + PeekS(*Word)
                  EndIf
                Until *Word = #False
                delete_aspell_string_enumeration(*Elements)
              EndIf
            EndIf
          EndIf
          ;
        Next
        ;
        ;
        delete_aspell_speller(*SpellChecker)
      EndIf
    EndIf
    ;
    ;
    ;
  EndIf
  CloseLibrary(AspellDLL)
EndIf

Post by rsts »

Thanks gnozal.

When I have some time I will give this a try.

cheers
Rescator
Addict
Posts: 1769
Joined: Sat Feb 19, 2005 5:05 pm
Location: Norway

Post by Rescator »

Might be worth taking a look at hunspell (used in OpenOffice.org and Firefox 3 etc)
http://hunspell.sourceforge.net/

No dll or lib sadly, but compared to the aspell port (which is from 2002?) at least hunspell is more recently updated.

Post by Rescator »

Enchant might be worth a look (dictionary lib frontend/wrapper).
http://www.abisource.com/projects/enchant/

Post by Rescator »

Hey freak, I found this: http://www.codeproject.com/KB/recipes/s ... ntrol.aspx

Maybe do basically the same and add Hunspell support lib for PureBasic?
Just build hunspell as noted in that article and put the dll into the PureBasic package, and a normal PB lib for the functions obviously.
That way one could just call hunspell functions and copy the dll along with the application. Hunspell is certainly one of the better free ones and its LGPL license allows re-distribution in .dll and .so form.

And the amount of dictionaries available for Hunspell is huge: http://wiki.services.openoffice.org/wiki/Dictionaries
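
A rough sketch of what calling such a DLL from PureBasic might look like, in the same prototype style as the Aspell DLL code above. The function names Hunspell_create, Hunspell_spell and Hunspell_destroy are taken from Hunspell's C header (hunspell.h). Everything else is an assumption: the DLL file name, that the build exports these names undecorated with the cdecl convention, and the dictionary file paths.

```purebasic
; Sketch only: calling a Hunspell DLL through its C API from PureBasic.
; The DLL name and the dictionary paths are placeholders (assumptions).

PrototypeC proto_Hunspell_create(AffPath.p-ascii, DicPath.p-ascii)
PrototypeC proto_Hunspell_spell(*Handle, Word.p-ascii)
PrototypeC proto_Hunspell_destroy(*Handle)

HunspellDLL = OpenLibrary(#PB_Any, "libhunspell.dll") ; hypothetical file name
If HunspellDLL
  Hunspell_create.proto_Hunspell_create   = GetFunction(HunspellDLL, "Hunspell_create")
  Hunspell_spell.proto_Hunspell_spell     = GetFunction(HunspellDLL, "Hunspell_spell")
  Hunspell_destroy.proto_Hunspell_destroy = GetFunction(HunspellDLL, "Hunspell_destroy")

  If Hunspell_create And Hunspell_spell And Hunspell_destroy
    ; hypothetical dictionary files, expected next to the executable
    *Speller = Hunspell_create("en_US.aff", "en_US.dic")
    If *Speller
      ; Hunspell_spell() returns non-zero for a word it accepts
      If Hunspell_spell(*Speller, "relliable")
        Debug "relliable is right"
      Else
        Debug "relliable is misspelled"
      EndIf
      Hunspell_destroy(*Speller)
    EndIf
  EndIf
  CloseLibrary(HunspellDLL)
Else
  Debug "Hunspell DLL not found"
EndIf
```

The p-ascii pseudotype makes PureBasic hand the strings over as single-byte C strings, which is what the C API expects; dictionaries in another encoding would need the word converted first.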
Mistrel
Addict
Posts: 3415
Joined: Sat Jun 30, 2007 8:04 pm

Post by Mistrel »

I don't think it would be a good idea to add a spell checker as a static library. These kinds of libraries can be very large and become outdated quickly as new versions are released.
Rook Zimbabwe
Addict
Posts: 4322
Joined: Tue Jan 02, 2007 8:16 pm
Location: Cypress TX

Post by Rook Zimbabwe »

I agree with Mistrel, BUT Rescator... go for it! There is a need and you could build it better than I could! :D