Little SAPI4 and SAPI5 UserLibrary to test

fweil · Post by **fweil** » Mon Jul 25, 2005 9:41 pm

It seems that it uses the TTS layer of the OS, but we are just trying to deal with additional voices from third-party providers.

ebs · Post by **ebs** » Tue Jul 26, 2005 12:25 am

Esteban's library uses the built-in Windows SAPI speech engine.

If you install more SAPI-compliant voices, they will show up in the list of enumerated voices.
When you install a third-party speech engine, it registers the voices in the Windows Registry.
You can enumerate all the installed voices using the TTSEngCount() and TTSEngName() functions in the library.

For example, I have lots of voices, including the standard Microsoft voices,
the AT&T Natural Voices Mike and Crystal US English voices, the AT&T Audrey and Charles UK English voices,
and the NeoSpeech Paul and Kate US English voices. All of these show up in the list of voices in the sample application.
The list is the same as the one in the Windows Control Panel Speech applet, at least in Windows XP.

Eric

Gansta93 · Post by **Gansta93** » Tue Jul 26, 2005 12:30 am

But the problem is that fweil's voices don't appear in the voice list. If someone knows where we can find French voices that do appear, please say where.

ebs · Post by **ebs** » Tue Jul 26, 2005 12:54 am

Gansta93 and fweil,

I believe that's because fweil's voices are not SAPI 4 or 5 compliant. I think they may be from Windows 98 or ME, but I'm not sure.
If so, they came before SAPI 4/5 and probably won't work.

However, all of the AT&T Natural Voices are SAPI compliant, and you can get the Alain and Juliette 16 kHz French voices for $35 each.
They can be purchased from the NextUp.com web site link given earlier.
If these are the first AT&T Natural Voices you are using, you must also buy the voice engine, which costs an additional $25 for the 16 kHz version.

Eric

Esteban1 · Post by **Esteban1** » Tue Jul 26, 2005 1:24 am

Thank you all for testing the library!

To fweil: Are you sure you have the SAPI4 runtime?
You can verify it by looking at the "Voice" item in your Control Panel: if the SAPI4 runtime is installed on your computer, it lists the installed SAPI4-compliant voices.
If it is not installed, you can download the runtime at:

http://www.mbsoft.biz/download/spchapi.exe

On the same site you can find some awesome free voices from IBM:

http://www.mbstudio.biz/mbsoft_005.htm

Also you can do a little search for "Digalo 2000"... you will know.

I'm working on the wav output, maybe tonight...
And thank you very much for the corrected events example, nice work!

To GeoTrail: the library functions just use Microsoft's SAPI environment.

To Gansta93: PureLibraries used: String, StringExtension, Requester, Memory, Library and SimpleList. DLLs used: KERNEL32, OLEAUT32 and OLE32.

Intrigued · Post by **Intrigued** » Tue Jul 26, 2005 2:28 am

Thanks for sharing!

*thumbs up*

Esteban1 · Post by **Esteban1** » Tue Jul 26, 2005 8:33 am

The library is updated. Now you can send speech output to a .wav file.

The TTSSpeak function has an optional parameter for the name of a wav file that receives the speech instead of the audio device. For SAPI 4 voices, the events work the same way as when the output goes to the soundcard, but for SAPI 5 the Word Position event is not triggered (not my fault: SAPI 5 does not fire ANY event at all while recording to a file, and I had to use some tricks to get the Audio Start and Audio Stop events working).

Please test it to find bugs (I’m just a beginner).

Gansta93 · Post by **Gansta93** » Tue Jul 26, 2005 10:45 am

Hello,

Thanks for this update.

About your answer: Thanks. But there are probably other DLLs to install with my application. If I want to install my application on Windows 98, which files must I include?
About fweil's voices: it is the same thing on Windows 98; I am on Windows 98.
About the sources: can you answer me, please?

TerryHough · Post by **TerryHough** » Tue Jul 26, 2005 5:49 pm

Thanks for the TTS library

Here is a gui version of the voice enumeration I did for my use. Feel
free to use it.

Code: Select all

Code removed here... see subsequent post for updated version

I would like to suggest that the TTSInit() command return a positive
value on success and 0 on failure (not just a message).

Then

Code: Select all

If TTSInit(0,0,0)
  ; do all the speech stuff
Else
  ; notify that the init failed.
EndIf

which makes the code safer.

Terry

Esteban1 · Post by **Esteban1** » Wed Jul 27, 2005 1:01 am

To Gansta93: I really don't know how many DLLs must be installed on your system. SAPI works with COM, so the components not only have to be present but also registered in the system. You can use oleview32 from Microsoft to explore the components, interfaces, CLSIDs and IIDs of every control on your PC. The simplest way to get everything you need is to use the installation packages supplied for SAPI by Microsoft (just use the links I posted earlier). About the sources: I will not release them for now, but you can ask me whatever you want to know, anytime.

To TerryHough: You're right, I forgot to implement the return values of TTSInit(); I've just corrected my mistake. Now TTSInit() returns 0 on failure, 1 if you have SAPI5 installed, 2 if you have SAPI4, and 3 if you have both SAPI4 and SAPI5. You (and everyone interested) can download the upgraded library at the same site:

http://geocities.com/esteban1uy/My_drive.html

Here is an example using the speak and speak to file functions:

Code: Select all

Global RecFlag.b
Global TxtLen.l

Procedure PositionEvent(Charac.l) ; This is the OnPosition event function.
  StatusBarText(0,1," Position = "+Str(Charac*100/TxtLen)+"%")
EndProcedure

Procedure StartedEvent() ; This is the OnAudioStart event function.
  StatusBarText(0,2,"")
  If RecFlag=0
    StatusBarText(0,0," Speaking")
  Else
    StatusBarText(0,0," Recording")
  EndIf
EndProcedure

Procedure EndedEvent() ; And this is the OnAudioStop event function.
  StatusBarText(0,0," Finished")
  StatusBarText(0,1,"")
EndProcedure
  

OpenWindow(0, 357, 98, 375, 65,  #PB_Window_TitleBar | #PB_Window_ScreenCentered | #PB_Window_SystemMenu, "PureTTS Clipboard Player")

CreateStatusBar(0, WindowID())
AddStatusBarField(100)
AddStatusBarField(100)
AddStatusBarField(300)
StatusBarText(0,2," Initializing SAPI...")

TTSInit(@StartedEvent(),@EndedEvent(),@PositionEvent()) ; Here we initialize the SAPI environment.

CreateGadgetList(WindowID())
ButtonGadget(0, 5, 5, 60, 30, "Play",#PB_Button_Default)
ButtonGadget(1, 70, 5, 60, 30, "Record")
ButtonGadget(2, 135, 5, 60, 30, "Stop")
ButtonGadget(3, 200, 5, 60, 30, "Pause", #PB_Button_Toggle)
ListViewGadget(4,270,5,100,30)

StatusBarText(0,2," Enumerating voices...")

For i=0 To TTSEngCount() ; Retrieve the voice names and store them.
  AddGadgetItem (4,-1,TTSEngName(i))
Next
SetGadgetState (4,0)
StatusBarText(0,2," Just copy some text and click Play")

Repeat
  EventID = WaitWindowEvent() 
  Select EventID
    Case #PB_Event_Gadget
      Select EventGadgetID()
        Case 0
          txt$=GetClipboardText()
          TxtLen=Len(txt$)
          If TxtLen>0
            RecFlag=0 ; We are not recording.
            TTSSpeak(txt$) ; Speak the clipboard text.
          EndIf
        Case 1
          txt$=GetClipboardText()
          TxtLen=Len(txt$)
          If TxtLen>0
            WavFile.s = SaveFileRequester("Save speech as wav",".wav", "Audio files|*.wav|All files|*.*", 0) 
            If WavFile <> ""
              RecFlag=1 ; We are recording.
              TTSSpeak(txt$,WavFile) ; Send the speech to the designated file.
            EndIf
          EndIf
        Case 2
          TTSStop() ; Just stop speaking or recording.
        Case 3
          If GetGadgetState(3)=1
            TTSPause() ; Toggle button pressed: pause speaking or recording.
          Else
            TTSResume() ; Toggle button released: resume.
          EndIf
        Case 4
          i=GetGadgetState(4)
          TTSSelect(i) ; Select a voice by its index.
      EndSelect 
  EndSelect
Until EventID = #PB_Event_CloseWindow
 
TTSEnd() ; Free SAPI resources.
 
End

Esteban1

TerryHough · Post by **TerryHough** » Wed Jul 27, 2005 2:33 pm

Esteban1 wrote:To TerryHough: I forgot to implement the return values of TTSInit(), I've just corrected my mistake. Now TTSInit() returns 0 on failure, 1 if you have SAPI5 installed, 2 if you have SAPI4, and 3 if you have both SAPI4 and SAPI5.

Thanks. And the results are really useful.

I've modified my gui voice enumeration code to take advantage of it.

Code: Select all

; Voice_Enumeration by TerryHough  26 Jul, 2005
; modified to gui from example by Esteban1
; see PB Forum topic http://forums.purebasic.com/english/viewtopic.php?t=16034

LoadFont(1, "Arial", 12, #PB_Font_Bold )

If OpenWindow(0, 0, 0, 480, 440, #PB_Window_SystemMenu|#PB_Window_ScreenCentered, "Text to Speech - Installed Voices")
  CreateGadgetList(WindowID())
  SetGadgetFont(#PB_Default,1)
  TextGadget(2, 10, 10, 460, 20, "",#PB_Text_Center)
  TextGadget(3, 10, 32, 460, 20, "Available Voices", #PB_Text_Center|#PB_Text_Border)
  SetGadgetFont(#PB_Default,#PB_Default)
  ListIconGadget(1, 10, 50, 460, 380, "Index", 40, #PB_ListIcon_GridLines|#PB_ListIcon_FullRowSelect)
  AddGadgetColumn(1,1, "Voice Name", 140)
  AddGadgetColumn(1,2, "Version", 60)
  AddGadgetColumn(1,3, "Manufacturer", 200)
  
  engine =  TTSInit(0,0,0) ; First we initialize the TTS support.
  Select engine
  Case 1
    SetGadgetText(2,"SAPI Vs 5 is installed.")
  Case 2
    SetGadgetText(2,"SAPI Vs 4 is installed.")
  Case 3
    SetGadgetText(2,"SAPI Vs 4 and Vs 5 are installed.")
  EndSelect
  If engine
    voices.l = TTSEngCount() ; Get the highest voice index.
    If voices > -1 ; If any voices are installed...
      ; Remember that the voice index starts from 0.
      For i=0 To voices
        name$         = Trim(TTSEngName(i)) ; Retrieve voice name.
        manufacturer$ = Trim(TTSEngMfg(i))  ; Retrieve voice manufacturer info.
        version.l     = TTSSAPIVer(i)       ; Retrieve voice's SAPI version.
        AddGadgetItem(1,-1, Str(i) + Chr(10) + name$ + Chr(10) + "SAPI" + Str(version) + Chr(10) + manufacturer$)
      Next
    EndIf
    TTSEnd() ; Finally we close the TTS support, even if no voices were found.
    Repeat
      Select WaitWindowEvent()
      Case #PB_Event_CloseWindow
        End
      Case #PB_Event_Gadget
        Select EventGadgetID()
        Case 1 ; ListIconGadget
        EndSelect
      EndSelect
    ForEver
  Else
    MessageRequester("Text to Speech","Unable to initialize the Text to Speech system.",#MB_ICONERROR)
  EndIf
EndIf
End

Gansta93 · Post by **Gansta93** » Wed Jul 27, 2005 3:40 pm

Hello,

@fweil: the voices you've found are for SAPI 4. But when I install SAPI 4 and then install your voices, I get a big error when calling the TTSInit() command.

TerryHough · Post by **TerryHough** » Wed Jul 27, 2005 10:12 pm

@Esteban1

Maybe I just don't understand the events.

But it appears to me that the PositionEvent() doesn't fire upon completion
of the speaking. I never get 100% reported. Always missing the last
word (which is spoken successfully).

Could that have been overlooked?

It is rather trivial, but would make accurate progress reporting better.

Thanks,
Terry

Esteban1 · Post by **Esteban1** » Thu Jul 28, 2005 12:56 am

To TerryHough:

The PositionEvent is fired at the START of every spoken word and receives (through its parameter) the exact character position of the START of that word in the whole text. For example, if you pass "The quick brown fox jumps over the lazy dog" to TTSSpeak(), the PositionEvent will fire at the start of "The" with position 0, then at the start of "quick" with 4, then at the start of "brown" with 10, and so on. When the speech reaches "dog" (the last word), the reported position will be 40, but the whole text is 43 characters long.

The PositionEvent function will not fire at the end of the speech, but the EndedEvent will, so you can modify it to put 100% in the status bar. Alternatively, instead of retrieving the length of the whole text with Len() and using it to calculate the position percentage, you can use the total length minus the length of the last word. Remember that this second option will report 100% completion when the last word starts to be spoken.

Just replace EndedEvent() with this:

Code: Select all

Procedure EndedEvent() ; And this is the OnAudioStop event function. 
  StatusBarText(0,0," Finished") 
  StatusBarText(0,1," Position = 100%") 
EndProcedure

Or if you want to "clean" the StatusBar field after a second:

Code: Select all

Procedure EndedEvent() ; And this is the OnAudioStop event function.
  StatusBarText(0,0," Finished")
  StatusBarText(0,1," Position = 100%")
  Delay(1000)
  StatusBarText(0,1,"")
EndProcedure
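
The other option described above (scaling by the start of the last word instead of the full text length) might look roughly like the sketch below. LastWordStart() and PositionPercent() are hypothetical helpers, not part of the library, and the sketch assumes words are separated by single spaces with no trailing space.

```purebasic
; Hypothetical helper (not part of the library): returns the 0-based start
; offset of the last word. Assumes single spaces and no trailing space.
Procedure.l LastWordStart(Text$)
  Protected i.l
  For i = Len(Text$) To 1 Step -1
    If Mid(Text$, i, 1) = " "
      ; Mid() is 1-based, so the 1-based index of the last space equals
      ; the 0-based offset of the word that follows it.
      ProcedureReturn i
    EndIf
  Next
  ProcedureReturn 0 ; Single word or empty text: the last word starts at 0.
EndProcedure

; Hypothetical helper: progress percentage relative to the last word start.
Procedure.l PositionPercent(Charac.l, LastStart.l)
  If LastStart <= 0
    ProcedureReturn 100 ; Nothing to scale against, avoid dividing by zero.
  EndIf
  ProcedureReturn Charac * 100 / LastStart
EndProcedure

Text$ = "The quick brown fox jumps over the lazy dog"
LastStart.l = LastWordStart(Text$)
Debug LastStart                       ; 40, the offset of "dog"
Debug PositionPercent(0, LastStart)   ; 0 when "The" starts
Debug PositionPercent(40, LastStart)  ; 100 when "dog" starts
```

In the player example, PositionEvent() would then call PositionPercent(Charac, LastStart) in place of Charac*100/TxtLen, with LastStart stored in a Global next to TxtLen before TTSSpeak() is called.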

Thank you very much for pointing out this issue; I think I didn't explain it correctly in the help file.

TerryHough · Post by **TerryHough** » Thu Jul 28, 2005 3:40 am

Thanks, that is exactly how I had handled it.

I just wanted to make sure it wasn't an oversight in the PositionEvent
handling. Not the first time I have seen Microsoft handle positioning in
that manner.

I appreciate your response and your sharing the library.

Terry