Little SAPI4 and SAPI5 UserLibrary to test
It seems to use the TTS layer of the OS, but we are just trying to deal with additional voices from third-party providers.
Esteban's library uses the built-in Windows SAPI speech engine.
If you install more SAPI-compliant voices, they will show up in the list of enumerated voices.
When you install a third-party speech engine, it registers the voices in the Windows Registry.
You can enumerate all the installed voices using the TTSEngCount() and TTSEngName() functions in the library.
For example, I have lots of voices, including the standard Microsoft voices,
the AT&T Natural Voices Mike and Crystal US English voices, the AT&T Audrey and Charles UK English voices,
and the NeoSpeech Paul and Kate US English voices. All of these show up in the list of voices in the sample application.
The list is the same as the one in the Windows Control Panel Speech applet, at least in Windows XP.
Eric
Gangsta93 and fweil,
I believe that's because fweil's voices are not SAPI 4 or 5 compliant. I think they may be from Windows 98 or ME, but I'm not sure.
If so, they came before SAPI 4/5 and probably won't work.
However, all of the AT&T Natural Voices are SAPI compliant, and you can get the Alain and Juliette 16 kHz French voices for $35 each.
They can be purchased from the NextUp.com web site link given earlier.
If these are the first AT&T Natural Voices you are using, you must also buy the voice engine, which costs an additional $25 for the 16 kHz version.
Eric
Thank you all for testing the library!
To fweil: Are you sure you have the SAPI4 runtime?
You can check under the "Voice" item in your Control Panel: if the SAPI4 runtime is installed on your computer, it lists the installed SAPI4-compliant voices.
If you don't have it, you can download the runtime at:
http://www.mbsoft.biz/download/spchapi.exe
On the same site you can find some awesome free voices from IBM:
http://www.mbstudio.biz/mbsoft_005.htm
You can also do a little search for "Digalo 2000"... you will know.
I'm working on the wav output, maybe tonight...
And thank you very much for the corrected events example, nice work!
To GeoTrail: the library functions just use Microsoft's SAPI environment.
To Gansta93: PureLibraries used: String, StringExtension, Requester, Memory, Library and SimpleList. DLLs used: KERNEL32, OLEAUT32 and OLE32.
The library is updated. Now you can send speech output to a .wav file.
The TTSSpeak function has an optional parameter for the name of a wav file that will receive the speech instead of the audio device. For SAPI 4 voices, the events work the same way as when the output goes to the sound card, but for SAPI 5 the Word Position event is not triggered. That is not my fault: SAPI 5 does not fire ANY event at all while recording to a file, and I had to use some tricks to get the Audio Start and Audio Stop events working.
Please test it to find bugs (I’m just a beginner).
Hello,
Thanks for this update.
About your answer: Thanks. But there are probably other DLLs to install with my application. If I want to install my application on Windows 98, which files must I include?
About fweil's voices: it is the same thing on Windows 98; I am on Windows 98.
About the sources: can you answer me, please?
-
- Enthusiast
- Posts: 781
- Joined: Fri Apr 25, 2003 6:51 pm
- Location: NC, USA
- Contact:
Thanks for the TTS library
Here is a gui version of the voice enumeration I did for my use. Feel
free to use it.
Code: Select all
Code removed here... see subsequent post for updated version
I would like to suggest that the TTSInit() command return a positive
value on success and a 0 value on failure (not just a message)
Then
Code: Select all
If TTSInit(0,0,0)
  ; do all the speech stuff
Else
  ; notify that the init failed.
EndIf
which makes the code safer.
Terry
Last edited by TerryHough on Wed Jul 27, 2005 2:41 pm, edited 1 time in total.
To Gansta93: I really don't know how many DLLs must be installed on your system. SAPI works through COM, so the components not only have to be present but also registered in the system. You can use oleview32 from Microsoft to explore the components, interfaces, CLSIDs and IIDs of every control on your PC. The simplest way to get everything you need is to use the installation packages supplied for SAPI by Microsoft (just use the links I pointed to before). About the sources: I will not release them for now, but you can ask me whatever you want to know, anytime.
To TerryHough: You're right, I forgot to implement the return values of TTSInit(); I've just corrected my mistake. Now TTSInit() returns 0 on failure, 1 if you have SAPI5 installed, 2 if you have SAPI4, and 3 if you have both SAPI4 and SAPI5. You (and everyone interested) can download the upgraded library at the same site:
http://geocities.com/esteban1uy/My_drive.html
Here is an example using the speak and speak to file functions:
Code: Select all
Global RecFlag.b
Global TxtLen.l

Procedure PositionEvent(Charac.l) ; This is the OnPosition event function.
  StatusBarText(0,1," Position = "+Str(Charac*100/TxtLen)+"%")
EndProcedure

Procedure StartedEvent() ; This is the OnAudioStart event function.
  StatusBarText(0,2,"")
  If RecFlag=0
    StatusBarText(0,0," Speaking")
  Else
    StatusBarText(0,0," Recording")
  EndIf
EndProcedure

Procedure EndedEvent() ; And this is the OnAudioStop event function.
  StatusBarText(0,0," Finished")
  StatusBarText(0,1,"")
EndProcedure

OpenWindow(0, 357, 98, 375, 65, #PB_Window_TitleBar | #PB_Window_ScreenCentered | #PB_Window_SystemMenu, "PureTTS Clipboard Player")
CreateStatusBar(0, WindowID())
AddStatusBarField(100)
AddStatusBarField(100)
AddStatusBarField(300)
StatusBarText(0,2," Initializing SAPI...")
TTSInit(@StartedEvent(),@EndedEvent(),@PositionEvent()) ; Here we initialize the SAPI environment.
CreateGadgetList(WindowID())
ButtonGadget(0, 5, 5, 60, 30, "Play",#PB_Button_Default)
ButtonGadget(1, 70, 5, 60, 30, "Record")
ButtonGadget(2, 135, 5, 60, 30, "Stop")
ButtonGadget(3, 200, 5, 60, 30, "Pause", #PB_Button_Toggle)
ListViewGadget(4,270,5,100,30)
StatusBarText(0,2," Enumerating voices...")
For i=0 To TTSEngCount() ; Retrieve the voice names and store them.
  AddGadgetItem (4,-1,TTSEngName(i))
Next
SetGadgetState (4,0)
StatusBarText(0,2," Just copy some text and click Play")

Repeat
  EventID = WaitWindowEvent()
  Select EventID
    Case #PB_Event_Gadget
      Select EventGadgetID()
        Case 0
          txt$=GetClipboardText()
          TxtLen=Len(txt$)
          If TxtLen>0
            RecFlag=0 ; We are not recording.
            TTSSpeak(txt$) ; Speak the clipboard text.
          EndIf
        Case 1
          txt$=GetClipboardText()
          TxtLen=Len(txt$)
          If TxtLen>0
            WavFile.s = SaveFileRequester("Save speech as wav",".wav", "Audio files|*.wav|All files|*.*", 0)
            If WavFile <> ""
              RecFlag=1 ; We are recording.
              TTSSpeak(txt$,WavFile) ; Send the speech to the designated file.
            EndIf
          EndIf
        Case 2
          TTSStop() ; Just stop speaking or recording.
        Case 3
          If GetGadgetState(3)=1
            TTSPause() ; Self explained.
          Else
            TTSResume() ; This too.
          EndIf
        Case 4
          i=GetGadgetState(4)
          TTSSelect(i) ; Select a voice by its index.
      EndSelect
  EndSelect
Until EventID = #PB_Event_CloseWindow
TTSEnd() ; Free SAPI resources.
End
Esteban1
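The return values listed above (0, 1, 2 and 3) look like two bit flags added together. That reading is only an assumption, since the post just gives the four numbers. Under that assumption, a caller could test each SAPI version separately. In this rough sketch the constant names and the DescribeTTSInit() helper are hypothetical and not part of the library; it does not call the library at all, so it runs on its own:

```purebasic
; Hypothetical names: the library only documents the numbers 0 to 3.
#SAPI5_Flag = 1
#SAPI4_Flag = 2

Procedure.s DescribeTTSInit(result.l)
  ; Turn a TTSInit() result (as described in the post) into a readable message.
  If result = 0
    ProcedureReturn "TTSInit failed"
  EndIf
  msg$ = ""
  If (result & #SAPI5_Flag) <> 0
    msg$ = msg$ + "SAPI5 "
  EndIf
  If (result & #SAPI4_Flag) <> 0
    msg$ = msg$ + "SAPI4 "
  EndIf
  ProcedureReturn Trim(msg$) + " available"
EndProcedure

Debug DescribeTTSInit(0)   ; failure case
Debug DescribeTTSInit(3)   ; both versions reported
```

In a real program you would pass the value returned by TTSInit() instead of a literal number.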
Esteban1 wrote: To TerryHough: I forgot to implement the return values of TTSInit(), I've just corrected my mistake. Now TTSInit() returns 0 on failure, 1 if you have SAPI5 installed, 2 if you have SAPI4, and 3 if you have both SAPI4 and SAPI5.
Thanks. And the results are really useful.
I've modified my gui voice enumeration code to take advantage of it.
Code: Select all
; Voice_Enumeration by TerryHough 26 Jul, 2005
; modified to gui from example by Esteban1
; see PB Forum topic http://forums.purebasic.com/english/viewtopic.php?t=16034
LoadFont(1, "Arial", 12, #PB_Font_Bold )
If OpenWindow(0, 0, 0, 480, 440, #PB_Window_SystemMenu|#PB_Window_ScreenCentered, "Text to Speech - Installed Voices")
  CreateGadgetList(WindowID())
  SetGadgetFont(#PB_Default,1)
  TextGadget(2, 10, 10, 460, 20, "",#PB_Text_Center)
  TextGadget(3, 10, 32, 460, 20, "Available Voices", #PB_Text_Center|#PB_Text_Border)
  SetGadgetFont(#PB_Default,#PB_Default)
  ListIconGadget(1, 10, 50, 460, 380, "Index", 40, #PB_ListIcon_GridLines|#PB_ListIcon_FullRowSelect)
  AddGadgetColumn(1,1, "Voice Name", 140)
  AddGadgetColumn(1,2, "Version", 60)
  AddGadgetColumn(1,3, "Manufacturer", 200)
  engine = TTSInit(0,0,0) ; First we initialize the TTS support.
  Select engine
    Case 1
      SetGadgetText(2,"SAPI Vs 5 is installed.")
    Case 2
      SetGadgetText(2,"SAPI Vs 4 is installed.")
    Case 3
      SetGadgetText(2,"SAPI Vs 4 and Vs 5 are installed.")
  EndSelect
  If engine
    voices.l = TTSEngCount() ; Get the highest voice index.
    If voices > -1 ; If any voice is installed...
      ; Remember that the voice index starts from 0
      For i=0 To voices
        name$ = Trim(TTSEngName(i)) ; Retrieve voice name.
        manufacturer$ = Trim(TTSEngMfg(i)) ; Retrieve voice manufacturer info
        version.l = TTSSAPIVer(i) ; Retrieve voice's SAPI version
        AddGadgetItem(1,-1, Str(i) + Chr(10) + name$ + Chr(10) + "SAPI" + Str(version) + Chr(10) + manufacturer$)
      Next
    EndIf
    TTSEnd() ; Finally we close the TTS support, even if no voice was found.
    Repeat
      Select WaitWindowEvent()
        Case #PB_EventCloseWindow
          End
        Case #PB_EventGadget
          Select EventGadgetID()
            Case 1 ; ListIconGadget
          EndSelect
      EndSelect
    ForEver
  Else
    MessageRequester("Text to Speech","Unable to initialize the Text to Speech system.",#MB_ICONERROR)
  EndIf
EndIf
End
@Esteban1
Maybe I just don't understand the events.
But it appears to me that the PositionEvent() doesn't fire upon completion
of the speaking. I never get 100% reported. Always missing the last
word (which is spoken successfully).
Could that have been overlooked?
It is rather trivial, but would make accurate progress reporting better.
Thanks,
Terry
To TerryHough:
The PositionEvent is fired at the START of every spoken word and receives (through its parameter) the exact character position of the START of that word in the whole text. For example, if you pass "The quick brown fox jumps over the lazy dog" to TTSSpeak(), the PositionEvent will fire at the start of "The" with position 0, then at the start of "quick" with 4, then at the start of "brown" with 10, and so on. When the speech reaches "dog" (the last word), the reported position will be 40, but the whole text is 43 characters long. The PositionEvent function will not fire at the end of the speech, but the EndedEvent will, so you can modify it to put 100% in the status bar. Alternatively, instead of taking the length of the whole text with Len() to calculate the position percentage, you can use the total length minus the length of the last word. Remember that this second option will report 100% completion when the last word STARTS to be spoken.
Just change the EndedEvent() for this:
Code: Select all
Procedure EndedEvent() ; And this is the OnAudioStop event function.
  StatusBarText(0,0," Finished")
  StatusBarText(0,1," Position = 100%")
EndProcedure
Or if you want to "clean" the StatusBar field after a second:
Code: Select all
Procedure EndedEvent() ; And this is the OnAudioStop event function.
  StatusBarText(0,0," Finished")
  StatusBarText(0,1," Position = 100%")
  Delay(1000)
  StatusBarText(0,1,"")
EndProcedure
Thank you very much for pointing out this issue; I think I didn't explain it correctly in the help file.
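To make the arithmetic concrete, here is a small sketch that works only on the example sentence and does not call the TTS library. LastWordStart() is a hypothetical helper, not a library function, and it assumes words are separated by single spaces with no trailing space. It shows why the last reported percentage stays below 100 when dividing by Len(), and what the alternative denominator gives:

```purebasic
; Sketch only: plain string arithmetic, no TTS library calls.
; LastWordStart() is a hypothetical helper, not part of the library.

Procedure.l LastWordStart(text$)
  ; 0-based offset of the last word, assuming words are separated by
  ; single spaces and the text has no trailing space.
  wordStart.l = 0
  found.l = FindString(text$, " ", 1)
  While found > 0
    wordStart = found   ; 1-based index of a space = 0-based index of the next word
    found = FindString(text$, " ", found + 1)
  Wend
  ProcedureReturn wordStart
EndProcedure

txt$ = "The quick brown fox jumps over the lazy dog"
TxtLen.l = Len(txt$)              ; 43 characters
lastPos.l = LastWordStart(txt$)   ; 40, the start of "dog"

; Percentage at the last PositionEvent when dividing by Len():
Debug Str(lastPos * 100 / TxtLen) + "%"          ; 93%

; Dividing by (total length - length of the last word) instead:
denominator.l = TxtLen - Len(Mid(txt$, lastPos + 1, TxtLen))
If denominator > 0   ; a one-word text would give 0 here
  Debug Str(lastPos * 100 / denominator) + "%"   ; 100%
EndIf
```

The guard on the denominator matters for a text with a single word, where the last word starts at position 0.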
Last edited by Esteban1 on Thu Jul 28, 2005 6:09 am, edited 2 times in total.