
Transcribe a voice message

Posted: Wed Oct 09, 2024 3:36 pm
by dige
Hi folks,

the following code uploads an audio file to OpenAI/Whisper and shows how a binary data upload works when form data (multipart/form-data) is required.
The audio file is uploaded and transcribed using the whisper-1 model.
⚠️ An API key is required (edit "YOURAPIKEY").

Have fun! :D

Code:

curl https://api.openai.com/v1/audio/transcriptions -H "Authorization: Bearer YOURAPIKEY" -H "Content-Type: multipart/form-data" -F file="@.\test.mp3" -F model="whisper-1" --verbose

Code:

; https://platform.openai.com/docs/models/whisper
; https://openai.com/api/pricing/
; Price: Whisper $0.006 / minute (rounded to the nearest second)

; Dige 10/2024

EnableExplicit

Define.s Boundary$ = "------------------------" + Str(Random(99999999, 10000000))

Define.i File, HttpRequest, PostLen, FileLen, BoundaryLen
Define Filename$, Post1$, Post2$
Define *Buffer, Offset
Define NewMap Header$()


Filename$ = OpenFileRequester("Choose an audio file", "", "All|*.*", 0)
If Filename$
  
  File = ReadFile(#PB_Any, Filename$)
  If File
    FileLen = Lof(File)
    
    Header$("Authorization") = "Bearer YOURAPIKEY"
    Header$("Content-Type")  = "multipart/form-data; boundary=" + Boundary$
    
    Post1$ = #CRLF$
    Post1$ + "--" + Boundary$ + #CRLF$
    Post1$ + ~"Content-Disposition: form-data; name=\"file\"; filename=\"" + GetFilePart(Filename$) + ~"\"" + #CRLF$
    Post1$ + "Content-Type: application/octet-stream" + #CRLF$ + #CRLF$
    
    Post2$ = #CRLF$
    Post2$ + "--" + Boundary$ + #CRLF$
    Post2$ + ~"Content-Disposition: form-data; name=\"model\"" + #CRLF$ + #CRLF$
    Post2$ + "whisper-1" + #CRLF$
    Post2$ + "--" + Boundary$ + "--" + #CRLF$
    
    PostLen     = StringByteLength(Post1$ + Post2$, #PB_UTF8)
    BoundaryLen = StringByteLength(Boundary$, #PB_UTF8)
    
    *Buffer = AllocateMemory(PostLen + FileLen + 2 + 2 + BoundaryLen + 2 + 2, #PB_Memory_NoClear)
    
    If *Buffer
      PokeS(*Buffer, Post1$, -1, #PB_UTF8|#PB_String_NoZero)
      Offset = StringByteLength(Post1$, #PB_UTF8)
      
      ReadData(File, *Buffer + Offset, FileLen)
      Offset + FileLen
        
      PokeS(*Buffer + Offset, Post2$, -1, #PB_UTF8|#PB_String_NoZero)
      Offset + StringByteLength(Post2$, #PB_UTF8)
      
;       ShowMemoryViewer(*Buffer, MemorySize(*Buffer))
;       CallDebugger
      
      Header$("Content-Length") = Str(Offset)
      
      HttpRequest = HTTPRequestMemory(#PB_HTTP_Post, "https://api.openai.com/v1/audio/transcriptions", *Buffer, Offset, 0, Header$())
        
      If HttpRequest
        Debug "ErrorMessage = " + HTTPInfo(HttpRequest, #PB_HTTP_ErrorMessage)
        Debug "Response = " + HTTPInfo(HttpRequest, #PB_HTTP_Response)
        
        FinishHTTP(HttpRequest)
      EndIf
      
      FreeMemory(*Buffer)
    EndIf
    CloseFile(File)
  EndIf
EndIf
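
To pull the transcript out of the response, something like the following might work. It is only a sketch: it assumes the default JSON response format, in which the transcript should arrive in a "text" member, and it uses a made-up sample string (Response$) in place of the real HTTPInfo(HttpRequest, #PB_HTTP_Response) result.

```purebasic
EnableExplicit

; Hypothetical sample response. In the code above, the real string
; would come from HTTPInfo(HttpRequest, #PB_HTTP_Response).
Define Response$ = ~"{\"text\": \"Hello, this is a test.\"}"
Define Json, *Root, *Member

Json = ParseJSON(#PB_Any, Response$)
If Json
  *Root = JSONValue(Json)
  If JSONType(*Root) = #PB_JSON_Object
    ; Assumption: the transcript is stored in the "text" member
    *Member = GetJSONMember(*Root, "text")
    If *Member And JSONType(*Member) = #PB_JSON_String
      Debug "Transcript = " + GetJSONString(*Member)
    Else
      ; e.g. an error object was returned instead of a transcript
      Debug "No 'text' member found: " + Response$
    EndIf
  Else
    Debug "Unexpected JSON: " + Response$
  EndIf
  FreeJSON(Json)
Else
  Debug "Invalid JSON: " + JSONErrorMessage()
EndIf
```

In the upload code, the parsing would go right before FinishHTTP(HttpRequest), because the response can no longer be read once the request is finished.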


Re: Transcribe a voice message

Posted: Wed Oct 09, 2024 4:13 pm
by Quin
Nice! Works well. :)

Re: Transcribe a voice message

Posted: Wed Oct 09, 2024 4:19 pm
by Zapman
Very interesting! Thanks for sharing :)