PureSoX (audio synthesis, editing and augmentation)
Posted: Sun May 02, 2021 10:52 am
SoX is a command-line utility for augmenting and synthesising audio files. This is a library to simplify using it from PB.
PureSoX.pbi
Envelope.pbi (required)
WavFiles.pbi (required)
It requires the SoX exe and its add-ons. These are completely portable and could be included with your app.
Demo:
Code: Select all
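XIncludeFile "PureSoX.pbi"
; first, specify the location of the SoX utilities and a temp files folder
InitSoX("C:\Program Files (x86)\sox-14-4-2\","G:\my_sox_temp_folder\")
; generate Morse code for "PB" as the starting stage
SoX_Morse("PB")
SoX_Overdrive(10,15)
; keep the ID of this stage so that it can be mixed back in later
bare.s = SoX_LowPassFilter(1000)
SoX_Reverb(100,10,100,100,0.5,-6,#True)
SoX_ChorusSimple(25,0.5,2.25,2,#SoxWave_Triangle)
SoX_Flanger(3,2,-50,Random(100,50),0.5,#SoXWave_Sine,Random(100,5))
SoX_LowPassFilter(2500)
SoX_Normalize()
; mix the processed sound with the earlier "bare" stage
SoX_MixWith(0.5,bare,0.35)
; note: "folder" is not set in this snippet; it must already hold your output folder path
morsefn.s = folder+"morse_concat.wav"
SoX_Export(morsefn)
SoX_ClearCache()
; open the exported file
RunProgram(morsefn)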
Every PureSoX procedure call uses the output of the previous call as its basis, so we constantly move forward in "stages". This makes it very easy to apply effects in sequence:
Code: Select all
SoX_Import("E:\my_input_sound.wav")
SoX_ChorusSimple(5,0.5,1,3,#SoxWave_Triangle)
SoX_Normalize(-3)
SoX_Export("E:\my_output_sound.wav")
To explain what is happening:
Code: Select all
; conform the input file --> TempFile#1
; apply chorus effect to TempFile#1 --> TempFile#2
; normalize TempFile#2 to -3dB --> TempFile#3
; copy TempFile#3 to the specified location
Another demo - generate a sweeping sine tone:
Code: Select all
SoX_Synthesize(#SoxWave_Sine,4,200,800)
SoX_FadeOut(10)
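SoX_Export("E:\sine-sweep.wav")
I am aware that the library does not combine effects into single SoX calls, and that doing so would reduce the need for temp files and even cut some latency. The goal was not to have the most efficient code, but to make programming with SoX as simple as possible.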
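For comparison, here is roughly what combining effects into a single SoX call could look like when driving the SoX executable directly from PB, without PureSoX. This is only a sketch: the helper name SoxChainParams is made up for illustration, the executable name sox.exe and the paths are assumptions based on the demos in this post, and the chorus values come from the SoX manual's own example rather than from whatever SoX_ChorusSimple passes internally.

```purebasic
; Hypothetical helper (not part of PureSoX): builds the parameter string
; for one SoX call that reads infile, applies all effects, writes outfile.
Procedure.s SoxChainParams(infile.s, outfile.s, effects.s)
  ProcedureReturn #DQUOTE$ + infile + #DQUOTE$ + " " + #DQUOTE$ + outfile + #DQUOTE$ + " " + effects
EndProcedure

; Assumed location of the SoX executable; adjust to your installation.
sox_exe.s = "C:\Program Files (x86)\sox-14-4-2\sox.exe"

; chorus: gain-in gain-out delay(ms) decay speed(Hz) depth(ms) -t (triangle)
; norm -3: normalize to -3 dB
; Both effects run in one pass, so no intermediate temp file is written.
params.s = SoxChainParams("E:\my_input_sound.wav", "E:\my_output_sound.wav", "chorus 0.7 0.9 55 0.4 0.25 2 -t norm -3")
Debug params

; Only try to run SoX if the executable is really there.
If FileSize(sox_exe) > 0
  RunProgram(sox_exe, params, "", #PB_Program_Wait | #PB_Program_Hide)
EndIf
```

The trade-off is that a single call is faster and writes no temp files, but it leaves no intermediate stage to revert to or mix with later.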
But also, working this way enables quite complex operations. Each stage produces a temp file with a unique ID, which is returned by each procedure, so that we can use that version of the audio again later on. For example:
Code: Select all
original.s = SoX_Import("E:\my_input_sound.wav")
SoX_Reverb(100,25,100,100,0,0,#True)
chorus.s = SoX_ChorusSimple(5,0.5,1,3,#SoxWave_Sine)
SoX_RevertTo(original)
SoX_FadeOut(3500)
SoX_MixWith(0.5,chorus,0.5)
SoX_Export("E:\my_output_sound.wav")
Explanation:
Code: Select all
; conform the input audio --> TempFile#1
; apply reverb effect to TempFile#1, outputting ONLY the reverb --> TempFile#2
; apply chorus effect to TempFile#2 --> TempFile#3
; we're now working from TempFile#1 again
; fade out TempFile#1 over 3500ms --> TempFile#4
; mix TempFile#4 with TempFile#3 --> TempFile#5
; copy TempFile#5 to the specified location
In this way, complex structured effects can be created.
Functions for handling the structure:
SoX_Copy ; creates an identical copy of the current stage (or a previous one, if specified)
SoX_RevertTo ; works from a previous stage
Functions for truncating audio operate exactly like their PB equivalents for text:
SoX_Left
SoX_Mid
SoX_Right
SoX_Trim ; will trim silence from start and end
SoX_LTrim ; will trim silence from start
SoX_RTrim ; will trim silence from end
Functions for extending audio (with silence):
SoX_PadStart
SoX_PadEnd
SoX_Pad ; both at once
Functions for altering volume:
FadeIn
FadeOut
Normalize
VolumeDecimal
VolumeDecibel
VolumeEnvelope
VolumeEnvelopeStereo ; separate, per-channel control
Mute
MuteLeft
MuteRight
Functions for working with channels:
Pan ; static pan position
PanEnvelope ; animated panning
SwapLeftRight
MirrorLeft
MirrorRight
A surprising omission from SoX is the ability to modulate volume over time, except for fade-in at the start of a file and fade-out at the end. This is a glaring omission which makes many obvious audio tasks impossible. Using code from BasicallyPure, I wrote procedures to operate on the wave data directly. This yielded the procedures VolumeEnvelope and VolumeEnvelopeStereo. Using the latter, a third procedure was developed, PanEnvelope, and then Pan and several others. All of these procedures are integrated into the standard effects system.
Envelope point times should be in (not necessarily whole) seconds. The final point's time is taken as the duration of the envelope. There are three types of envelope scaling: native (unchanged), stretch to duration of audio, and looped.
For the volume procedures, the input values should be in the range 0-1.
For Pan and PanEnvelope, they should range from -1 (100% left) to 1 (100% right).
Code: Select all
SoX_Synthesize(#SoxWave_BrownNoise,8)
;SoX_Pan(-0.5)
SoX_PanEnvelope("0~0|1~-0.5|2~-1|6~1|8~0|")
SoX_VolumeEnvelope("0~0|1~0.5|7~1|8~0|")
SoX_Export("E:\brownnoise-panned.wav")
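To make the envelope format concrete, here is a minimal sketch of how a "time~value|" string could be evaluated at an arbitrary time, and how the three scaling types could map a position in the audio onto the envelope. It is not taken from PureSoX or Envelope.pbi: the procedure names, the #Env_ constants and the choice of linear interpolation between points are all assumptions made for illustration.

```purebasic
; Hypothetical constants for the three scaling types (names are assumptions).
Enumeration
  #Env_Native   ; envelope times used unchanged
  #Env_Stretch  ; envelope stretched to the duration of the audio
  #Env_Loop     ; envelope repeated until the audio ends
EndEnumeration

; Value of the envelope at time t (seconds), assuming linear interpolation.
; Each point is "time~value" and is terminated by "|".
Procedure.d EnvelopeValueAt(spec.s, t.d)
  Protected count, i
  Protected point.s
  Protected t0.d, v0.d, t1.d, v1.d

  count = CountString(spec, "|")
  If count = 0
    ProcedureReturn 0.0
  EndIf

  point = StringField(spec, 1, "|")
  t0 = ValD(StringField(point, 1, "~"))
  v0 = ValD(StringField(point, 2, "~"))
  If t <= t0
    ProcedureReturn v0                      ; at or before the first point
  EndIf

  For i = 2 To count
    point = StringField(spec, i, "|")
    t1 = ValD(StringField(point, 1, "~"))
    v1 = ValD(StringField(point, 2, "~"))
    If t <= t1                              ; t lies between t0 and t1
      ProcedureReturn v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    EndIf
    t0 = t1 : v0 = v1
  Next

  ProcedureReturn v0                        ; after the last point
EndProcedure

; Maps a position t in the audio to a time inside the envelope.
Procedure.d EnvelopeTime(t.d, env_secs.d, audio_secs.d, mode)
  If env_secs <= 0 Or audio_secs <= 0
    ProcedureReturn 0.0
  EndIf
  Select mode
    Case #Env_Stretch
      ProcedureReturn t * env_secs / audio_secs
    Case #Env_Loop
      ProcedureReturn Mod(t, env_secs)
  EndSelect
  ProcedureReturn t                         ; #Env_Native
EndProcedure

env.s = "0~0|1~0.5|7~1|8~0|"
Debug EnvelopeValueAt(env, 4.0)             ; halfway between the points at 1 s and 7 s
Debug EnvelopeValueAt(env, EnvelopeTime(8.0, 8.0, 16.0, #Env_Stretch)) ; 8 s into 16 s of audio
```

The duration passed as env_secs would be the time of the final point, as described above (8 seconds for this envelope).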
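The rest of the library consists of ports of SoX's various effects. I will add more of these as and when I need them, but the following are already implemented:
ShiftSpeed
ShiftPitch
ShiftSpeedAndPitch
BendPitch
Stretch
Overdrive
Reverb
Echos
Chorus (simplified version)
Flanger
Phaser
Reverse
Downsample
A note on how to use the pitch-bend effect: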
Code: Select all
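SoX_BendPitch("0~1~200|2~3~-400|4~5~600|6~7~-400|")
This example performs four bends of the pitch, delimited with the | symbol. Each bend is described with start time, end time, and cents change, these three fields being delimited with the ~ symbol. The times are given on one continuous timeline (measured from the start of the audio), whereas SoX takes them in relative form; PureSoX takes care of the conversion. Pitch changes are specified in cents (100ths of a semitone).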
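To illustrate the difference between the two time formats, here is a sketch of converting a bend string with absolute start and end times into relative triples of the kind SoX's bend effect takes (the delay since the previous bend ended, the cents change, and the bend's duration). This is not PureSoX's actual code; the procedure name is hypothetical and the three-decimal formatting is just a choice made here.

```purebasic
; Hypothetical helper, not PureSoX's own code: converts "start~end~cents|"
; bends with absolute times into "delay,cents,duration" triples.
Procedure.s BendSpecToSox(spec.s)
  Protected count, i
  Protected bend.s, cents.s, result.s
  Protected start.d, finish.d, prev_end.d

  count = CountString(spec, "|")            ; each bend ends with "|"
  For i = 1 To count
    bend   = StringField(spec, i, "|")
    start  = ValD(StringField(bend, 1, "~"))
    finish = ValD(StringField(bend, 2, "~"))
    cents  = StringField(bend, 3, "~")
    If result <> ""
      result = result + " "
    EndIf
    ; delay is measured from the end of the previous bend (0 for the first)
    result = result + StrD(start - prev_end, 3) + "," + cents + "," + StrD(finish - start, 3)
    prev_end = finish
  Next
  ProcedureReturn result
EndProcedure

Debug BendSpecToSox("0~1~200|2~3~-400|4~5~600|6~7~-400|")
; prints: 0.000,200,1.000 1.000,-400,1.000 1.000,600,1.000 1.000,-400,1.000
```

Each bend in the example lasts one second and starts one second after the previous one ended, which is why every triple after the first has a delay of 1.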
Some usage notes...
All temp files will be deleted by calling SoX_ClearCache.
SoX_Import converts an input file into the standard operating format: stereo, 16-bit, 48 kHz. Everything is done in that format, but SoX_Export has a flag to export in mono.
Effects procedures work from the output of the previous stage. Synthesis/import procedures do not - these include SoX_Import, SoX_Synthesize, SoX_Silence, SoX_Concatenate, SoX_GenerateSoundtrack and SoX_Morse.
To revert to a previous stage at any time, use SoX_RevertTo.
SoX_Silence is not a port of SoX's own "silence" function. It simply creates a silent wave file of the specified length.
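A short sketch of the difference between synthesis and effects procedures, using only calls that appear in the demos in this post. The argument meanings are inferred from those demos rather than from the library source, so treat them as assumptions: SoX_Synthesize is taken to accept a waveform, a duration in seconds and optional start/end frequencies, and SoX_MixWith is taken to accept the current stage's volume, the ID of another stage and that stage's volume. The output path is a placeholder.

```purebasic
XIncludeFile "PureSoX.pbi"
InitSoX("C:\Program Files (x86)\sox-14-4-2\","G:\my_sox_temp_folder\")

; synthesis: starts a new stage; keep its ID for later
sweep.s = SoX_Synthesize(#SoxWave_Sine,8,200,800)
; synthesis again: starts another new stage, not based on the sweep
SoX_Synthesize(#SoxWave_BrownNoise,8)
; effect: works from the previous stage, so it shapes the noise only
SoX_VolumeEnvelope("0~0|1~0.5|7~1|8~0|")
; bring the untouched sweep back in by its stage ID (assumed argument order)
SoX_MixWith(0.5,sweep,0.5)
SoX_Export("E:\noise-plus-sweep.wav")
SoX_ClearCache()
```

If the assumptions hold, the exported file contains the enveloped noise mixed with the unmodified sweep.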
There are also some miscellaneous functions:
SoX_LoopTempo ; loops the current sound at a given tempo
SoX_CurrentDuration ; duration of the current stage
SoX_StageDuration ; duration of a previous stage
SoX_Morse ; generates Morse code -> new current stage
SoX_WobblePitch
SoX_Concatenate
SoX_GenerateSoundtrack
GetWavFileDuration
GetWavFileSampleRate
GetWavFileBitDepth
GetWavFileChannels
SoX_Concatenate does what you would expect. All files are automatically converted to the standard format.
SoX_GenerateSoundtrack essentially answers my own question from 9 years ago. It simplifies the scheduled mixing of multiple audio files. As above, format conformance is taken care of.
Code: Select all
Dim smix.SoxSoundtrackStructure(3)
smix(1)\fn = folder+"bass drum.wav"
smix(1)\start_secs = 10.25
smix(1)\volume_dec = 0.2
smix(2)\fn = folder+"cymbal.wav"
smix(2)\start_secs = 0
smix(2)\volume_dec = 1
smix(3)\fn = folder+"top hat.wav"
smix(3)\start_secs = 7.3
smix(3)\volume_dec = 0.6
SoX_GenerateSoundtrack(smix())
SoX_Export(folder+"mix.wav")
One final note...
I have also included a ring modulator effect. This isn't native to SoX but is made possible by combining several of its functions. I actually discovered it by accident while researching. I hope that people create more such effects for this library, using the building blocks that it provides. Direct access to the wave data makes all sorts of stuff possible (see the ApplyStereoEnvelope procedure for a clear example of how it's done). That quickly gets into areas of maths that I am not good at, so I won't be attempting it myself, but if other people do, please share your effects here. Some things I would like to see are distortion, bitcrushing, waveshaping, stereo width adjustment, filter sweeping, and perhaps even vocoding.
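As a possible starting point for such effects, here is a minimal sketch of a bitcrusher working on raw 16-bit samples. It is not part of PureSoX and is not how the ring modulator is built. The procedure name is hypothetical, and it assumes the samples are already in a signed 16-bit array; getting them into and out of that array (for example via WavFiles.pbi) is not shown.

```purebasic
; Hypothetical sketch, not part of PureSoX: reduces 16-bit samples to
; "bits" bits of resolution by clearing the low-order bits in place.
Procedure BitCrush(Array samples.w(1), bits)
  Protected i, shift, mask

  If bits < 1  : bits = 1  : EndIf
  If bits > 16 : bits = 16 : EndIf
  shift = 16 - bits
  mask  = ~((1 << shift) - 1)               ; e.g. 8 bits -> low 8 bits cleared

  For i = 0 To ArraySize(samples())
    samples(i) = samples(i) & mask
  Next
EndProcedure

Dim buf.w(3)
buf(0) = 1000 : buf(1) = -1000 : buf(2) = 255 : buf(3) = 32767
BitCrush(buf(), 8)
Debug buf(0)   ; 768
Debug buf(1)   ; -1024
Debug buf(2)   ; 0
Debug buf(3)   ; 32512
```

Lower values of bits give a harsher, more quantized sound; 16 leaves the samples unchanged.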
This library has been a pleasure to work on and I am confident that it will be useful to other people. If anyone finds bugs, please let me know.