This code is a bit of a nothingburger: it's not as fast as I had hoped, but it's still roughly 1.5x to 3x faster than the modulo (%) operator in my tests. When the divisor is a power of 2, you can replace the modulo with an & operation and a mask. Modulo is said to be a slow operation, but it's not as slow as I imagined (at least in these tests). If you do millions of these specific modulo operations, this might be useful, though it only works with 2^n divisors, using mask = (2^n) - 1.
In this "demo" there is one procedure, two macros, and a version using PB's % operator. All of them do the same thing: calculate the remainder and, if it is non-zero, calculate the "padding" that would have to be added to reach the next multiple (divisor - remainder).
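To make the arithmetic concrete, here is a minimal sketch with one sample value. IsPowerOfTwo() is a hypothetical helper that is not part of the demo; it only guards the 2^n requirement. The equivalence with % assumes non-negative values.

```purebasic
; Sketch: check that the mask trick matches % for a power-of-2 divisor.
; IsPowerOfTwo() is a hypothetical helper, not part of the original demo.
Procedure.i IsPowerOfTwo(n.i)
  ; A power of 2 has exactly one bit set, so n & (n-1) clears it to 0.
  If n > 0 And (n & (n - 1)) = 0
    ProcedureReturn #True
  EndIf
  ProcedureReturn #False
EndProcedure

Define value.i = 5000
Define divisor.i = 4096
Define rest.i

If IsPowerOfTwo(divisor)
  rest = value & (divisor - 1)   ; 5000 & 4095 = 904
  Debug rest                     ; 904
  Debug value % divisor          ; 904, same result
  If rest
    Debug divisor - rest         ; padding: 4096 - 904 = 3192
  EndIf
EndIf
```

With the padding added, 5000 + 3192 = 8192, which is the next multiple of 4096.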
Code:
; BinModOptimized.pb
; ------------------
;
; Optimized modulo operations for divisors that are ALWAYS a power of 2 (2, 4, 8 ... 1024 ... etc.)
;
;
;
; Procedure version: returns value modulo divi (divi must be a power of 2)
Procedure BinMod(value, divi)
  ;   Debug "value: "+value
  ;   Debug value % divi
  ;   Debug value & (divi-1)
  ;   If value & (divi-1)
  ;     Debug "Padding:" + Str(divi - ((value & (divi-1))))
  ;   Else
  ;     Debug "no padding necessary"
  ;   EndIf
  ;   Debug "-----------------"
  ProcedureReturn value & (divi-1)
EndProcedure

; Macro version: computes the remainder and the padding, deriving the mask from the divisor
Macro M_binMod(_result_, _value_, _divisor_, _paddingresult_)
  _result_ = _value_ & (_divisor_ - 1)
  If _result_
    _paddingresult_ = _divisor_ - _result_
  Else
    _paddingresult_ = 0
  EndIf
EndMacro

; Macro version 2: same as above, but the mask is passed in precomputed
Macro M_binMod2(_result_, _value_, _divisor_ , _divisormask_, _paddingresult_)
  _result_ = _value_ & _divisormask_
  If _result_
    _paddingresult_ = (_divisor_) - _result_
  Else
    _paddingresult_ = 0
  EndIf
EndMacro

#MAX_LOOPS = 1000 * 1000 * 100 ; 100 M
#MAX_RANDOM = 1234567890
#DIVISOR = 4096
#DIVISOR_MASK = #DIVISOR - 1

Define _rest, _padding

; Test 1: BinMod() procedure
Define s1,e1
s1 = ElapsedMilliseconds()
For i=1 To #MAX_LOOPS
  _rest = BinMod(Random(#MAX_RANDOM),#DIVISOR)
  If _rest
    _padding = #DIVISOR - _rest
  Else
    _padding = 0
  EndIf
Next
e1 = ElapsedMilliseconds()

; Test 2: M_binMod() macro
Define s2,e2
s2 = ElapsedMilliseconds()
For i=1 To #MAX_LOOPS
  M_binMod(_rest, Random(#MAX_RANDOM), #DIVISOR, _padding)
Next
e2 = ElapsedMilliseconds()

; Test 3: PureBasic % operator (divisor held in a variable)
Define s3,e3,_div_var
_div_var = #DIVISOR
s3 = ElapsedMilliseconds()
For i=1 To #MAX_LOOPS
  _rest = Random(#MAX_RANDOM) % _div_var
  If _rest
    _padding = #DIVISOR - _rest
  Else
    _padding = 0
  EndIf
Next
e3 = ElapsedMilliseconds()

; Test 4: M_binMod2() macro with precomputed mask
Define s4,e4
s4 = ElapsedMilliseconds()
For i=1 To #MAX_LOOPS
  M_binMod2(_rest, Random(#MAX_RANDOM), #DIVISOR, #DIVISOR_MASK, _padding)
Next
e4 = ElapsedMilliseconds()

msg$="Test results"+#CRLF$
msg$+"Number of loops: "+FormatNumber(#MAX_LOOPS,0)+#CRLF$
msg$+"BinMod() function exec time ms: "+FormatNumber(e1-s1,0)+#CRLF$
msg$+ "M_binMod() macro exec time ms: "+FormatNumber(e2-s2,0)+#CRLF$
msg$+ "M_binMod2() macro exec time ms: "+FormatNumber(e4-s4,0)+#CRLF$
msg$+ "PureBasic % operator exec time ms: "+FormatNumber(e3-s3,0)+#CRLF$
msg$+" "+#CRLF$
msg$+"PureBasic "+FormatNumber(#PB_Compiler_Version/100,2)+" "
CompilerSelect #PB_Compiler_OS
  CompilerCase #PB_OS_Windows
    msg$+"Windows"
  CompilerCase #PB_OS_Linux
    msg$+"Linux"
  CompilerCase #PB_OS_AmigaOS
    msg$+"AmigaOS"
  CompilerCase #PB_OS_MacOS
    msg$+"MacOS"
  CompilerCase #PB_OS_Web
    msg$+"Web"
  CompilerDefault
    msg$+"UNKNOWN-OPERATING-SYSTEM"
CompilerEndSelect
msg$+" "
CompilerSelect #PB_Compiler_Processor
  CompilerCase #PB_Processor_Arm32
    msg$+"arm-32"
  CompilerCase #PB_Processor_Arm64
    msg$+"arm-64"
  CompilerCase #PB_Processor_JavaScript
    msg$+"JavaScript"
  CompilerCase #PB_Processor_mc68000
    msg$+"mc68000"
  CompilerCase #PB_Processor_PowerPC
    msg$+"PowerPC"
  CompilerCase #PB_Processor_x64
    msg$+"x64"
  CompilerCase #PB_Processor_x86
    msg$+"x86"
  CompilerDefault
    msg$+"UNKNOWN-PROCESSOR"
CompilerEndSelect
CompilerIf #PB_Backend_Asm = #PB_Compiler_Backend
  msg$+" (ASM Backend"
CompilerElse
  msg$+" (C Backend"
CompilerEndIf
CompilerIf #PB_Compiler_Optimizer
  msg$+" with optimizer)"
CompilerElse
  msg$+")"
CompilerEndIf
msg$+#CRLF$
SetClipboardText(msg$)
MessageRequester("BinMod test results",msg$)
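A possible variation on the padding step, not taken from the demo and not benchmarked here: with a power-of-2 divisor, the If/Else can be folded into a second & with the same mask, because a remainder of 0 gives divisor - 0 = divisor, and divisor & mask is 0. BinPad() is a hypothetical name used only for this sketch.

```purebasic
; Sketch: padding up to the next multiple of a power-of-2 divisor, without If/Else.
; BinPad() is a hypothetical helper, not part of the original demo.
Procedure.i BinPad(value.i, divisorMask.i)
  ; divisorMask must be (2^n)-1, so divisorMask + 1 is the divisor itself.
  ; If the remainder is 0, (divisor - 0) & mask wraps the padding back to 0.
  ProcedureReturn (divisorMask + 1 - (value & divisorMask)) & divisorMask
EndProcedure

Debug BinPad(5000, 4095)   ; 3192 (5000 + 3192 = 8192)
Debug BinPad(8192, 4095)   ; 0, already a multiple of 4096
Debug BinPad(1, 4095)      ; 4095
```

Whether this is any faster than the If/Else form would have to be measured per backend, the same way as the tests above.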
Some of my results. I haven't yet checked how it performs on Linux + Raspberry Pi...
Code:
Test results
Number of loops: 100,000,000
CheckBinMod() function exec time ms: 611
M_binMod() macro exec time ms: 396
M_binMod2() macro exec time ms: 396
PureBasic % operator exec time ms: 1,177
PureBasic 6.04 Windows x64 (ASM Backend)
Code:
Test results
Number of loops: 100,000,000
BinMod() function exec time ms: 340
M_binMod() macro exec time ms: 342
M_binMod2() macro exec time ms: 340
PureBasic % operator exec time ms: 516
PureBasic 6.04 Windows x86 (C Backend with optimizer)
Code:
Test results
Number of loops: 100,000,000
BinMod() function exec time ms: 359
M_binMod() macro exec time ms: 354
M_binMod2() macro exec time ms: 353
PureBasic % operator exec time ms: 1,058
PureBasic 6.10 Windows x64 (C Backend with optimizer)