NicTheQuick wrote: With macros and the avoidance of modulo you actually can make it pretty fast. And that is what a C compiler usually does automatically for you:

It surprises me that the threadsafe option makes it faster. Here are my results with this code:
PB 5.73 x64 without debugger:
threadsafe disabled:
Passes: 1585, Time: 5001 ms, Avg: 3.155 ms, Limit: 1000000, Count: 78498, Valid: 1
threadsafe enabled:
Passes: 2176, Time: 5001 ms, Avg: 2.298 ms, Limit: 1000000, Count: 78498, Valid: 1
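I also tested it with MSVC in release mode (/O2) on the same PC. Note that these times are in seconds, so the averages are about 1.02 ms and 0.99 ms per pass: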
x86:
Passes: 4898, Time: 5.000000, Avg: 0.001021, Limit: 1000000, Count: 78498, Valid: 1
x64:
Passes: 5049, Time: 5.000000, Avg: 0.000990, Limit: 1000000, Count: 78498, Valid: 1
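
For readers without the original code at hand, here is a minimal C sketch of the kind of sieve these numbers describe. It is not the code that was benchmarked above. The function name `count_primes` and the odd-only layout are my own assumptions. The point is only to show what "avoidance of modulo" looks like: composites are crossed off by stepping through the array with additions, so the inner loop has no `%` or division.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not from the thread): counts the primes <= limit
   with an odd-only sieve of Eratosthenes. Returns -1 if allocation fails. */
long count_primes(uint32_t limit)
{
    if (limit < 2)
        return 0;

    /* composite[i] stands for the odd number 2*i + 1; even numbers are
       never stored, so no "is it even?" test is needed anywhere. */
    size_t n = ((size_t)limit + 1) / 2;
    unsigned char *composite = calloc(n, 1);
    if (composite == NULL)
        return -1;

    long count = 1;                      /* the prime 2 */
    for (size_t i = 1; i < n; i++) {
        if (composite[i])
            continue;
        count++;

        size_t p = 2 * i + 1;
        /* Only primes with p*p <= limit have anything left to cross off.
           This one division runs per prime, not per array element. */
        if (p <= limit / p) {
            /* Start at p*p, whose index is (p*p - 1) / 2 = 2*i*(i + 1).
               Adding p to the index adds 2*p to the value, which skips
               the even multiples. */
            for (size_t j = 2 * i * (i + 1); j < n; j += p)
                composite[j] = 1;
        }
    }

    free(composite);
    return count;
}
```

Calling `count_primes(1000000)` returns 78498, which matches the Count field in the results above.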