Debug mode execution speed can vary on source whitespace.
Debug mode execution speed can vary on source whitespace.
[Win 11, X64, PB v6.0+, C & ASM] - Threadsafe enabled as multi-threaded application.
I'm witnessing an oddity whereby simply adding whitespace in front of a procedure that's called with CreateThread() can vary its execution speed by as much as 60% whilst in debug mode.
I can't post the code i'm working with at the moment, but I'll try to find some similarly behaving example code shortly.
I'm witnessing an oddity whereby simply adding whitespace in front of a procedure that's called with CreateThread() can vary its execution speed by as much as 60% whilst in debug mode.
I can't post the code i'm working with at the moment, but I'll try to find some similarly behaving example code shortly.
Re: Debug mode execution speed can vary on source whitespace.
Example code.
Running with v6.10, C compiler, optimizations enabled, threadsafe enabled & debugger enabled:
~43ms per frame when Render_Thread() is on line 50.
~29ms per frame when Render_Thread() is on line 41.
Running with v6.10, C compiler, optimizations enabled, threadsafe enabled & debugger enabled:
~43ms per frame when Render_Thread() is on line 50.
~29ms per frame when Render_Thread() is on line 41.
Code: Select all
;/ Phil James 08/2023 - Pixel shader on CPU - Optimized.
;/ - Taken from 'An introduction to Shader Art Coding': https://www.youtube.com/watch?v=f4s1h2YETNY
EnableExplicit
#MyWin_Main = 0 : #MyGad_GL = 0
Structure RGBf : R.f : G.f : B.f : A.f : EndStructure
Structure Pointf : X.f : Y.f : EndStructure
Global Dim PaletteLUP.rgbf(255) ;/ lookup table for palette
;#SizeX = 160 : #SizeY = 120 : #DrawScale = 8 ; ~1ms on 5800h
;#SizeX = 320 : #SizeY = 240 : #DrawScale = 4 ; ~2ms
#SizeX = 640 : #SizeY = 480 : #DrawScale = 1 ; ~3ms
;#SizeX = 1280 : #SizeY = 960 : #DrawScale = 1 ; ~14ms
;#SizeX = 1600 : #SizeY = 960 : #DrawScale = 1 ; ~17ms
#SizeXM1 = #SizeX - 1 : #SizeYM1 = #SizeY - 1 : #HalfX = #SizeX / 2 : #HalfY = #SizeY / 2 : #Iterations = 3
Structure Thread_Structure
ThreadID.i : Timef.f : Y_Start.l : Y_End.l
EndStructure
Global Event, ThreadCount = CountCPUs(#PB_System_ProcessCPUs)
Global Exit, OnFrame.f, TextureID0, AccuTime
Global Appname.s = "CPU Pixel shader - Optimized - Phil James 10/2023 ["+Str(#SizeX)+" x "+Str(#SizeY)+"]"
Macro ClampM(v, min, max) : If v < min : v = min : ElseIf v > max : v = max : EndIf : EndMacro
Macro FractM(value) : ((value) - Int(Round(value,#PB_Round_Down))) : EndMacro
Macro MinM(a,ia,mn) : If ia < mn : a = ia : Else : a = mn : EndIf : EndMacro
Macro MaxM(a,ia,mx) : If ia > mx : a = ia : Else : a = mx : EndIf : EndMacro
Procedure Palette_CreateLUP()
Protected MyLoop.i
For MyLoop = 0 To 255
PaletteLUP(MyLoop)\R = 0.5 + 0.5 * Cos(6.28318 * (1.0 * (MyLoop / 255.0) + 0.263))
PaletteLUP(MyLoop)\G = 0.5 + 0.5 * Cos(6.28318 * (1.0 * (MyLoop / 255.0) + 0.416))
PaletteLUP(MyLoop)\B = 0.5 + 0.5 * Cos(6.28318 * (1.0 * (MyLoop / 255.0) + 0.557))
Next
EndProcedure
Global Dim Render_ThreadControl.Thread_Structure(ThreadCount)
Global Dim ImageArray_Main.l((#SizeY / 2)-1, (#SizeX / 2) - 1 )
Procedure Render_Thread(*Thread.Thread_Structure)
Protected X, Y, UV.pointf, Dist.f, Dist0.f, Col.rgbf, FinalColour.rgbf, MyLoop.i, Scale.f = 1.5, Dist1.f, Exp.f, DistL.l
For Y = *Thread\Y_Start To *Thread\y_end
For X = 0 To (#SizeX / 2) - 1
UV\x = (x * 2.0 - #SizeX) / #SizeY : UV\y = (y * 2.0 - #SizeY) / #SizeY
FinalColour\R = 0 : FinalColour\G = 0 : FinalColour\B = 0
Dist0 = Sqr((UV\X*UV\X) + (UV\Y * UV\Y))
Exp = Exp(-Dist0)
For MyLoop = 0 To #Iterations
UV\x = Fractm(UV\x * Scale) - 0.5 : UV\y = Fractm(UV\y * Scale) - 0.5
DistL = Fractm(Dist0 + MyLoop * 0.4 + *Thread\Timef * 0.4) * 255.0
Dist = Pow(0.01 / (Abs(Sin((Sqr((UV\X * UV\X) + (UV\Y * UV\Y)) * Exp) * 8.0 + *Thread\Timef) * 0.125)+0.00001), 1.2)
ClampM(Dist,0,1)
ClampM(DistL,0,255)
FinalColour\R + PaletteLUP(DistL)\R * dist : FinalColour\G + PaletteLUP(DistL)\G * dist : FinalColour\B + PaletteLUP(DistL)\B * dist
Next
ClampM(FinalColour\R,0,1) : ClampM(FinalColour\g,0,1) : ClampM(FinalColour\b,0,1)
ImageArray_Main(y,x) = RGB(FinalColour\r * 255.0,FinalColour\G * 255.0,FinalColour\b * 255.0)
Next
Next
EndProcedure
Procedure Render_Main()
Protected MyLoop, FPS.f, Time.f
Time = ElapsedMilliseconds()
For MyLoop = 1 To ThreadCount ; create the threads
Render_ThreadControl(MyLoop)\Timef = Time / 1000.0
Render_ThreadControl(MyLoop)\ThreadID = CreateThread(@Render_Thread(),@Render_ThreadControl(Myloop))
Next
For Myloop = 1 To ThreadCount : WaitThread(Render_ThreadControl(MyLoop)\ThreadID) : Next ; wait for thread finish.
glClearColor_(1.0,0.0,0.0,1.0) : glClear_(#GL_COLOR_BUFFER_BIT)
glTexSubImage2D_(#GL_TEXTURE_2D, 0, 0, 0, #SizeX/2, #SizeY/2, #GL_RGBA, #GL_UNSIGNED_BYTE, @ImageArray_Main()) ;/ update texture from array
;/ render as quad
glBegin_(#GL_QUADS)
glTexCoord2f_(0,0) : glVertex2f_(0,0) : glTexCoord2f_(1,0) : glVertex2f_(0.5,0) : glTexCoord2f_(1,1) : glVertex2f_(0.5,0.5) : glTexCoord2f_(0,1) : glVertex2f_(0,0.5)
glTexCoord2f_(0,1) : glVertex2f_(0,0.5) : glTexCoord2f_(1,1) : glVertex2f_(0.5,0.5) : glTexCoord2f_(1,0) : glVertex2f_(0.5,1.0) : glTexCoord2f_(0,0) : glVertex2f_(0,1.0)
glTexCoord2f_(1,1) : glVertex2f_(0.5,0.5) : glTexCoord2f_(0,1) : glVertex2f_(1.0,0.5) : glTexCoord2f_(0,0) : glVertex2f_(1.0,1.0) : glTexCoord2f_(1,0) : glVertex2f_(0.5,1.0)
glTexCoord2f_(1,0) : glVertex2f_(0.5,0) : glTexCoord2f_(0,0) : glVertex2f_(1.0,0) : glTexCoord2f_(0,1) : glVertex2f_(1.0,0.5) : glTexCoord2f_(1,1) : glVertex2f_(0.5,0.5)
glEnd_()
SetGadgetAttribute(#MyGad_GL,#PB_OpenGL_FlipBuffers,#True)
AccuTime + (ElapsedMilliseconds() - time) : Onframe + 1
EndProcedure
Procedure Init_Main()
Protected MyLoop.i, StrideY.i = Round(((#SizeY/2)-1) / ThreadCount,#PB_Round_Up), Y.i = 0
OpenWindow(#MyWin_Main,0,0,#SizeX * #DrawScale,#SizeY * #DrawScale,"",#PB_Window_ScreenCentered|#PB_Window_SystemMenu)
OpenGLGadget(#MyGad_GL,0,0,#SizeX * #DrawScale,#SizeY * #DrawScale,#PB_OpenGL_NoFlipSynchronization)
Palette_CreateLUP()
For MyLoop = 1 To ThreadCount
Render_ThreadControl(Myloop)\Y_Start = Y : Y + StrideY
Render_ThreadControl(Myloop)\Y_End = Y - 1
If Render_ThreadControl(Myloop)\Y_End > (#SizeY/2)-1 : Render_ThreadControl(Myloop)\Y_End = (#SizeY/2)-1 : EndIf
Next
glMatrixMode_(#GL_PROJECTION) : glLoadIdentity_() : glOrtho_(0,1,1,0,0,1)
glGenTextures_(1, @TextureID0) : glBindTexture_(#GL_TEXTURE_2D, TextureID0) : glEnable_(#GL_TEXTURE_2D)
glTexParameteri_(#GL_TEXTURE_2D, #GL_TEXTURE_MAG_FILTER, #GL_NEAREST)
glTexParameteri_(#GL_TEXTURE_2D, #GL_TEXTURE_MIN_FILTER, #GL_NEAREST)
glTexImage2D_(#GL_TEXTURE_2D, 0, #GL_RGB, #SizeX/2, #SizeY/2, 0, #GL_RGBA, #GL_UNSIGNED_BYTE, @ImageArray_Main());
AddWindowTimer(0,0,1000)
EndProcedure
Init_Main()
Repeat
Repeat
Event = WindowEvent()
If Event = #PB_Event_CloseWindow : Exit = #True : EndIf
If Event = #PB_Event_Timer : SetWindowTitle(0,Appname + " - "+StrF(AccuTime / Onframe,2)+"ms") : EndIf
Until Event = 0
Render_Main()
Until Exit = #True
glDeleteTextures_(1,@TextureID0)
Last edited by pjay on Mon Apr 15, 2024 7:39 am, edited 1 time in total.
Re: Debug mode execution speed can vary on source whitespace.
Confirmed here: 150ms with no blank lines above the procedure, and 220ms when there is.
When DisableDebugger is used, both are 25ms.
Never seen how the source code itself can affect the runtime of a compiled exe. Although... I have had PureBasic's IDE not open until I quit a running exe that I made with PureBasic (that wasn't running from the IDE), but that's a bug report for another day because I can't prove it. Maybe related.
Never seen how the source code itself can affect the runtime of a compiled exe. Although... I have had PureBasic's IDE not open until I quit a running exe that I made with PureBasic (that wasn't running from the IDE), but that's a bug report for another day because I can't prove it. Maybe related.
Re: Debug mode execution speed can vary on source whitespace.
Compile exe without debugger and report.
The nice thing about standards is there are so many to choose from. ~ Andrew Tanenbaum
Re: Debug mode execution speed can vary on source whitespace.
[Win 10, X64, PB v6.10, C] Not confirmed: both are ~920-1040ms with debugger, and ~11ms without Debugger.
- DeanH
- Enthusiast

- Posts: 280
- Joined: Wed May 07, 2008 4:57 am
- Location: Adelaide, South Australia
- Contact:
Re: Debug mode execution speed can vary on source whitespace.
[Win 10, PB 6.10. CPU Intel 7700k @ 4.3Ghz, 8 cores]
Not confirmed.
Debugger on
Run from the IDE, not compiled
Every run returned a slightly different average time.
Line 41.
x64 ASM compiler, 350-410ms.
x64 C compiler, 450-530ms.
I changed the gap to line 141.
x64 ASM compiler, 350-414ms.
x64 C compiler, 450-529ms.
I found similar times with the x86 compilers and also with PB 6.01.
Without the debugger, approx 7ms every time. Same if compiled.
I did not see a significant difference in speed even with a large number of blank lines.
I did see a slight difference between using the ASM and C compilers with the debugger on.
Not confirmed.
Debugger on
Run from the IDE, not compiled
Every run returned a slightly different average time.
Line 41.
x64 ASM compiler, 350-410ms.
x64 C compiler, 450-530ms.
I changed the gap to line 141.
x64 ASM compiler, 350-414ms.
x64 C compiler, 450-529ms.
I found similar times with the x86 compilers and also with PB 6.01.
Without the debugger, approx 7ms every time. Same if compiled.
I did not see a significant difference in speed even with a large number of blank lines.
I did see a slight difference between using the ASM and C compilers with the debugger on.
Re: Debug mode execution speed can vary on source whitespace.
Thanks for testing - It may be that it's system architecture dependant; I'm on an AMD laptop here (5800h, Windows 11 x64)
I appreciate that it doesn't matter a great deal, as it only appears to influence the debugging side of things, but it could be beneficial to understand what’s causing it.
I appreciate that it doesn't matter a great deal, as it only appears to influence the debugging side of things, but it could be beneficial to understand what’s causing it.
Re: Debug mode execution speed can vary on source whitespace.
I am actually wondering how and why white spaces should affect the execution speed at all. I could imagine that compiling takes nuances longer, but then? Isn't the out the same, regardless of how many white spaces I put there?
Good morning, that's a nice tnetennba!
PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
Re: Debug mode execution speed can vary on source whitespace.
This code just straight up crashes on my AMD Ryzen 5950x 
Re: Debug mode execution speed can vary on source whitespace.
You can compare the asm output with --commented to see if there is anything odds. Could be an extra debuggercheck()
Re: Debug mode execution speed can vary on source whitespace.
Did this for both asm and c output and didn't see any additional calls.Fred wrote: Sun Apr 14, 2024 3:48 pm You can compare the asm output with --commented to see if there is anything odds. Could be an extra debuggercheck()
I think the issue may be linked to the debugger interaction with the IDE, as I re-ran the executables generated during the above process (which used the console debugger instead), and they both performed identically.
Re: Debug mode execution speed can vary on source whitespace.
Runs fine on my AMD Ryzen 5900x
C compiler 8.98ms
ASM compiler 13.10ms
whitespace made no difference
- It was too lonely at the top.
System : PB 6.21(x64) and Win 11 Pro (x64)
Hardware: AMD Ryzen 9 5900X w/64 gigs Ram, AMD RX 6950 XT Graphics w/16gigs Mem
System : PB 6.21(x64) and Win 11 Pro (x64)
Hardware: AMD Ryzen 9 5900X w/64 gigs Ram, AMD RX 6950 XT Graphics w/16gigs Mem
- Mindphazer
- Enthusiast

- Posts: 487
- Joined: Mon Sep 10, 2012 10:41 am
- Location: Savoie
Re: Debug mode execution speed can vary on source whitespace.
On my Mac (M1), with the blank lines and without the blank lines, there is no difference :
- 51 ms debugger on
- 8 ms debugger off.
But on my Windows 10 at work, there is a difference of 12-15 ms with and without lines (debugger on)
Not difference with debugger off
- 51 ms debugger on
- 8 ms debugger off.
But on my Windows 10 at work, there is a difference of 12-15 ms with and without lines (debugger on)
Not difference with debugger off
MacBook Pro 16" M4 Pro - 24 Gb - MacOS 15.4.1 - Iphone 15 Pro Max - iPad at home
...and unfortunately... Windows at work...
...and unfortunately... Windows at work...


