@PureBasicTeam! Why that crazy and not needed code overhead when adding 2 Floats in a Procedure

Posted: Fri Aug 02, 2024 10:42 am
by SMaag
I just tested how PB adds 2 floats at the assembler level. I wrote a simple procedure that adds 2 floats.
I was surprised by the resulting assembler code: there is an overhead of suspicious XMM register backups.

To see what PB does with integer adds, I tried the same thing with adding 2 ints.

Same phenomenon: it backs up 2 registers, rcx and rdx, which are not used!

The backup of the XMM registers in the float version especially does not make sense, because
the floating-point (x87) registers are shared with the MMX registers, not with the XMM registers.

I guess this is a 'bug' in the compiler, not detected in tests because it does not affect what the function returns!

Code:

; Why does PB 6.11 produce such code overhead when adding 2 floats?

Define.f a,b,c, d

a= 1.0
b =1.5

c = a + b

Procedure.f PB_AddF(In1.f, In2.f)
  ProcedureReturn In1 + In2  
EndProcedure

; ---------------------------------------
; ASM output of PB_AddF (PB 6.11 LTS)
; ---------------------------------------
; ; Procedure.f PB_AddF(In1.f, In2.f)
; _Procedure0:

; Why back up the XMM registers?
; Backing up MMX registers I could understand, because they are shared with the floating-point registers
;   MOVSS  dword [rsp+8],xmm0
;   MOVSS  dword [rsp+16],xmm1

;   PS0=48
;   SUB    rsp,40
; ; ProcedureReturn In1 + In2  
;   FLD    dword [rsp+PS0+0]
;   FADD   dword [rsp+PS0+8]
;   FADD   dword [F1]
;   MOVSXD rax,eax
;   JMP   _EndProcedure1
; ; EndProcedure
; _EndProcedureZero1:
;   FLDZ

; _EndProcedure1:
;   ADD    rsp,40
;   FSTP   dword [rsp-8]

;   Why is only xmm0 restored when both xmm0 and xmm1 were backed up?
;   MOVSS  xmm0,[rsp-8]
;   RET

Procedure.i PB_AddI(In1.i, In2.i)
  ProcedureReturn In1 + In2  
EndProcedure

; ---------------------------------------
; ASM output of PB_AddI (PB 6.11 LTS)
; ---------------------------------------

; Procedure.i PB_AddI(In1.i, In2.i)
; _Procedure2:

; Similar to the float version, but here 2 unused registers, rcx and rdx, are backed up
;   MOV    qword [rsp+8],rcx
;   MOV    qword [rsp+16],rdx

;   PUSH   r15
;   PS2=64
;   XOr    rax,rax
;   PUSH   rax
;   SUB    rsp,40
; ; ProcedureReturn In1 + In2  
;   MOV    r15,qword [rsp+PS2+0]
;   ADD    r15,qword [rsp+PS2+8]
;   MOV    rax,r15
;   JMP   _EndProcedure3
; ; EndProcedure
; _EndProcedureZero3:
;   XOr    rax,rax
; _EndProcedure3:
;   ADD    rsp,48
;   POP    r15
;   RET
  
Procedure.f ASM_AddF(In1.f, In2.f)
  !FLD    dword [p.v_In1]
  !FADD   dword [p.v_In2]
  !FADD   dword [F1]
  !MOVSXD rax,eax
  ProcedureReturn  
EndProcedure


c=PB_AddF(a,b)
Debug "Result PB_AddF = " + c

c= ASM_AddF(a,b)
Debug "Result ASM_AddF = " + c

d= PB_AddI(1,5)

Re: @PureBasicTeam! Why that crazy and not needed code overhead when adding 2 Floats in a Procedure

Posted: Fri Aug 02, 2024 10:53 am
by Fred
PB generates generic, 'naive' ASM output, which is designed for very fast compiling while retaining good code speed. You now have the choice of using the C backend for optimized builds, which should generate much faster code.

Re: @PureBasicTeam! Why that crazy and not needed code overhead when adding 2 Floats in a Procedure

Posted: Fri Aug 02, 2024 11:42 am
by SMaag
Fred wrote: Fri Aug 02, 2024 10:53 am
PB generates generic, 'naive' ASM output, which is designed for very fast compiling while retaining good code speed. You now have the choice of using the C backend for optimized builds, which should generate much faster code.

Thanks for answering! Yes, I know I can use the C backend to build faster code!

I just finished a deeper check with IDA. Using the XMM registers is a kind of FastCall, where the parameters are passed in registers XMM0 and XMM1 rather than directly on the stack as in a StandardCall. The same goes for the RCX and RDX registers: a kind of FastCall.

At first it looked like a bug because of the use of XMM registers! My thinking was that if XMM was wrong and should have been MMX, it would cause a problem once the register MOV is really needed.

Normally a FastCall is faster than a StandardCall. So it looks like a speed optimization! Nice!
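
The stack offsets in the listings above fit this reading. As a worked check in C, and assuming this is the Windows x64 convention (which the use of rcx/rdx suggests), the caller reserves home slots just above the return address, so at procedure entry the first two slots are at [rsp+8] and [rsp+16]. The PS0 and PS2 constants are then just that entry offset plus whatever the prologue takes off rsp. The helper name `home_slot_base` is hypothetical and exists only to do the arithmetic.

```c
#include <stddef.h>

/* Hypothetical helper: byte offset from the current rsp back to the first
   home slot ([rsp+8] at procedure entry), after the prologue has lowered
   rsp by pushed_bytes (PUSH instructions) plus sub_bytes (SUB rsp,N). */
size_t home_slot_base(size_t pushed_bytes, size_t sub_bytes)
{
    const size_t entry_offset = 8; /* the return address sits at [rsp] on entry */
    return entry_offset + pushed_bytes + sub_bytes;
}

/* PB_AddF prologue: no PUSH, then SUB rsp,40. Listing says PS0=48. */
size_t ps0(void)
{
    return home_slot_base(0, 40);
}

/* PB_AddI prologue: PUSH r15, PUSH rax, then SUB rsp,40. Listing says PS2=64. */
size_t ps2(void)
{
    return home_slot_base(2 * 8, 40);
}
```

So [rsp+PS0+0] and [rsp+PS0+8] read back exactly the two slots that the MOVSS instructions wrote at entry, and the same holds for rcx and rdx in the integer version.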

Re: @PureBasicTeam! Why that crazy and not needed code overhead when adding 2 Floats in a Procedure

Posted: Fri Aug 02, 2024 3:05 pm
by wilbert
SMaag wrote: Fri Aug 02, 2024 11:42 am
Using the XMM registers is a kind of FastCall, where the parameters are passed in registers XMM0 and XMM1 rather than directly on the stack as in a StandardCall.

It is normal on x64 to pass parameters in registers.
Take a look at the x64 calling conventions:
https://en.wikipedia.org/wiki/X86_calling_conventions
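
For illustration, here is a minimal C sketch of the two procedures from the first post, with comments noting where the two common x64 conventions place the arguments. The names `add_f` and `add_i` are made up for this example, and this is not the C that PureBasic's C backend emits.

```c
#include <stdint.h>

/* Microsoft x64: a -> XMM0, b -> XMM1, result in XMM0.
   System V AMD64 (Linux, macOS): also XMM0 and XMM1, result in XMM0. */
float add_f(float a, float b)
{
    return a + b;
}

/* Microsoft x64: a -> RCX, b -> RDX, result in RAX.
   System V AMD64: a -> RDI, b -> RSI, result in RAX. */
int64_t add_i(int64_t a, int64_t b)
{
    return a + b;
}
```

With optimization enabled, a C compiler will typically keep such arguments in their registers for the whole function. The stores to [rsp+8] and [rsp+16] in the PB listing are the unoptimized form, where each register argument is first written to its stack slot and then read back.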