Shifting bits in an arbitrary number of bytes
Re: Shifting bits in an arbitrary number of bytes
I found it: I had to put dec ecx immediately before the conditional jump, I assume because the jump tests the zero flag and the instructions in between were changing it. Encryption is already 80% faster with just these simple changes. I don't know whether I need to work on more optimisations, since RSA is being displaced by ECC these days.
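For anyone hitting the same problem, here is a minimal sketch of the flag dependency. The procedure name SumDown and the labels are made up for illustration, and it assumes the x86/x64 ASM backend, with the result left in eax the same way the other procedures in this thread do it.

```purebasic
; Hypothetical example, not taken from the original code: adds n + (n-1) + ... + 1.
Procedure SumDown(n.l)
  !xor eax, eax
  !mov ecx, [p.v_n]
  !test ecx, ecx
  !jz sd_exit
  !sd_loop:
  !add eax, ecx
  ; DEC has to be the last flag-changing instruction before JNZ.
  ; If the ADD came after the DEC, JNZ would test the flags left by ADD instead.
  !dec ecx
  !jnz sd_loop
  !sd_exit:
  ProcedureReturn
EndProcedure

Debug SumDown(4) ; 4 + 3 + 2 + 1
```

The same applies to any instruction that writes the flags (add, sub, shl, shr, and, ...), so the counter update is safest placed directly in front of the jump.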
Re: Shifting bits in an arbitrary number of bytes
A faster implementation of the two procedures ...
Code:
Procedure ShiftLeftBuffer(*data, length.l)
!xor eax, eax
!mov ecx, [p.v_length]
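; SUB instead of DEC: DEC leaves the carry flag untouched, so JC would never fire for length = 0
!sub ecx, 1
!jc slb_exit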
CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
!mov rdx, [p.p_data]
!slb_loop:
!mov ah, [rdx + rcx]
!shl eax, 1
!mov [rdx + rcx], ah
CompilerElse
!mov edx, [p.p_data]
!slb_loop:
!mov ah, [edx + ecx]
!shl eax, 1
!mov [edx + ecx], ah
CompilerEndIf
!shr eax, 9
!sub ecx, 1
!jnc slb_loop
!slb_exit:
!shr eax, 7
ProcedureReturn
EndProcedure
Procedure ShiftRightBuffer(*data, length.l)
!xor eax, eax
!mov ecx, [p.v_length]
!and ecx, ecx
!jz srb_exit
CompilerIf #PB_Compiler_Processor = #PB_Processor_x64
!mov rdx, [p.p_data]
!srb_loop:
!mov ah, [rdx]
!shr eax, 1
!mov [rdx], ah
!inc rdx
CompilerElse
!mov edx, [p.p_data]
!srb_loop:
!mov ah, [edx]
!shr eax, 1
!mov [edx], ah
!inc edx
CompilerEndIf
!shl eax, 9
!dec ecx
!jnz srb_loop
!srb_exit:
!shr eax, 16
!and eax, 1
ProcedureReturn
EndProcedure
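To check the asm routines, or to have something that also runs where x86 inline asm is not available, a plain PureBasic reference can help. The following is only a sketch: the names ShiftLeftBufferRef and ShiftRightBufferRef are made up, and it assumes what the asm appears to do, namely that byte 0 is the most significant byte and that the return value is the bit shifted out of the buffer.

```purebasic
; Hypothetical reference versions, not part of the original post.
; Assumption: big-endian buffer, byte 0 is the most significant byte.
Procedure.i ShiftLeftBufferRef(*data, length.i)
  Protected carry.i = 0, value.i, i.i
  ; Walk from the least significant byte (last) to the most significant (first).
  For i = length - 1 To 0 Step -1
    value = (PeekA(*data + i) << 1) | carry
    PokeA(*data + i, value & $FF)
    carry = (value >> 8) & 1
  Next
  ProcedureReturn carry ; bit shifted out of byte 0
EndProcedure

Procedure.i ShiftRightBufferRef(*data, length.i)
  Protected carry.i = 0, value.i, i.i
  ; Walk from the most significant byte (first) to the least significant (last).
  For i = 0 To length - 1
    value = PeekA(*data + i)
    PokeA(*data + i, (value >> 1) | (carry << 7))
    carry = value & 1
  Next
  ProcedureReturn carry ; bit shifted out of the last byte
EndProcedure

Define *buf = AllocateMemory(2)
PokeA(*buf, $81) : PokeA(*buf + 1, $80)
Debug ShiftLeftBufferRef(*buf, 2)                    ; $8180 << 1 = $10300, so the carry is 1
Debug Hex(PeekA(*buf)) + " " + Hex(PeekA(*buf + 1))  ; bytes are now $03 $00
FreeMemory(*buf)
```

Running both versions over the same random buffers and comparing the bytes and return values is a quick way to catch flag mistakes in the asm.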
Windows (x64)
Raspberry Pi OS (Arm64)
Re: Shifting bits in an arbitrary number of bytes
Thanks. I have been doing some testing and noticed it no longer helps with my large number module, so I will have to optimise some more. It looks like I'll have to delve into assembly again, because all the time is spent on this:
Code:
While i
i=i-1
If j
j=j-1
sum = PeekA(*h1\num+i) + PeekA(*h2\num+j) + carry
Else
sum = PeekA(*h1\num+i) + carry
EndIf
If sum > $FF : carry = 1 : Else : carry = 0 : EndIf
PokeA(*h1\num+i, sum)
Wend
Re: Shifting bits in an arbitrary number of bytes
coco2 wrote: Thanks I have been doing some testing and noticed it doesn't help with my large number module any more, so I will have to optimise some more. I've noticed that all the time is spent on this:
You could already speed this up a lot without using asm.
I wanted to try something with your huge.pbi code to improve the speed, but your code depends on cc2debug.pbi and it looks like you didn't post that code.
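As one possible illustration of a non-asm speedup (a sketch only, not benchmarked and not tested against huge.pbi): PeekA and PokeA are function calls per byte, so walking the buffers with typed pointers and deriving the carry with a shift instead of an If may already help. The procedure name AddBuffers and its parameters are made up; it assumes big-endian byte buffers with lenA >= lenB, like the loop quoted above.

```purebasic
; Hypothetical helper: adds the number at *b (lenB bytes) into *a (lenA bytes).
; Assumptions: big-endian buffers, lenA >= lenB. Returns the final carry.
Procedure.i AddBuffers(*a, lenA.i, *b, lenB.i)
  Protected *pa.Ascii = *a + lenA ; built-in Ascii structure gives unsigned byte access via \a
  Protected *pb.Ascii = *b + lenB
  Protected sum.i, carry.i = 0
  While lenA
    ; Nothing left to add and nothing to propagate: the rest of *a is unchanged.
    If lenB = 0 And carry = 0 : Break : EndIf
    lenA = lenA - 1
    *pa = *pa - 1
    If lenB
      lenB = lenB - 1
      *pb = *pb - 1
      sum = *pa\a + *pb\a + carry
    Else
      sum = *pa\a + carry
    EndIf
    *pa\a = sum & $FF
    carry = sum >> 8 ; 0 or 1, because sum is at most $1FF
  Wend
  ProcedureReturn carry
EndProcedure

Define *x = AllocateMemory(2), *y = AllocateMemory(1)
PokeA(*x, $01) : PokeA(*x + 1, $FF)
PokeA(*y, $01)
Debug AddBuffers(*x, 2, *y, 1)                   ; final carry
Debug Hex(PeekA(*x)) + " " + Hex(PeekA(*x + 1))  ; $01FF + $01 = $0200
FreeMemory(*x) : FreeMemory(*y)
```

A further step in the same direction would be to add several bytes at a time, but that depends on how the number is stored, so it is left out here.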
Re: Shifting bits in an arbitrary number of bytes
This one?
Code:
Macro DebugOut(DebugText)
Debug #PB_Compiler_Module+"::"+#PB_Compiler_Procedure+": "+DebugText
EndMacro
Macro DebugPrintN (DebugText)
PrintN(#PB_Compiler_Module+"::"+#PB_Compiler_Procedure+": "+DebugText)
EndMacro