Division to shift optimization

Got an idea for enhancing PureBasic? New command(s) you'd like to see?
Trond
Always Here
Always Here
Posts: 7446
Joined: Mon Sep 22, 2003 6:45 pm
Location: Norway

Division to shift optimization

Post by Trond »

When the dividend is a multiple of two it's possible to change the division into a shift. This gives a huge speed increase, and since it's already done for modulo it shouldn't be too much work to do it on division as well.

Code: Select all

a = a / 2
remi_meier
Enthusiast
Enthusiast
Posts: 468
Joined: Sat Dec 20, 2003 6:19 pm
Location: Switzerland

Post by remi_meier »

Code: Select all

a = -3
Debug a / 2
Debug a >> 1
Athlon64 3700+, 1024MB Ram, Radeon X1600
thefool
Always Here
Always Here
Posts: 5875
Joined: Sat Aug 30, 2003 5:58 pm
Location: Denmark

Post by thefool »

a = -3
Debug a / 2
Debug a >> 1
When the dividend is a multiple of two
remi_meier
Enthusiast
Enthusiast
Posts: 468
Joined: Sat Dec 20, 2003 6:19 pm
Location: Switzerland

Post by remi_meier »

Yeah, you're right, Trond means the Divisor.

Edit:
And actually, you don't need to be afraid of me changing the meaning of
my post after you posted. I would not do something like that to make you
look wrong.
Last edited by remi_meier on Sat Dec 01, 2007 3:46 pm, edited 1 time in total.
Athlon64 3700+, 1024MB Ram, Radeon X1600
User avatar
netmaestro
PureBasic Bullfrog
PureBasic Bullfrog
Posts: 8451
Joined: Wed Jul 06, 2005 5:42 am
Location: Fort Nelson, BC, Canada

Post by netmaestro »

Ok, is there a significant speed gain given that you have to go an extra step to see if it's a multiple of two and then branch to a shift? A valid comparison needs to use the full process, not just compare the speed of a shift vs. the speed of a division. Did you do some tests?
BERESHEIT
remi_meier
Enthusiast
Enthusiast
Posts: 468
Joined: Sat Dec 20, 2003 6:19 pm
Location: Switzerland

Post by remi_meier »

Actually, PB used to optimize it IIRC. It was removed because of the issue
in the second post.
Unless I'm totally misunderstanding something.

Trond wanted the optimization by the compiler that /2 gets changed to >>1
and 4 gets changed to >>2, etc. Like one does with *2 -> << 1, *4 -> <<2...
And the Modulo thing is actually a bit different. %2 -> & 1, %4 -> &3, ...
Athlon64 3700+, 1024MB Ram, Radeon X1600
milan1612
Addict
Addict
Posts: 894
Joined: Thu Apr 05, 2007 12:15 am
Location: Nuremberg, Germany
Contact:

Post by milan1612 »

Code: Select all

m = ElapsedMilliseconds()
For i = 0 To 100000000
  tmp = Random(500)
  x = tmp / 2
Next
t = ElapsedMilliseconds() - m

m = ElapsedMilliseconds()
For i = 0 To 100000000
  tmp = Random(500)
  If tmp % 2 = 0
    x = tmp >> 1
  Else
    x = tmp / 2
  EndIf
Next
t2 = ElapsedMilliseconds() - m

MessageRequester("Normal Division by 2", Str(t))
MessageRequester("Shift Replacement if modulo 2", Str(t2))
Without debugger the second one is still faster...
Windows 7 & PureBasic 4.4
thefool
Always Here
Always Here
Posts: 5875
Joined: Sat Aug 30, 2003 5:58 pm
Location: Denmark

Post by thefool »

On my computer, the second one is definently not faster. In fact its 40% slower. Also this should be done with an equal set of numbers as there is a posibility you get more divisable numbers on the second round.
And actually, you don't need to be afraid of me changing the meaning of
my post after you posted. I would not do something like that to make you
look wrong.
I was actually just changing the size of the letters, but then decided to add the original post for readability :)
I know it would be stupid to get what i meant wrong but you know..

And ok i admit it, a tiny part of the reason was to protect myself against edited posts :oops:
milan1612
Addict
Addict
Posts: 894
Joined: Thu Apr 05, 2007 12:15 am
Location: Nuremberg, Germany
Contact:

Post by milan1612 »

thefool wrote:On my computer, the second one is definently not faster. In fact its 40% slower
Did you run it without debugger?
I get 3568 normal division, 3172 with shift optimization (average values)
Windows 7 & PureBasic 4.4
thefool
Always Here
Always Here
Posts: 5875
Joined: Sat Aug 30, 2003 5:58 pm
Location: Denmark

Post by thefool »

milan1612 wrote:
thefool wrote:On my computer, the second one is definently not faster. In fact its 40% slower
Did you run it without debugger?
I get 3568 normal division, 3172 with shift optimization (average values)
NO I did not run it with the debugger on :P
This is my results:

---------------------------
Normal Division by 2
---------------------------
1373


---------------------------
Shift Replacement if modulo 2
---------------------------
1997
milan1612
Addict
Addict
Posts: 894
Joined: Thu Apr 05, 2007 12:15 am
Location: Nuremberg, Germany
Contact:

Post by milan1612 »

Must be my machine, damn slow crap :wink:
Any other results?
Windows 7 & PureBasic 4.4
remi_meier
Enthusiast
Enthusiast
Posts: 468
Joined: Sat Dec 20, 2003 6:19 pm
Location: Switzerland

Post by remi_meier »

ou wait, nobody here seems to know, what he's doing.
nothing against you, milan, but your speed-test is the worst I've seen in
a long time. at least 99% of the time in the loops is spent on the Random()
function. Nothing really to say about the >>1 thingy.
And I can swap the result:

Code: Select all

m = ElapsedMilliseconds() 
!align 4
For i = 0 To 100000000 
  tmp = Random(500) 
  x = tmp / 2 
Next 
t = ElapsedMilliseconds() - m 

m = ElapsedMilliseconds() 
!align 4
For i = 0 To 100000000 
  tmp = Random(500) 
  If tmp % 2 
    x = tmp >> 1 
  Else 
    x = tmp / 2 
  EndIf 
Next 
t2 = ElapsedMilliseconds() - m 

MessageRequester("", Str(t)) 
MessageRequester("", Str(t2))
Ok, last post by me in this thread. Everything was said after the first 2 posts.

@thefool: I know. I saw the previous size of the letters and was kinda offended.
But you know, it's a shitty day today anyway.

Don't mean to offend anyone, but today, it was just enough for me.
Athlon64 3700+, 1024MB Ram, Radeon X1600
thefool
Always Here
Always Here
Posts: 5875
Joined: Sat Aug 30, 2003 5:58 pm
Location: Denmark

Post by thefool »

@thefool: I know. I saw the previous size of the letters and was kinda offended.
:)

Well I'll just write a small test which use the same set of numbers and finds an average instead
milan1612
Addict
Addict
Posts: 894
Joined: Thu Apr 05, 2007 12:15 am
Location: Nuremberg, Germany
Contact:

Post by milan1612 »

remi_meier wrote:ou wait, nobody here seems to know, what he's doing.
nothing against you, milan, but your speed-test is the worst I've seen in
a long time. at least 99% of the time in the loops is spent on the Random()
function. Nothing really to say about the >>1 thingy.
Why? In both loops the Random() gets called the same times, so it's irrelevant...

EDIT:
Anyway, here's one without Random...

Code: Select all

#nums = 50000000

Dim numbers.l(#nums)
For i = 0 To #nums
  numbers(i) = Random(5000)
Next


m = ElapsedMilliseconds()
For i = 0 To #nums
  x = numbers(i) / 2
Next
t = ElapsedMilliseconds() - m

m = ElapsedMilliseconds()
For i = 0 To #nums
  If numbers(i) % 2 = 0
    x = numbers(i) >> 1
  Else
    x = numbers(i) / 2
  EndIf
Next
t2 = ElapsedMilliseconds() - m

MessageRequester("Normal Division by 2", Str(t))
MessageRequester("Shift Replacement if modulo 2", Str(t2))
On my machine, the second one is still faster...
Windows 7 & PureBasic 4.4
thefool
Always Here
Always Here
Posts: 5875
Joined: Sat Aug 30, 2003 5:58 pm
Location: Denmark

Post by thefool »

;first generate a set of numbers
Dim seta(30000000)
For i=0 To 30000000
seta(i)=Random(500)
Next i

;allright.
For a=1 To 20
m=ElapsedMilliseconds()
For i=0 To 30000000

;here goes the code
tmp=seta(i)
x=tmp/2

Next i
t=t+(ElapsedMilliseconds()-m)
Next a
t=t/20


For a=1 To 20
m=ElapsedMilliseconds()
For i=0 To 30000000

;here goes the code
tmp = seta(i)
If tmp % 2
x = tmp >> 1
Else
x = tmp / 2
EndIf

Next i
t2=t2+(ElapsedMilliseconds()-m)
Next a
t2=t2/20




MessageRequester("divide with 2",Str(t))
MessageRequester("the other",Str(t2))
warning: takes a bit time to run.
Here i still get a faster average for the first function
(i know i would get the same result if i multiplied the big loop count with 20 but then it would need a larger array)


HOWEVER I am not sure if this is how trond wants to implement this.
Post Reply