Automatically replace divisions with multiplications (in some cases)

For everything that's not in any way related to PureBasic. General chat etc...
User avatar
jacdelad
Addict
Addict
Posts: 2029
Joined: Wed Feb 03, 2021 12:46 pm
Location: Riesa

Automatically replace divisions with multiplications (in some cases)

Post by jacdelad »

Hi,
since multiplications should be done much faster than divisions, is it possible to let the compiler change the divisions into multiplications in cases where the divisor is a fixed value (and not zero)? Would there be any problems with that?
I found a lot of divisions in my Gerber module and want to change them into multiplications. Thinking of that, an autooptimization by the compiler seems a nice gimmick.
Good morning, that's a nice tnetennba!

PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
User avatar
NicTheQuick
Addict
Addict
Posts: 1527
Joined: Sun Jun 22, 2003 7:43 pm
Location: Germany, Saarbrücken
Contact:

Re: Automatically replace divisions with multiplications (in some cases)

Post by NicTheQuick »

Usually the optimization process of the C compiler does that kind of stuff. It should do things like "x / 2" -> "x >> 1" if it is an integer. For floats I am not that sure. That has to be checked.
The english grammar is freeware, you can use it freely - But it's not Open Source, i.e. you can not change it or publish it in altered way.
User avatar
skywalk
Addict
Addict
Posts: 4241
Joined: Wed Dec 23, 2009 10:14 pm
Location: Boston, MA

Re: Automatically replace divisions with multiplications (in some cases)

Post by skywalk »

In the meantime, you can scan your code for simple edits and create constants holding known inverses.
The nice thing about standards is there are so many to choose from. ~ Andrew Tanenbaum
User avatar
jacdelad
Addict
Addict
Posts: 2029
Joined: Wed Feb 03, 2021 12:46 pm
Location: Riesa

Re: Automatically replace divisions with multiplications (in some cases)

Post by jacdelad »

@Nic: That's the fastest way for the simplest case.
@skywalk: Yes, that's what I'm doing right now. But the code often looks better with the division, when a division is performed (I mean readability within formulas).

I still prefer the ASM compiler, I should maybe switch to C.
Good morning, that's a nice tnetennba!

PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
User avatar
HeX0R
Addict
Addict
Posts: 1217
Joined: Mon Sep 20, 2004 7:12 am
Location: Hell

Re: Automatically replace divisions with multiplications (in some cases)

Post by HeX0R »

I'm wondering about all those threads talking about speed optimization, is that really worth the hassle against code readability?
To be true, I'm coding in PB since more then 20 years, but the only speed problem I ever had was while handling huge growing strings.
Isn't the time gone, where we all had slow CPUs and had to fight about any ms?
Sure, when you are working on an AAA game, I agree, speed matters, but where are all those AAA games created with PB?
User avatar
jacdelad
Addict
Addict
Posts: 2029
Joined: Wed Feb 03, 2021 12:46 pm
Location: Riesa

Re: Automatically replace divisions with multiplications (in some cases)

Post by jacdelad »

My Gerber module performs up to 1.2 million divisions when drawing a Gerber object. Yes, more speed would be nice.
Good morning, that's a nice tnetennba!

PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
Olli
Addict
Addict
Posts: 1256
Joined: Wed May 27, 2020 12:26 pm

Re: Automatically replace divisions with multiplications (in some cases)

Post by Olli »

jacdelad wrote: Fri Oct 06, 2023 2:52 pm @Nic: That's the fastest way for the simplest case.
@skywalk: Yes, that's what I'm doing right now. But the code often looks better with the division, when a division is performed (I mean readability within formulas).

I still prefer the ASM compiler, I should maybe switch to C.
Did you test C with optimization switched on ?
User avatar
jacdelad
Addict
Addict
Posts: 2029
Joined: Wed Feb 03, 2021 12:46 pm
Location: Riesa

Re: Automatically replace divisions with multiplications (in some cases)

Post by jacdelad »

I didn't test C at all, at least not for my Gerber module. It's not done and I wanted to make it work 100% in ASM first.

My question was more about if there's any reason not to do it (by hand) and if the performance gain would be great in ASM too.
Good morning, that's a nice tnetennba!

PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
User avatar
yuki
Enthusiast
Enthusiast
Posts: 101
Joined: Sat Mar 31, 2018 9:09 pm

Re: Automatically replace divisions with multiplications (in some cases)

Post by yuki »

jacdelad wrote: Sat Oct 07, 2023 9:43 am My question was more about if there's any reason not to do it (by hand) and if the performance gain would be great in ASM too.
If it's floating-point division, most compilers will not optimise division into a multiply-by-reciprocal unless an exact reciprocal exists (e.g., the divisor is a power of two), because doing so is to sacrifice correctness for speed. Usually, this is restricted to optimiser flags (-funsafe-math-optimizations) that you should only activate with great caution. It's most often best to manually perform the faster-but-less-accurate math on a case-by-case basis where it's really needed (so you avoid the optimiser breaking unrelated operations).

If it's integer division (like in this recent thread), doing the multiply-by-reciprocal trick can be more harmful than beneficial. The int->float conversion and subsequent floating-point multiply hampers possible optimisations, so you end up much better off writing straightforward divisions, at least if using the C optimiser. Even more so if you're doing int->float->int round-tripping.

Without using Int(...) to truncate the resulting float, rounding will take place and render results different from normal integer division, which may or may not be tolerable. But by doing this you lose the performance edge of this trick.

I wrote a bit about using bitwise tricks to accurately speedup integer division in the other thread, but it's probably better to clarify and extend on performance and tradeoffs depending on use case, accuracy needs, and compiler backend.

Code: Select all

Assume v.q loops over range 1 ... 1 000 000 000, and we wish to sum the result of
dividing each v step by int 12345.

══════════ Float math: accuracy trade-offs ══════════════════════════════════════
Each approach is inaccurate with respect to normal integer division, but in the
case where we sum into an integer type result, we beat performance of the naïve
integer division at least.
═════════════════════════════════════════════════════════════════════════════════
| Approach                                  | ASM    | C baseline | C optimised |
|-------------------------------------------|--------|------------|-------------|
| r.d + v / 12345.0                         | 5525ms |     2482ms |       941ms | <-- Summing into float.
| r.d + v * #RECIPROCAL_12345               | 5507ms |     2485ms |       623ms |
| r.q + (tmp.q := v / 12345.0)              | 1337ms |     1804ms |      1137ms | <-- Summing into int.
| r.q + (tmp.q := v * #RECIPROCAL_12345)    | 1100ms |     1698ms |      1053ms |


══════════ Float math: accurate ═════════════════════════════════════════════════
By applying Int(...) we achieve an accurate result, although the performance-edge
over straightforward integer division vanishes.
═════════════════════════════════════════════════════════════════════════════════
| Approach                                  | ASM    | C baseline | C optimised |
|-------------------------------------------|--------|------------|-------------|
| r.q + Int(v / 12345.0)                    | 1790ms |     1598ms |      1218ms |
| r.q + Int(v * #RECIPROCAL_12345)          | 1626ms |     1495ms |      1189ms |


══════════ Integer math: tricks and help from the optimiser ═════════════════════
By using clever bit-shifting we can cut normal integer division time in half, all
while remaining accurate. 
Notice how the C optimiser results in identical timing across straightforward and
clever bit-shifting approaches. That's because the optimiser will generate a very
similar means of quickly dividing where possible.
We can go farther with SIMD instruction sets like SSE4, AVX2, etc., and these can
greatly accelerate things, but because they aren't supported by all CPUs (and may
require unsafe optimisation levels) they are not automatically leveraged when the
PB C-backend's optimisation setting is enabled.
═════════════════════════════════════════════════════════════════════════════════
| Approach                                  | ASM    | C baseline | C optimised |
|-------------------------------------------|--------|------------|-------------|
| r.q + v / 12345                           | 1453ms |      700ms |       415ms |
| r.q + (v + (v * -1444876401) >> 32) >> 13 |  731ms |      652ms |       415ms |
| AVX2 i32 × 8                              |  111ms |        N/A |       134ms | <-- ASM = by hand, C = auto
                                                                                      we can nearly half this
                                                                                      by hyper-optimising for
                                                                                      an extremely niche case.


══════════ What if the divisor is unknown? ══════════════════════════════════════
In case the divisor isn't just a fixed 12345 but instead some unknown value, then
it can be faster to use the float reciprocal tricks, if repeatedly "dividing".

If you have tons of repeated division by a set value determined at runtime (while
being totally unknowable at compile-time), and it's *really* a hot path you need
to optimise, then libdivide or similar tricks will come out farthest ahead.

Cases like this exist, but are very rare, and maintainability is the price to be
paid for squeezing more performance.
(Ryzen 5800X, Windows 11 x64, PB 6.03b10 x64)

Main takeaway being that the C optimiser is clever enough to turn a straightforward division by constant into optimal bitwise trickery. By using tricks with floats, we can beat performance of unoptimised builds, but these become harmful and fall way behind when optimisation is on the table.

If you need the ultimate level of performance, the tricks required to beat an optimising C compiler will often require clever handwritten SIMD code (which is a whole different bag of trickery and footguns).

Like @Hex0R says as well, using trickery hurts maintainability as well, which is definitely an important consideration.
User avatar
jacdelad
Addict
Addict
Posts: 2029
Joined: Wed Feb 03, 2021 12:46 pm
Location: Riesa

Re: Automatically replace divisions with multiplications (in some cases)

Post by jacdelad »

Yuki, who the hell are you? Your answers in so many threads recently are helpful to many of us. Though you didn't post much until recently and I thought you were a newbie. Are you one of the coders who just pull out the correct and understandable answers out of their sleeves? :mrgreen:
Anyway, I much appreciate your answer. In my case, sacrificing a bit accuracy shouldn't matter at 6 or more digits behind the comma. I'll check my code for divisions and will replace them where it makes sense.
Good morning, that's a nice tnetennba!

PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
User avatar
yuki
Enthusiast
Enthusiast
Posts: 101
Joined: Sat Mar 31, 2018 9:09 pm

Re: Automatically replace divisions with multiplications (in some cases)

Post by yuki »

jacdelad wrote: Sat Oct 07, 2023 5:06 pm Yuki, who the hell are you? Your answers in so many threads recently are helpful to many of us. Though you didn't post much until recently and I thought you were a newbie. Are you one of the coders who just pull out the correct and understandable answers out of their sleeves? :mrgreen:
Anyway, I much appreciate your answer. In my case, sacrificing a bit accuracy shouldn't matter at 6 or more digits behind the comma. I'll check my code for divisions and will replace them where it makes sense.
Ahahah, they're on to me :shock: !

But really, pretty long-time lurker and PB user (~14yrs). After extracting so much joy and knowledge from the forums over the years, I figure it's time for me to give some back :mrgreen: (though, vacation ending soon so might be back to lurker for a bit!)

And yeah, in your case it sounds like the slight accuracy tradeoff would be well worth the performance gained!
User avatar
idle
Always Here
Always Here
Posts: 6023
Joined: Fri Sep 21, 2007 5:52 am
Location: New Zealand

Re: Automatically replace divisions with multiplications (in some cases)

Post by idle »

yuki wrote: Sat Oct 07, 2023 5:27 pm
jacdelad wrote: Sat Oct 07, 2023 5:06 pm Yuki, who the hell are you? Your answers in so many threads recently are helpful to many of us. Though you didn't post much until recently and I thought you were a newbie. Are you one of the coders who just pull out the correct and understandable answers out of their sleeves? :mrgreen:
Anyway, I much appreciate your answer. In my case, sacrificing a bit accuracy shouldn't matter at 6 or more digits behind the comma. I'll check my code for divisions and will replace them where it makes sense.
Ahahah, they're on to me :shock: !

But really, pretty long-time lurker and PB user (~14yrs). After extracting so much joy and knowledge from the forums over the years, I figure it's time for me to give some back :mrgreen: (though, vacation ending soon so might be back to lurker for a bit!)

And yeah, in your case it sounds like the slight accuracy tradeoff would be well worth the performance gained!
Stay on holiday, we like your insights. :mrgreen:
Post Reply