Automatically replace divisions with multiplications (in some cases)
Automatically replace divisions with multiplications (in some cases)
Hi,
since multiplications should be done much faster than divisions, is it possible to let the compiler change the divisions into multiplications in cases where the divisor is a fixed value (and not zero)? Would there be any problems with that?
I found a lot of divisions in my Gerber module and want to change them into multiplications. Thinking of that, an autooptimization by the compiler seems a nice gimmick.
since multiplications should be done much faster than divisions, is it possible to let the compiler change the divisions into multiplications in cases where the divisor is a fixed value (and not zero)? Would there be any problems with that?
I found a lot of divisions in my Gerber module and want to change them into multiplications. Thinking of that, an autooptimization by the compiler seems a nice gimmick.
Good morning, that's a nice tnetennba!
PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
- NicTheQuick
- Addict

- Posts: 1527
- Joined: Sun Jun 22, 2003 7:43 pm
- Location: Germany, Saarbrücken
- Contact:
Re: Automatically replace divisions with multiplications (in some cases)
Usually the optimization process of the C compiler does that kind of stuff. It should do things like "x / 2" -> "x >> 1" if it is an integer. For floats I am not that sure. That has to be checked.
The english grammar is freeware, you can use it freely - But it's not Open Source, i.e. you can not change it or publish it in altered way.
Re: Automatically replace divisions with multiplications (in some cases)
In the meantime, you can scan your code for simple edits and create constants holding known inverses.
The nice thing about standards is there are so many to choose from. ~ Andrew Tanenbaum
Re: Automatically replace divisions with multiplications (in some cases)
@Nic: That's the fastest way for the simplest case.
@skywalk: Yes, that's what I'm doing right now. But the code often looks better with the division, when a division is performed (I mean readability within formulas).
I still prefer the ASM compiler, I should maybe switch to C.
@skywalk: Yes, that's what I'm doing right now. But the code often looks better with the division, when a division is performed (I mean readability within formulas).
I still prefer the ASM compiler, I should maybe switch to C.
Good morning, that's a nice tnetennba!
PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
Re: Automatically replace divisions with multiplications (in some cases)
I'm wondering about all those threads talking about speed optimization, is that really worth the hassle against code readability?
To be true, I'm coding in PB since more then 20 years, but the only speed problem I ever had was while handling huge growing strings.
Isn't the time gone, where we all had slow CPUs and had to fight about any ms?
Sure, when you are working on an AAA game, I agree, speed matters, but where are all those AAA games created with PB?
To be true, I'm coding in PB since more then 20 years, but the only speed problem I ever had was while handling huge growing strings.
Isn't the time gone, where we all had slow CPUs and had to fight about any ms?
Sure, when you are working on an AAA game, I agree, speed matters, but where are all those AAA games created with PB?
{Home}.:|:.{Dialog Design0R}.:|:.{Codes}.:|:.{History Viewer Online}.:|:.{Send a Beer}
Re: Automatically replace divisions with multiplications (in some cases)
My Gerber module performs up to 1.2 million divisions when drawing a Gerber object. Yes, more speed would be nice.
Good morning, that's a nice tnetennba!
PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
Re: Automatically replace divisions with multiplications (in some cases)
Did you test C with optimization switched on ?jacdelad wrote: Fri Oct 06, 2023 2:52 pm @Nic: That's the fastest way for the simplest case.
@skywalk: Yes, that's what I'm doing right now. But the code often looks better with the division, when a division is performed (I mean readability within formulas).
I still prefer the ASM compiler, I should maybe switch to C.
Re: Automatically replace divisions with multiplications (in some cases)
I didn't test C at all, at least not for my Gerber module. It's not done and I wanted to make it work 100% in ASM first.
My question was more about if there's any reason not to do it (by hand) and if the performance gain would be great in ASM too.
My question was more about if there's any reason not to do it (by hand) and if the performance gain would be great in ASM too.
Good morning, that's a nice tnetennba!
PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
Re: Automatically replace divisions with multiplications (in some cases)
If it's floating-point division, most compilers will not optimise division into a multiply-by-reciprocal unless an exact reciprocal exists (e.g., the divisor is a power of two), because doing so is to sacrifice correctness for speed. Usually, this is restricted to optimiser flags (-funsafe-math-optimizations) that you should only activate with great caution. It's most often best to manually perform the faster-but-less-accurate math on a case-by-case basis where it's really needed (so you avoid the optimiser breaking unrelated operations).jacdelad wrote: Sat Oct 07, 2023 9:43 am My question was more about if there's any reason not to do it (by hand) and if the performance gain would be great in ASM too.
If it's integer division (like in this recent thread), doing the multiply-by-reciprocal trick can be more harmful than beneficial. The int->float conversion and subsequent floating-point multiply hampers possible optimisations, so you end up much better off writing straightforward divisions, at least if using the C optimiser. Even more so if you're doing int->float->int round-tripping.
Without using Int(...) to truncate the resulting float, rounding will take place and render results different from normal integer division, which may or may not be tolerable. But by doing this you lose the performance edge of this trick.
I wrote a bit about using bitwise tricks to accurately speedup integer division in the other thread, but it's probably better to clarify and extend on performance and tradeoffs depending on use case, accuracy needs, and compiler backend.
Code: Select all
Assume v.q loops over range 1 ... 1 000 000 000, and we wish to sum the result of
dividing each v step by int 12345.
══════════ Float math: accuracy trade-offs ══════════════════════════════════════
Each approach is inaccurate with respect to normal integer division, but in the
case where we sum into an integer type result, we beat performance of the naïve
integer division at least.
═════════════════════════════════════════════════════════════════════════════════
| Approach | ASM | C baseline | C optimised |
|-------------------------------------------|--------|------------|-------------|
| r.d + v / 12345.0 | 5525ms | 2482ms | 941ms | <-- Summing into float.
| r.d + v * #RECIPROCAL_12345 | 5507ms | 2485ms | 623ms |
| r.q + (tmp.q := v / 12345.0) | 1337ms | 1804ms | 1137ms | <-- Summing into int.
| r.q + (tmp.q := v * #RECIPROCAL_12345) | 1100ms | 1698ms | 1053ms |
══════════ Float math: accurate ═════════════════════════════════════════════════
By applying Int(...) we achieve an accurate result, although the performance-edge
over straightforward integer division vanishes.
═════════════════════════════════════════════════════════════════════════════════
| Approach | ASM | C baseline | C optimised |
|-------------------------------------------|--------|------------|-------------|
| r.q + Int(v / 12345.0) | 1790ms | 1598ms | 1218ms |
| r.q + Int(v * #RECIPROCAL_12345) | 1626ms | 1495ms | 1189ms |
══════════ Integer math: tricks and help from the optimiser ═════════════════════
By using clever bit-shifting we can cut normal integer division time in half, all
while remaining accurate.
Notice how the C optimiser results in identical timing across straightforward and
clever bit-shifting approaches. That's because the optimiser will generate a very
similar means of quickly dividing where possible.
We can go farther with SIMD instruction sets like SSE4, AVX2, etc., and these can
greatly accelerate things, but because they aren't supported by all CPUs (and may
require unsafe optimisation levels) they are not automatically leveraged when the
PB C-backend's optimisation setting is enabled.
═════════════════════════════════════════════════════════════════════════════════
| Approach | ASM | C baseline | C optimised |
|-------------------------------------------|--------|------------|-------------|
| r.q + v / 12345 | 1453ms | 700ms | 415ms |
| r.q + (v + (v * -1444876401) >> 32) >> 13 | 731ms | 652ms | 415ms |
| AVX2 i32 × 8 | 111ms | N/A | 134ms | <-- ASM = by hand, C = auto
we can nearly half this
by hyper-optimising for
an extremely niche case.
══════════ What if the divisor is unknown? ══════════════════════════════════════
In case the divisor isn't just a fixed 12345 but instead some unknown value, then
it can be faster to use the float reciprocal tricks, if repeatedly "dividing".
If you have tons of repeated division by a set value determined at runtime (while
being totally unknowable at compile-time), and it's *really* a hot path you need
to optimise, then libdivide or similar tricks will come out farthest ahead.
Cases like this exist, but are very rare, and maintainability is the price to be
paid for squeezing more performance.
Main takeaway being that the C optimiser is clever enough to turn a straightforward division by constant into optimal bitwise trickery. By using tricks with floats, we can beat performance of unoptimised builds, but these become harmful and fall way behind when optimisation is on the table.
If you need the ultimate level of performance, the tricks required to beat an optimising C compiler will often require clever handwritten SIMD code (which is a whole different bag of trickery and footguns).
Like @Hex0R says as well, using trickery hurts maintainability as well, which is definitely an important consideration.
Re: Automatically replace divisions with multiplications (in some cases)
Yuki, who the hell are you? Your answers in so many threads recently are helpful to many of us. Though you didn't post much until recently and I thought you were a newbie. Are you one of the coders who just pull out the correct and understandable answers out of their sleeves?
Anyway, I much appreciate your answer. In my case, sacrificing a bit accuracy shouldn't matter at 6 or more digits behind the comma. I'll check my code for divisions and will replace them where it makes sense.
Anyway, I much appreciate your answer. In my case, sacrificing a bit accuracy shouldn't matter at 6 or more digits behind the comma. I'll check my code for divisions and will replace them where it makes sense.
Good morning, that's a nice tnetennba!
PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
PureBasic 6.21/Windows 11 x64/Ryzen 7900X/32GB RAM/3TB SSD
Synology DS1821+/DX517, 130.9TB+50.8TB+2TB SSD
Re: Automatically replace divisions with multiplications (in some cases)
Ahahah, they're on to mejacdelad wrote: Sat Oct 07, 2023 5:06 pm Yuki, who the hell are you? Your answers in so many threads recently are helpful to many of us. Though you didn't post much until recently and I thought you were a newbie. Are you one of the coders who just pull out the correct and understandable answers out of their sleeves?![]()
Anyway, I much appreciate your answer. In my case, sacrificing a bit accuracy shouldn't matter at 6 or more digits behind the comma. I'll check my code for divisions and will replace them where it makes sense.
But really, pretty long-time lurker and PB user (~14yrs). After extracting so much joy and knowledge from the forums over the years, I figure it's time for me to give some back
And yeah, in your case it sounds like the slight accuracy tradeoff would be well worth the performance gained!
Re: Automatically replace divisions with multiplications (in some cases)
Stay on holiday, we like your insights.yuki wrote: Sat Oct 07, 2023 5:27 pmAhahah, they're on to mejacdelad wrote: Sat Oct 07, 2023 5:06 pm Yuki, who the hell are you? Your answers in so many threads recently are helpful to many of us. Though you didn't post much until recently and I thought you were a newbie. Are you one of the coders who just pull out the correct and understandable answers out of their sleeves?![]()
Anyway, I much appreciate your answer. In my case, sacrificing a bit accuracy shouldn't matter at 6 or more digits behind the comma. I'll check my code for divisions and will replace them where it makes sense.!
But really, pretty long-time lurker and PB user (~14yrs). After extracting so much joy and knowledge from the forums over the years, I figure it's time for me to give some back(though, vacation ending soon so might be back to lurker for a bit!)
And yeah, in your case it sounds like the slight accuracy tradeoff would be well worth the performance gained!

