[PB 6.21] Wrong precision on the integer part of floats
Posted: Mon Jul 07, 2025 12:47 am
by ColeopterusMaximus
I do understand that the fractional part of floats can be iffy, but the integer part failing like this looks way too iffy to me.
Code:
Define measure.f = 25.0
Define value.q = 60
Define result.f = 73713600.0
result + (measure * value)
Debug(result)
; Prints 73715104.0 (incorrect)
Code:
Define measureb.f = 25.0
Define valueb.q = 60
Define resultb.q = 73713600
resultb + (measureb * valueb)
Debug(resultb)
; Prints 73715100 (correct)
Tested on Windows and the behaviour is the same; no way this can be right.
Can somebody else please corroborate this?
Re: [PB 6.21] Wrong precision on the integer part of floats
Posted: Mon Jul 07, 2025 7:37 am
by STARGÅTE
No Bug.
Floats have only 24 significant bits, which means they can store every integer exactly only up to 2^24 = 16777216 (about 16.7 million).
Your integer needs 27 bits, so the lowest binary digits are lost during the calculation and the number (even the integer part) becomes inaccurate.
You can use Double instead; doubles have 53 significant bits.
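For illustration, here is the snippet from the first post with `.d` swapped in for `.f`. It is only a minimal sketch, assuming nothing else in the program relies on these variables being floats. The sum 73715100 fits comfortably in a double's significand, so it should come out exact.

```purebasic
; Same calculation as in the first post, but with Double (.d) instead of Float (.f)
Define measure.d = 25.0
Define value.q = 60
Define result.d = 73713600.0

result + (measure * value)   ; shorthand for result = result + (measure * value)

Debug StrD(result, 1)        ; expected: 73715100.0
```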
Edit: An example showing the point at which the inaccuracy starts:
Code:
Define Float.f
Define Integer.i
Debug 1<<24
Debug "----"
For Integer = 16777210 To 16777230
Float = Integer
Debug Str(Integer) + " vs. " + StrF(Float)
Next
Output:
16777216
----
16777210 vs. 16777210
16777211 vs. 16777211
16777212 vs. 16777212
16777213 vs. 16777213
16777214 vs. 16777214
16777215 vs. 16777215
16777216 vs. 16777216
16777217 vs. 16777216
16777218 vs. 16777218
16777219 vs. 16777220
16777220 vs. 16777220
16777221 vs. 16777220
16777222 vs. 16777222
16777223 vs. 16777224
16777224 vs. 16777224
16777225 vs. 16777224
16777226 vs. 16777226
16777227 vs. 16777228
16777228 vs. 16777228
16777229 vs. 16777228
16777230 vs. 16777230
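The same effect explains the 73715104 from the first post. Between 2^26 and 2^27 a float has 24 bits to cover a 27-bit value, so neighbouring floats are 8 apart. Below is a minimal sketch reusing the loop above, with the range moved to sit around the expected sum 73715100; the variable names are only illustrative. With the usual round-to-nearest conversion, each value should land on 73715096 or 73715104.

```purebasic
; Round-trip integers near the expected sum through a Float
Define f.f
Define n.i

Debug 1<<26                  ; floats above this value are spaced 8 apart
Debug "----"
For n = 73715096 To 73715104
  f = n                      ; the integer is rounded to the nearest representable float
  Debug Str(n) + " vs. " + StrF(f, 0)
Next
```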
Re: [PB 6.21] Wrong precision on the integer part of floats
Posted: Mon Jul 07, 2025 10:19 am
by ColeopterusMaximus
I can't thank you enough for explaining this. I wasn't aware of it at all; fortunately, I noticed before putting any code into production!
I'm so stupid, it is in the manual:
Code:
Float | .f | 4 bytes | unlimited (see below)
Double | .d | 8 bytes | unlimited (see below)
I had assumed that floats in 64-bit builds used 8 bytes, like quads.
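A quick way to double-check the sizes without the manual is SizeOf(). This is a minimal sketch, assuming the built-in type names Float, Double, Quad and Integer are accepted by SizeOf() as the manual describes. Only Integer changes between 32-bit and 64-bit builds.

```purebasic
; Print the storage size of the numeric types discussed in this thread
Debug "Float:   " + Str(SizeOf(Float))    ; 4 bytes on every platform
Debug "Double:  " + Str(SizeOf(Double))   ; 8 bytes on every platform
Debug "Quad:    " + Str(SizeOf(Quad))     ; 8 bytes on every platform
Debug "Integer: " + Str(SizeOf(Integer))  ; 4 bytes on x86, 8 bytes on x64
```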