Adding 0.1 + 0.2 in CBM float

nippur72 · Post by **nippur72** » Thu Aug 23, 2018 9:01 am

When talking about standard IEEE floating point, usually 0.1 + 0.2 not giving 0.3 is taken as an example of the inability to represent certain numbers in base 2.

But I tried that on a VIC-20 and to my surprise it gives the correct result: 0.3.

Now I am pretty sure that the BASIC interpreter uses a base 2 floating point math: it's the notorious CBM float format which has 7 bytes when performing the calculations and 5 bytes when the number is stored in memory.

So what is the reason for the correct result? Does anybody know?

wimoos · Post by **wimoos** » Fri Aug 24, 2018 2:39 am

Hello nippur72,

Have you checked the workings of the routine at $DC1B, where rounding is applied to FAC#1? This routine is called during calculations such as additions. Zero-page location $70 holds the rounding information.

Regards,

Wim.

Mike · Post by **Mike** » Fri Aug 24, 2018 4:17 am

nippur72 wrote:So what is the reason of the correct result?

Pure luck.

Several things involved here:

- the conversion ASCII -> float (hopefully) results in a float representation (i.e. machine number), that is the nearest to the real number,
- the addition of the two machine numbers (hopefully) results in a number that matches the conversion result of the sum of the original real summands,
- the conversion float -> ASCII of the calculated sum (hopefully) matches the ASCII representation of the intended result.

Each of these steps can go wrong!
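The three steps can be watched separately on a modern machine. The following is only a sketch using Python's IEEE doubles (53-bit mantissa), not CBM float, so the lucky and unlucky cases fall on different inputs than on the VIC-20:

```python
from decimal import Decimal

# Step 1: ASCII -> float. Decimal(x) reveals the exact value the double stores.
a, b, c = 0.1, 0.2, 0.3
print(Decimal(a))  # slightly above the real number 0.1
print(Decimal(c))  # slightly below the real number 0.3

# Step 2: adding the two machine numbers rounds once more.
s = a + b
print(s == c)  # False: the sum is a different machine number than 0.3

# Step 3: float -> ASCII. Fewer printed digits can hide the difference.
print(repr(s))            # 0.30000000000000004
print(format(s, ".15g"))  # 0.3
```

With 15 significant digits the output conversion rounds the discrepancy away, which is the same effect as in step three above.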

As a counter example to yours, check out this:

Even though PRINT 0.6+0.1 seems to print the correct result 0.7, what has been computed internally is different from 0.7 (as machine number) as is evident from the Boolean expression of the second PRINT statement. The "0.7" of 0.6+0.1 actually was rounded before output, so you don't see a difference there in the ASCII output.

Finally the third PRINT shows the difference between the two machine numbers that represent the sum of 0.6 and 0.1, and 0.7, respectively. It's less than one half of the last digit that would have been printed, and that is the reason both numbers have the same ASCII output.
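Both observations can be reproduced with a toy model. This is a sketch under assumptions: it rounds every value to the nearest number with a 32-bit mantissa using exact fractions, and it does not imitate the actual ROM routines (which convert digit by digit and carry an extra rounding byte). The helper names `round_to_bits`, `fl` and `add` are made up for this illustration.

```python
from fractions import Fraction


def round_to_bits(x: Fraction, bits: int = 32) -> Fraction:
    """Round x > 0 to the nearest value with a `bits`-bit mantissa."""
    e = 0
    # Scale x so that 2**(bits-1) <= x * 2**e < 2**bits.
    while x * Fraction(2) ** e < 2 ** (bits - 1):
        e += 1
    while x * Fraction(2) ** e >= 2 ** bits:
        e -= 1
    # round() on a Fraction rounds to the nearest integer.
    return Fraction(round(x * Fraction(2) ** e)) / Fraction(2) ** e


def fl(text: str) -> Fraction:
    """ASCII -> machine number (assumed: correctly rounded)."""
    return round_to_bits(Fraction(text))


def add(a: Fraction, b: Fraction) -> Fraction:
    """Exact sum of two machine numbers, rounded back to 32 bits."""
    return round_to_bits(a + b)


print(add(fl("0.1"), fl("0.2")) == fl("0.3"))  # True: the lucky case

s = add(fl("0.6"), fl("0.1"))
print(s == fl("0.7"))         # False: the unlucky case
print(float(s - fl("0.7")))   # 2**-32, about 2.33E-10
```

In this model the sum 0.6+0.1 ends up exactly one unit in the last mantissa bit above the machine number for 0.7.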

orion70 · Post by **orion70** » Fri Aug 24, 2018 5:46 am

Wow! First post in (almost) three years! Welcome back to the friendly community, Nippur. Still fiddling with the VIC-20? Current project(s)?

Back OT, I didn't know possible errors were so macroscopic. Is there any example of commercial or amateur programs with bugs involving or deriving from this issue? Is this error only observed in CBM machines?

Mike · Post by **Mike** » Wed Aug 29, 2018 3:14 pm

orion70 wrote:I didn't know possible errors were so macroscopic.

Well, "macroscopic". You see a lot of - actually meaningless(!) - digits, where you'd normally expect to see just a single digit, 0.

But that's only for display. The error itself really is microscopic, in the order of less than a billionth of the affected value.

orion70 wrote:Is there any example of commercial or amateur programs with bugs involving or deriving from this issue?

Some (possibly badly written) algorithms tend to blow up those errors so they become visible. Notable reasons include the so-called accumulation of rounding errors and loss of significance by catastrophic cancellation.
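Both effects are easy to provoke. A small sketch in Python (IEEE doubles, so the magnitudes differ from CBM float, but the mechanisms are the same):

```python
import math

# Accumulation: every += rounds, and the small errors add up.
total = 0.0
for _ in range(10):
    total += 0.1
print(total == 1.0)                   # False
print(math.fsum([0.1] * 10) == 1.0)   # True: fsum keeps track of the lost bits

# Cancellation: subtracting nearly equal numbers exposes the rounding
# error that was committed when 1.0 + x was formed.
x = 1e-15
d = (1.0 + x) - 1.0
print(d)                # close to x, but only roughly
print(abs(d - x) / x)   # relative error of several percent
```

The absolute error in the second part is tiny, but relative to the small result it is huge. That is what makes cancellation "catastrophic".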

A good deal of programs just do arithmetics and are unlikely to be affected by those issues.

If you "do" more involved mathematical problems, then you are more likely to run into the issues of floating point. The whole field of Numerical analysis centers around how to do calculations right and in a robust way in the presence of rounding errors of a floating point system.

orion70 wrote:Is this error only observed in CBM machines?

No. This kind of error is found on all computers that employ floating point for calculations. Even the use of BCD mode doesn't prevent this (there, fractions like 1/3 still can't be represented as an exact machine number).
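Python's `decimal` module can stand in for a base-10 float here. A sketch, with the precision of nine digits chosen arbitrarily for the example:

```python
from decimal import Decimal, getcontext

getcontext().prec = 9  # assumption: nine significant decimal digits

third = Decimal(1) / Decimal(3)
print(third)           # 0.333333333
print(third * 3)       # 0.999999999
print(third * 3 == 1)  # False: 1/3 is not a base-10 machine number

# Decimal fractions, on the other hand, are exact in base 10.
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
```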

Stormcrow · Post by **Stormcrow** » Thu Aug 30, 2018 11:52 am

Mike wrote:A good deal of programs just do arithmetics and are unlikely to be affected by those issues.

Basically, unless you are writing a decimal calculator, you're simply going to want to use the imperfect calculation to perform some purpose other than displaying the sum on the screen in numerical format. 2.32830644E-10 is so close to 0 that any normal application that yields it is just going to round it to zero anyway when making use of it.

What I find especially amusing about this example is that on the Vic, the Commutative Property of Addition does not hold. (0.6 + 0.1) − 0.7 ≠ (0.1 + 0.6) − 0.7

orion70 · Post by **orion70** » Thu Aug 30, 2018 2:05 pm

Thanks for replies and explanation guys. I will sleep well tonight, no games are gonna crash for a floating point math error

nippur72 · Post by **nippur72** » Fri Aug 31, 2018 1:47 am

It still amazes me that float is the number type around which BASIC is built. I think that has its roots in FORTRAN? I guess in the '70s people expected to use BASIC as a sort of calculator rather than a videogame platform.

And btw, it all started with a meme I created some time ago:

Mike · Post by **Mike** » Fri Aug 31, 2018 10:41 am

nippur72 wrote:It still amazes me that float is the number type around which BASIC is built. I think that has its roots in FORTRAN?

Given IEEE double would only be capable of representing 15 significant decimal digits [1] (it falls short by just 1 single extra bit in the mantissa to allow for 16 sig. digits), I'd be inclined to blame that result on the float->ASCII conversion of "node". No point in trying to print out any more digits than 15 - that result with those extra non-0 digits at the end (or ..9999x) could well be expected!

Anyhow, all those numbers involved (0.1, 0.2, ... 0.9 with the exception of 0.5) cannot be represented exactly in a base-2 float format, so the machine numbers being added are already approximate to begin with - even if they're the nearest representation of the real number. Whether the sum of these two numbers - again! - corresponds to the number that would be the conversion result of the real sum to float just depends on luck. If you're unlucky, both (machine number) summands have been rounded down during ASCII->float, have been possibly nearly just within -1/2 ULP below their real counterpart, and their rounded sum just results in a machine number with more than -1 ULP difference to the intended result. The errors have accumulated.

OTOH, numbers without decimals usually can be exactly represented as float numbers! There is no rounding error at all involved, when numbers like 1, 5, -751, 72564, ... are added, subtracted and multiplied in your preferred floating point system, as long as those numbers and the result are smaller than the mantissa size allows. With CBM float, this means 9 digits.
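The same rule can be checked on any float format. A sketch with Python's IEEE doubles, whose mantissa has 53 bits. For CBM float the corresponding limit is 2^32 = 4294967296, which is why every 9-digit integer fits:

```python
LIMIT = 2 ** 53  # mantissa size of an IEEE double

# Integer arithmetic below the limit is exact.
print(751.0 * 72564.0 == 751 * 72564)          # True
print(float(LIMIT - 1) + 1.0 == float(LIMIT))  # True

# Above the limit, not every integer is a machine number any more.
print(float(LIMIT) + 1.0 == float(LIMIT))      # True: the +1 is rounded away

# With a 32-bit mantissa, all integers up to nine digits are exact.
print(999_999_999 < 2 ** 32)                   # True
```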

Also, floating point division is supposed to return an exact integer[2] when the denominator is integer and the numerator is an exact multiple of it. That means integer division (result: Q) and modulo (result: R) *work* when done like this: Q=INT(N/D):R=N-Q*D ... what does *not* work is this version - X=N/D:Q=INT(X):F=X-Q:R=F*D - there, rounding errors easily make R non-integer because of cancellation errors.

Try it out, with N=355 and D=113 ..., R=16 with the first (and correct) method, and R=16.0000001 with the second (and wrong) method.
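For those without a VIC-20 at hand, here are the two methods as a Python sketch. It runs on IEEE doubles, so the stray digits of the second method show up much further to the right than in CBM BASIC. The function names are invented for the example.

```python
import math


def divmod_robust(n: float, d: float):
    """Q=INT(N/D):R=N-Q*D, the remainder is computed from exact integers."""
    q = math.floor(n / d)
    r = n - q * d
    return q, r


def divmod_fragile(n: float, d: float):
    """X=N/D:Q=INT(X):F=X-Q:R=F*D, the rounding error of X is scaled up by D."""
    x = n / d
    q = math.floor(x)
    f = x - q
    r = f * d
    return q, r


print(divmod_robust(355.0, 113.0))   # (3, 16.0)
print(divmod_fragile(355.0, 113.0))  # quotient 3, remainder only approximately 16
```

The first version only ever subtracts and multiplies whole numbers after the INT, so no rounding error can reach R.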

Stormcrow wrote:What I find especially amusing about this example is that on the Vic, the Commutative Property of Addition does not hold.

There is a reason for this: you're actually adding two numbers of different precision, and in that case, the commutative property of addition simply cannot hold. The left operand comes from memory, and the right operand has just been 'freshly' converted from ASCII to float in the FAC, thus is slightly more precise (the FAC features an extra rounding byte). Multiplication is similarly affected.

However, if you force both operands into memory and then add or multiply them, the commutative property holds:

...
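The effect of mixing precisions can be imitated with a toy model in Python. This is not the listing referred to above and it does not reproduce the ROM arithmetic. It merely assumes that the left operand has been rounded to a 32-bit mantissa (as if stored in memory) while the right operand still carries 8 extra bits (the rounding byte), and that the sum is rounded to 32 bits. All function names are made up.

```python
from fractions import Fraction


def round_to_bits(x: Fraction, bits: int) -> Fraction:
    """Round x > 0 to the nearest value with a `bits`-bit mantissa."""
    e = 0
    while x * Fraction(2) ** e < 2 ** (bits - 1):
        e += 1
    while x * Fraction(2) ** e >= 2 ** bits:
        e -= 1
    return Fraction(round(x * Fraction(2) ** e)) / Fraction(2) ** e


def add_mixed(left: str, right: str) -> Fraction:
    """Left operand at 32 bits, right operand at 40 bits (assumed model)."""
    stored = round_to_bits(Fraction(left), 32)
    fresh = round_to_bits(Fraction(right), 40)
    return round_to_bits(stored + fresh, 32)


def add_stored(left: str, right: str) -> Fraction:
    """Both operands rounded to 32 bits first, as if read from variables."""
    a = round_to_bits(Fraction(left), 32)
    b = round_to_bits(Fraction(right), 32)
    return round_to_bits(a + b, 32)


print(add_mixed("0.6", "0.1") == add_mixed("0.1", "0.6"))    # False
print(add_stored("0.6", "0.1") == add_stored("0.1", "0.6"))  # True
```

In this model the two mixed-precision sums differ by one unit in the last mantissa bit, while the sums of equally rounded operands agree.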

Finally, setting aside those 'nifty' results with visibly wrong digits at their very end that result from accumulated rounding errors (as seen in PRINT 7^2, where the result 49.0000001 simply follows from the implementation of the exponentiation operator), there are indeed real defects in the BASIC float routines that deserve a little more attention. More details can be found in the two threads "Fun with CBM arithmetics" and "Fun with CBM arithmetics, Part II", together with a solution to the issue presented there.

Greetings,

Michael

[1] See my second post in this thread.

[2] i.e. numbers without fractional digits. I don't refer to the 16-bit integer variable type here.

nippur72 · Post by **nippur72** » Fri Aug 31, 2018 1:47 pm

We should drop base 2 altogether and embrace base 10. It's not as crazy as it seems. It was theorized by Douglas Crockford, who proposed that we should adopt one numeric type that solves all (most) problems. He named it DEC64.

- it's base 10, so numbers for "human" usage are always represented exactly
- there is not much difference between integers and floats, and conversion from/to is easy
- fast to compute in x86 assembly, if widely adopted could be implemented in hardware

Mike · Post by **Mike** » Fri Aug 31, 2018 2:38 pm

I did a quick scan of that web page.

Sorry, but DEC64 easily looks like your computer science math professor's favorite self-invented float format.

It focusses on the ideal case, where numbers with a rather small number of decimal digits are encountered. Either integer numbers, or a proper decimal fraction. Unfortunately, if one needs to solve non-trivial mathematical problems, that number of digits usually doesn't remain bounded. A number like 1/3 still couldn't be represented exactly in DEC64. So when Mr. Crockford writes:

Normalization is not required, and is usually not desired.

... it is nonetheless required as soon as one encounters numbers with a non-terminating decimal fraction (periodic or non-periodic doesn't matter), because the "coefficient" then has to be used in full to retain the maximum precision! And with this you mostly end up with the standard mantissa/exponent notation. Back to step one.

...

Float arithmetic in base-10 generally has the peculiar feature, that numbers near an exponent change "see" a rather harsh change of resolution by the factor of 10. With 3 significant digits for example, the machine representable numbers near 10 list thus: 9.97, 9.98, 9.99, 10, 10.1, 10.2, 10.3 - you see the sudden difference of resolution to the left and right of 10? Quite some algorithms (minimum search, Newton-Raphson, numerical differentiation) can easily hiccup on this.

This resolution change is mostly mitigated in base-2 float arithmetic. Only float systems based on logarithmic representation (which are however fairly obscure) fare better in that regard.
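The jump in resolution can be measured directly. A sketch in Python (3.9 or later for `math.nextafter`), using the `decimal` module with three significant digits as a stand-in for a base-10 float and IEEE doubles for base 2:

```python
import math
from decimal import Context, Decimal

# Base 10, three significant digits: neighbours of 10 are 9.99 and 10.1.
ctx = Context(prec=3)
ten = Decimal(10)
gap_below_10 = ten - ctx.next_minus(ten)
gap_above_10 = ctx.next_plus(ten) - ten
print(gap_below_10, gap_above_10)  # 0.01 0.1

# Base 2: at a power of two the spacing merely doubles.
gap_below_8 = 8.0 - math.nextafter(8.0, 0.0)
gap_above_8 = math.nextafter(8.0, math.inf) - 8.0
print(gap_above_8 / gap_below_8)   # 2.0
```

A factor of 10 between neighbouring gaps in base 10 versus a factor of 2 in base 2.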

IMO, use of base-10 float is mostly motivated by the "ease" of conversion between ASCII and (base-10) float. Otherwise, there's no other aspect that would make base-10 superior, and the resolution issue I outlined above rather speaks against base-10.

nippur72 · Post by **nippur72** » Fri Aug 31, 2018 3:54 pm

You must consider the context in which DEC64 was designed. Crockford is a JavaScript developer and his target was high-level languages, making them as simple as possible. One numeric type for everything.

Instead today we have floats (double, float) and the "decimal" types when you have to deal with money or other amounts that are made for 10-counting-humans.

Having two different number types is a big burden for the junior developer. Most of the time he will just use normal floats until he gets caught by the problem. He is simply not aware of it. It happened to me many years ago when I wrote a business app and the "total" in the invoice sheet was off by 0.01! Nights spent on debugging. My hack was to pepper the code with rounding in all the places! It didn't work. Then I discovered the decimal type, quickly adopted it, and my app was working again.

Some time later I developed a scripting language for business apps which is still in use today, and you know what? My only numeric type is decimal. Never regretted it since!

BUT the resolution around "10" that you mentioned is a good point versus DEC64. Never heard it discussed before.

Mike · Post by **Mike** » Fri Aug 31, 2018 4:06 pm

nippur72 wrote:It happened to me many years ago when I wrote a business app and the "total" in the invoice sheet was off by 0.01! Nights spent on debugging. My hack was to pepper the code with rounding in all the places! It didn't work. Then I discovered the decimal type, quickly adopted it, and my app was working again.

That sheds a bit of light on your background.

Of course, financial and actuarial maths have the strict requirement, that base/format conversions shall not lead to additional rounding errors.

I suppose though, whatever could be achieved with a dedicated base-10 float type in that context, could also be done with scaled integer arithmetics, taking cents as unit. Just put a decimal point between "tens" and "hundreds" of your output, and there you go.
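A minimal sketch of that idea in Python. The amounts and the helper `format_cents` are invented for the example:

```python
def format_cents(cents: int) -> str:
    """Insert the decimal point between the "hundreds" and "tens" of cents."""
    sign = "-" if cents < 0 else ""
    units, frac = divmod(abs(cents), 100)
    return f"{sign}{units}.{frac:02d}"


# All amounts are held as whole cents, so sums are exact integers.
line_items = [10, 20, 1999]  # 0.10, 0.20 and 19.99
total = sum(line_items)

print(total)                # 2029
print(format_cents(total))  # 20.29
print(format_cents(-5))     # -0.05
```

No rounding happens anywhere. The decimal point exists only in the output routine.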

nippur72 wrote:BUT the resolution around "10" that you mentioned is a good point versus DEC64. Never heard it discussed before.

That applies to base-10 floats in general though, and wouldn't be a specific issue of DEC64.

Mike · Post by **Mike** » Sat Sep 01, 2018 1:32 pm

orion70 wrote:Thanks for replies and explanation guys. I will sleep well tonight, no games are gonna crash for a floating point math error

A good occasion to point out, that the float routines of CBM BASIC are better than their reputation suggests:

This small program prints the factorial of 69.

Hint: try calculating this with the 'standard' method of multiplying: 69! = 1 * 2 * 3 * ... * 68 * 69 and watch an ?OVERFLOW error ending the calculation.
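The program itself is not reproduced in this text. One way to get past the overflow, and not necessarily the method used in the original listing, is to keep a decimal exponent on the side and renormalise after every multiplication. A Python sketch, with the CBM range limit taken as roughly 1.7E+38 (an assumption stated in the code):

```python
import math

CBM_MAX = 1.7e38  # assumption: approximate largest CBM float


def factorial_scaled(n: int):
    """Return (mantissa, exponent) with n! ~ mantissa * 10**exponent."""
    mantissa, exponent = 1.0, 0
    for k in range(2, n + 1):
        mantissa *= k
        # The running value never comes anywhere near the overflow limit.
        assert mantissa < CBM_MAX
        while mantissa >= 10.0:
            mantissa /= 10.0
            exponent += 1
    return mantissa, exponent


m, e = factorial_scaled(69)
print(f"{m:.8f}E+{e}")

# The naive product leaves the CBM range long before reaching 69.
print(math.factorial(69) > CBM_MAX)  # True
```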

For comparison: Google's answer

Edit: moved wimoos' topic deviation regarding the bug-fixed BASIC ROM into the appropriate thread: (link)
