CBM floating point data type

nippur72 · Post by **nippur72** » Wed Jun 18, 2008 12:56 pm

a question for the math wizards (Mike are you reading?)...

What's the best way to encode a number in the CBM floating point format ? I need to do that in my cross-compiler tool, so I need a sort of algorythm to convert a C++ double or float into the 5 bytes of the CBM format.

After investigation I've found that VIC-20 encodes numbers as powers of two. The first byte is the exponent, then a bit sign follows, and then mantissa, for a total of 5 bytes. The formula should be something like this (but I don't swear on it):

number = (-1^S) * 2^(E-127) + (1 + M / (2^33))

So where to start for a conversion routine?

BTW, the $DDD7: Print (FAC) as ASCII string. routine listed in the thread ROM calls and other tricks prints without sign.

Mike · Post by **Mike** » Thu Jun 19, 2008 12:54 am

There is:

$D7B5: Convert string starting at (INDEX1) of length (A) to FLPT value in (FAC).

with:

INDEX1: $0022-$0023, First utility pointer,
A: Accumulator,
FAC: $0061-$0066, Floating-point Accumulator (FAC).

Greetings,

Michael

BTW: (-1^S) will always give -1 as result. The power operation binds more tightly than sign inversion. You surely meant (-1)^S

.

nippur72 · Post by **nippur72** » Thu Jun 19, 2008 1:08 am

Mike wrote:$D7B5: Convert string starting at (INDEX1) of length (A) to FLPT value in (FAC).

yep, but I need something outside the VIC-20: I would add it to my cross compiler tool (C++ written) as I'm extending it to support floating point operations.

BTW: is it more convenient to encode starting from a string (e.g. processing each digit with adds and *10 muls) or from C++ IEEE float format? Thank you for any help, I would like to translate my 3d function plot type-in into assembler.

Mike wrote: BTW: (-1^S) will always give -1 as result. The power operation binds more tightly than sign inversion. You surely meant (-1)^S .

Mike · Post by **Mike** » Thu Jun 19, 2008 6:33 am

Code: Select all

/*
 *  IEEE double precision to CBM 5 byte float conversion
 *
 *  written 2008-06-19 by Michael Kircher
 */

#include <stdio.h>
#include <math.h>

int main()
{
  double number=3.141592653589793;
  double mantissa;
  int exponent;

  unsigned long cbm_mantissa;
  unsigned char cbm_exponent;

  mantissa=frexp(number,&exponent);

  cbm_mantissa=(unsigned long)(4294967296.0*fabs(mantissa)) & 0x7FFFFFFF + 2147483648*(mantissa<0);
  cbm_exponent=128+exponent;

  printf("IEEE double precision value = %.15g\n",number);
  printf("CBM 5-byte format: $%.2X $%.2X $%.2X $%.2X $%.2X\n",
                          cbm_exponent,
         (unsigned char) (cbm_mantissa >> 24),
         (unsigned char) (cbm_mantissa >> 16),
         (unsigned char) (cbm_mantissa >> 8),
         (unsigned char) (cbm_mantissa));

  return(0);
}

nippur72 · Post by **nippur72** » Thu Jun 19, 2008 7:39 am

WOW

Code: Select all

/*
 *  written 2008-06-19 by Michael Kircher
 */

double WOW

and a big thank you! Now I can add support to floating point to my compiler tool! Can't wait to test it!

Mike · Post by **Mike** » Fri Jun 20, 2008 2:27 am

I should add, that the number 0 needs extra treatment. It is all bytes zero in CBM float. Furthermore you should check the resulting exponent range for overflow.

Infinities, subnormals, and NaNs cannot be represented as CBM floats.

Michael

nippur72 · Post by **nippur72** » Fri Jun 20, 2008 3:51 am

Mike wrote:I should add, that the number 0 needs extra treatment.

I was about to tell you of that

I'm also getting strange problems with the kernal routines, I've to learn how to use them properly. I'll report more later.

nippur72 · Post by **nippur72** » Fri Jun 20, 2008 4:13 am

Here is a small program taken from VIC20 user guide that calculates the first digits of PI. I do the same calculation first in basic and then in machine language using the floating point kernel routines and thanks to the support of the CBM float data type.

Results of 2500 iterations:

BASIC: 00:00:31 seconds
ML: 00:00:17 seconds

Here is the code. Notice the "float" keyword among all other language extensions.

Code: Select all

  processor 6502
  
  include "floatlib.lm"
  include "vic20.lm"
  include "macros.lm"

  org $1001    
  basic start compact
     10 TI$="000000"
     20 PI=0
     30 FORC=1TO10001STEP4
     35 PI=PI+4/C-4/(C+2)      
     50 NEXT
     55 PRINT PI
     60 PRINT TI$
     70 TI$="000000"
     80 SYS {main}
     90 PRINT:PRINT TI$	       
  basic end
  
C  float 0.0    ; variables
PI float 0.0    ;

ZERO  float 0.0       ; float costants
ONE   float 1.0       ;
TWO   float 2.0       ;
FOUR  float 4.0       ;
MAX   float 10001.0   ;
  
main:
    mov fac, ZERO   ; PI = 0
	 mov PI, fac     ;
	 	     
	 mov fac, ONE    ; C = 1
	 mov C, fac      ; 
               
forstart:
    mov afac, FOUR     ; tempf1 = 4/C
	 ldx #$00		 	  ;	        
    div afac, C        ;      
	 mov tempf1, fac    ;
    
	 mov fac, C         ; tempf2 = C+2
    add fac, TWO       ;
    mov tempf2, fac    ;              
	   
    mov afac, FOUR     ;
	 ldx #$ff           ;
    div afac, tempf2   ; fac = -4/(C+2) 
	              				                      
    add fac, tempf1    ; fac = fac + tempf1 + PI             		 		                     
    add fac, PI        ;      
	 mov PI, fac        ;		 
 		 
	 mov fac, C
	 fcmp MAX
	 
	 if a<>#00 then 	 
	    add fac, FOUR
	    mov C, fac		 		 	
	    jmp forstart
	 end if   
	 	 
	 mov fac, PI
	 print fac
	 
	 rts

Mike · Post by **Mike** » Fri Jun 27, 2008 4:36 am

nippur72 wrote:Results of 2500 iterations:

BASIC: 00:00:31 seconds
ML: 00:00:17 seconds

This is about the best you could expect. You've effectively eliminated the number conversions, variable lookup, and building the evaluation tree 'on-the-fly' - leaving the floating point division as dominating factor. And of course the Gregory-Leibniz formula calculating pi=4*arctan(1) as series expansion isn't the most efficient way to do this.

With the function plot example you posted in another thread, you've got SQR(), and SIN() inside, so the gain from 'compiling' it might even be less. Did you try it?

The built-in SQR() can be replaced by Heron's algorithm for a 4 times speed gain.

Michael

nippur72 · Post by **nippur72** » Fri Jun 27, 2008 7:36 am

Mike wrote:With the function plot example you posted in another thread, you've got SQR(), and SIN() inside, so the gain from 'compiling' it might even be less. Did you try it?

I tried to "compile" it, but it became too complex so I had to give up. It was an hell of "store fac here", "take afac there" and so on. I was very disappointed because I believed that with my macro language tool the conversion would have been a trivial thing, while in reality it revealed a very complex task.

Perhaps I now need an "expression compiler"...

P.S. yes the sum is really slow converging, after 2500 iterations only few digits are exact. And there's a typo in the reference guide, a "+" in place of an "-" (at least in the electronic version).

Mike · Post by **Mike** » Mon Jun 30, 2008 1:41 am

nippur72 wrote:Perhaps I now need an "expression compiler"...

You could wrap the relevant functions, so that they don't operate on FAC, and AFAC, but instead on the top(most two) element(s) of a self-built number stack. And then use Reverse Polish Notation, as in FORTH.

Michael

nippur72 · Post by **nippur72** » Mon Jun 30, 2008 3:49 am

Mike wrote:You could wrap the relevant functions, so that they don't operate on FAC, and AFAC, but instead on the top(most two) element(s) of a self-built number stack.

that should be quite easy, considering that most (all?) of the functions take only one parameter and produce only one numeric result.

The real trouble is making the expression parser: I wrote three or four in the past (in different languages), but never managed to make them perfect (except the one that was made using Lex/Yacc).

Mike · Post by **Mike** » Mon Jun 30, 2008 4:24 am

nippur72 wrote:that should be quite easy, considering that most (all?) of the functions take only one parameter and produce only one numeric result.

Addition, subtraction, multiplication, and division can very well considered as functions taking two parameters.

The real trouble is making the expression parser: I wrote three or four in the past (in different languages), but never managed to make them perfect (except the one that was made using Lex/Yacc).

I know. But if you use RPN (instead of infix) the parser reduces to "get the next token", and "decide what to do with it". The expression tree really is built by the programmer. Of course one must be used to RPN ...

For example, line 35 in your program could be expressed thus:

Code: Select all

PI 4 C / + 4 C 2 + / -

leading to:

Code: Select all

PI      [PI]
4       [PI] [4]
C       [PI] [4] [C]
/       [PI] [4/C]
+       [PI+4/C]
4       [PI+4/C] [4]
C       [PI+4/C] [4] [C]
2       [PI+4/C] [4] [C] [2]
+       [PI+4/C] [4] [C+2]
/       [PI+4/C] [4/(C+2)]
-       [PI+4/C-4/(C+2)]

now assign top of stack to PI

Michael

nippur72 · Post by **nippur72** » Fri Jul 10, 2015 6:30 am

Resurrecting this old post, for an updated version of Mike's routine, translated to TypeScript/JavaScript:

Code: Select all

/*
 *  IEEE double precision to CBM 5 byte float conversion
 *
 *  written 2008-06-19 by Michael Kircher
 */

function CBMFloat(S: string): string {
   var num: double = Number(S);
   var mantissa: double;
   var exponent: int;

   var cbm_mantissa: ulong;
   var cbm_exponent: uchar;

   [mantissa, exponent] = frexp(num);

   cbm_mantissa = (4294967296.0 * fabs(mantissa)) & 0x7FFFFFFF + 2147483648 * ((mantissa < 0) ? 1 : 0);
   cbm_exponent = 128 + exponent;

   if (num == 0.0) {
      cbm_exponent = 0;
      cbm_mantissa = 0;
   }

   var v = [
      hex(cbm_exponent),
      hex(cbm_mantissa >> 24),
      hex(cbm_mantissa >> 16),
      hex(cbm_mantissa >> 8),
      hex(cbm_mantissa)
   ];

   return v.join(", ");
}

function frexp(value: double): [double, double] {
   var ret = { mantissa: 0, exponent: 0 };

   if (value == 0.0 || value == -0.0) {
      return [ret.mantissa, ret.exponent];
   }

   if (isNaN(value)) {
      ret.mantissa = Number.NaN;
      ret.exponent = -1;
      return [ret.mantissa, ret.exponent];
   }

   ret.mantissa = value;
   ret.exponent = 0;
   var sign = 1;

   if (ret.mantissa < 0.0) {
      sign=-1;
      ret.mantissa = -(ret.mantissa);
   }
   while (ret.mantissa < 0.5) {
      ret.mantissa *= 2.0;
      ret.exponent -= 1;
   }
   while (ret.mantissa >= 1.0) {
      ret.mantissa *= 0.5;
      ret.exponent++;
   }
   ret.mantissa *= sign;
   return [ret.mantissa, ret.exponent];
}

function hex(n: number): string
{
   n = n & 0xFF;
   var s = "0" + n.toString(16);
   s = s.substr(s.length - 2);
   s = s.toUpperCase();
   return "$" + s;      
}

wimoos · Post by **wimoos** » Wed Jul 15, 2015 4:44 am

BTW, the $DDD7: Print (FAC) as ASCII string. routine listed in the thread ROM calls and other tricks prints without sign.

You have to LDY #$01 before calling $DDD7, or, alternatively, you JSR $DDDD before calling $CB1E.

Denial

CBM floating point data type

CBM floating point data type

Re: CBM floating point data type

Re: CBM floating point data type