Assembly BLAS routines for C64 / VIC-20

Basic and Machine Language

Moderator: Moderators

Post Reply
User avatar
MrSterlingBS
Vic 20 Enthusiast
Posts: 174
Joined: Tue Jan 31, 2023 2:56 am
Location: Germany,Braunschweig

Assembly BLAS routines for C64 / VIC-20

Post by MrSterlingBS »

Hello,

i like to use the BLAS routines in my finite elements method.
The listings can be found here.
https://eprints.maths.manchester.ac.uk/2029/
Of course the code is for the C64 Basic, I'll do the translation into the VC20 basic later.

I think, hope, the code is correct typed in, but it shows an error.

Is anyone here willing to watch the program? I just can't find the error.

BR
Sven

Code: Select all

SGEFA AND SGESL WITH BASIC+ML

USES THE 4K extension for the ASSEMBLY CODE
$C000 TO $CFFF
; SAXPY = $C037 = 49207
; ISAMAX = $C072 = 49266
; SSCAL = $C0CC = 49356

5 isamax=49266:sscal=49356:saxpy=49207
10 printchr$(147)
20 print"read data..."
30 gosub1000
40 i%=0:j%=0:k%=0:n%=0:l%=0:kp1%=0:nm1%=0:info%=0:job%=0
50 t=0:s=0
80 input"n: ";n%
120 dim a(n%,n%),b(n%),x(n%),ipvt(n%)
160 for i=1 to n%:for j=1 to n%
162 a(i%,j%)=-1+2*rnd(1)
168 next:next
170 for i=1 to n%:x(i%)=-1+2*rnd(1):next
190 for i=1 to n%:s=0
200 for j=1 to n%:s=s+a(i%,j%)*x(j%):next
210 b(i%)=s:next
220 printchr$(147):print"n = ";n%:print
250 ti$="000000"
260 gosub450
270 print"sgefa:";ti/60;"sec"
290 job%=0:ti$="000000"
300 gosub700
310 print"sgesl:";ti/60;"sec"
340 s=0:for i%=1 to n%
350 s=s+abs(b(i%)-x(i%)):next
360 print:print"one-norm of error = ":prints
380 print"----------------------"
390 end
450 info%=.:nm1%=n%-1
460 if nm1% <1 then 670
480 for k%=1 to nm1%
490 kp1%=k%+1
500 q%=n%-k%+1:sys isamax,q%,a(k%,k%),l%:l%=l%+k%-1
530 ipvt(k%)=l%
540 if a(l%,k%)=0 then info%=k%:goto650
550 if l%<>k% then t=a(l%,k%):a(l%,k%)=a(k%,k%):a(k%,k%)=t
570 t=-1/a(k%,k%)
580 q%=n%-k%:sys sscal,q%,t,a(kp1%,k%)
600 for j%=kp1% to n%
610 t=a(l%,j%):if l%<>k% then a(l%,j%)=a(k%,j%):a(k%,j%)=t
620 q%=n%-k%:sys saxpy,q%,t,a(kp1%,k%),a(kp1%,j%)
650 next
670 ipvt%(n%)=n%
680 ifa(n%,n%)= 0 then info%=n%
690 return
700 nm1%=n%-1
710 if job%<>0 then 900
720 if nm1%<1 then 850
730 for k%=1 to nm1%
740 l%=ipvt%(k%):t=a(l%,0)
750 if l%<>k% then a(l%,0)=a(k%,0):a(k%,0)=t
760 q%=n%-k%:sys saxpy,q%,t,a(k%+1,k%),a(k%+1,0)
770 next
850 for k%=n% to 1 step-1
860 a(k%,0)=a(k%,0)/a(k%,k%):t=-a(k%,0)
870 q%=k%-1:sys saxpy,q%,t,a(1,k%),a(1,0)
880 next
900 return
1000 rem generated ml loader
1010 sa = 49152
1020 for n = 0 to 355
1030 read a% : poke sa+n,a%: next n
1050 return 
1060 data 32,250,192,32,58,193,32,8
1070 data 193,165,247,5,248,240,29,165
1080 data 249,164,250,32,162,187,32,88
1090 data 188,169,92,160,0,32,103,184
1100 data 32,199,187,32,22,193,32,91
1110 data 193,32,9,192,32,253,174,32
1120 data 139,176,170,32,212,187,96,32
1130 data 250,192,32,80,193,32,69,193
1140 data 32,58,193,165,247,5,248,240
1150 data 40,165,253,164,254,32,162,187
1160 data 165,251,164,252,32,40,186,165
1170 data 249,164,250,32,103,184,166,249
1180 data 164,250,32,212,187,32,34,193
1190 data 32,22,193,32,91,193,76,67
1200 data 192,96,32,250,192,32,58,193
1210 data 32,8,193,165,247,133,251,133
1220 data 253,165,248,133,252,133,254,230
1230 data 253,208,2,230,254,165,247,5
1240 data 248,240,41,165,249,164,250,32
1250 data 162,187,32,88,188,169,92,160
1260 data 0,32,91,188,48,13,240,11
1270 data 165,247,133,251,165,248,133,252
1280 data 32,199,187,76,22,193,32,91
1290 data 193,76,141,192,56,165,253,229
1300 data 251,168,165,254,229,252,32,145
1310 data 179,76,44,192,32,250,192,32
1320 data 69,193,32,58,193,165,247,5
1330 data 248,240,30,165,249,164,250,32
1340 data 162,187,165,251,164,252,32,40
1350 data 186,166,249,164,250,32,212,187
1360 data 32,22,193,32,91,193,76,213
1370 data 192,96,32,253,174,32,138,173
1380 data 32,247,183,132,247,133,248,96
1390 data 162,4,169,0,149,92,149,97
1400 data 202,16,249,133,102,96,24,165
1410 data 249,105,5,133,249,144,2,230
1420 data 250,96,24,165,251,105,5,133
1430 data 251,144,2,230,252,96,24,165
1440 data 253,105,5,133,253,144,2,230
1450 data 254,96,32,253,174,32,139,176
1460 data 133,249,132,250,96,32,253,174
1470 data 32,139,176,133,251,132,252,96
1480 data 32,253,174,32,139,176,133,253
1490 data 132,254,96,165,247,208,2,198
1500 data 248,198,247,96

Code: Select all

; ########################################################
; ASSEMBLY LANGUAGE BLAS ROUTINES FOR THE COMMODORE 64.
; TO BE CALLED FROM A C64 BASIC PROGRAM.
; LISTING OF SASUM, SAXPY, ISAMAX, SSCAL ONLY.
; ########################################################

; KERNAL ROUTINES FROM VIC-20 AND C64
; VIC-20        $C000 - $EA7A
; C64           $A000 - $BFFF
; $BBFC         C64
; $DBFC         VIC-20
;
; $AEFD         C64
; $CEFD         VIC-20

; NOTATION:
; FP1, FP2 = FLOATING POINT ACCUMULATORS 1, 2
; MEM.AY := '(A, Y)' =FL. PT. NUMBER AT ADDRESS A+256*Y
; MEM.XY := '(X, Y)' = FL.PT. NUMBER AT ADDRESS X+256*Y
;
; ADDRESSES OF (ROM) ROUTINES IN THE BASIC INTERPRETER:
;
EVAL    = $AD8A         ; GETS & EVALUATES NUMERIC EXPRESSION FROM TEXT,
                        ; RESULT PLACED IN FP1.
COMMA   = $AEFD         ; CHECK FOR COMMA
INTEGER = $B7F7         ; FP1 -> INTEGER AT (Y,A)
INTFLP  = $B391         ; FP1 := FLOAT((Y, A)) 

PTRGET  = $B08B         ; GETS NAME AND POINTER TO A VARIABLE, RETURNS
                        ; WITH (A,Y) POINTING TO EXPONENT (OF FIRST ELEMENT IN ARRAY),
                        ; FOR NUMERIC VARIABLE.

LOADFP1 = $BBA2         ; FP1 = MEM.AY
SAVEFP1 = $BBD4         ; MEM.XY = FP1

ADD     = $B86A         ; FP1 := FP1+FP2
MULT    = $BA2B         ; FP1 := FP1*FP2
ABS     = $BC58         ; FP1 := ABS(FP1)       = LSR $66
SQRT    = $BF71         ; FP1 := SQRT(FP1)

COMPARE = $BC5B         ; COMPARE FP1 WITH MEM.AY
; A=0 IF EQUAL, A=1 IF FP1 > MEM.AY, A=$FF if FP1 < MEM.AY

ADDMEM  = $B867         ; FP1 = FP1 + MEM.AY
MULTMEM = $BA28         ; FP1 = FP1 * MEM.AY

STORE   = $5C           ; FP3 : $5C-$60
FP1TOSTORE = $BBC7      ; FP3 := FP1

NLOW    = $F7           
NHIGH   = $F8

PLOW1   = $F9           ; POINTER (PTR1)
PHIGH1  = $FA
PLOW2   = $FB           ; POINTER (PTR2)              
PHIGH2  = $FC
PLOW3   = $FD           ; POINTER (PTR3)
PHIGH3  = $FE

FP1     = $61           ; FP1 : $61-$66
FP2     = $69           ; FP2 : $69-$6E

;--------------------------------------------------------------------------
        
* = $C000               ; ASSEMBLE CODE IN SPARE 4K BLOCK STARTING AT $C000

; -------------------------------------------------------------------------
; REAL FUNCTION SASUM (N,SX)
;
; SUM OF ABSOLUTE VALUES OF A VECTOR
;
; SYS ASUM,N,SX(),S
;


SASUM
                JSR GETN                        ; EVALUATE 1ST PARAMETER
                JSR GET1                        ; (PTR) -> SX()
                JSR ZEROSTORE                   ; SUM:=0
LOOPSA
                LDA NLOW                        ; N=0?
                ORA NHIGH                       
                BEQ FINSA                       ; IF SO, FINISHED
                
                LDA PLOW1                       ; SET UP ACCUM.
                LDY PHIGH1                      ; AND Y-REG.
                JSR LOADFP1                     ; THEN CALL ROM ROUTINE
                
                JSR ABS                         ; FP1 := ABS(FP1)

                LDA #<STORE
                LDY #>STORE
                JSR ADDMEM                      ; FP1 := FP1 + SUM

                JSR FP1TOSTORE                  ; SUM := FP1
                
                JSR BUMP1                       ; CALL SUBROUTINES AT
                JSR NEQNM1                      ; END OF LISTING.
                JSR LOOPSA                      ; LOOP BACK
                
FINSA
                JSR COMMA                       ; STORE RESULT (FP1) IN
                JSR PTRGET                      ; VARIABLE. THIS CODE CALLED BY
                TAX                                     ; ISAMAX ALSO.
                JSR SAVEFP1                     
                RTS
                
                
; -------------------------------------------
; SUBROUTINE SAXPY (N,SA,SX,SY)
;
; VECTOR=VECTOR+CONST*VECTOR: SY() := SY()+SA*SX()
;
; SYS AXPY,N,SA,SX(),SY()
;
                
                
SAXPY
        JSR GETN
        JSR GET3                        ; (PTR3) -> SA
        JSR GET2                        ; (PTR2) -> SX()
        JSR GET1                        ; (PTR1) -> SY()
                
LOOPSAX
        LDA NLOW                        ; N=0?
        ORA NHIGH
        BEQ FINSAX

        LDA PLOW3                       ; FP1:=SA
        LDY PHIGH3
        JSR LOADFP1

        LDA PLOW2                       ; FP1:=FP1*SX()
        LDY PHIGH2
        JSR MULTMEM

        LDA PLOW1                       ; FP1:=FP1+SY()
        LDY PHIGH1
        JSR ADDMEM
        
        LDX PLOW1                       ; SY():=FP1
        LDY PHIGH1
        JSR SAVEFP1

        JSR BUMP2                       ; MOVE PTRS TO NEXT
        JSR BUMP1                       ; ELTS OF SX & SY.

        JSR NEQNM1
        JMP LOOPSAX

FINSAX
        RTS

; -------------------------------------------
; INTEGER FUNCTION ISAMAX (N,SX)
; 
; FIND INDEX OF ELT WITH LARGEST ABSOLUTE VALUE IN VECTOR X
;
; SYS ISAMAX,N,SX(),K
;


ISAMAX
                JSR GETN
                JSR GET1                ; (PTR1) -> SX()
                JSR ZEROSTORE           ; CURRENT MAX := 0
                
                LDA NLOW
                STA PLOW2
                STA PLOW3
                LDA NHIGH                       ; PTR2 = N+1-INDEX OF
                STA PHIGH2                      ;                CURRENT MAX ELT.
                STA PHIGH3                      ; PTR3 = SAVED VALUE OF N
                INC PLOW3                       ;                PLUS 1.
                BNE LOOPMAX
                INC PHIGH3
LOOPMAX
                LDA NLOW                        ; N=0?
                ORA NHIGH
                BEQ FINMAX
                
                LDA PLOW1
                LDY PHIGH1
                JSR LOADFP1                     ; FP1 := SX()
                JSR ABS                                ; FP1 := ABS(FP1)
                
                
                LDA #<STORE
                LDY #>STORE
                JSR COMPARE                     ; FP1 & BIGGEST SO FAR
                BMI LTE                         ; IF FP1 < ...
                BEQ LTE                         ; IF FP1 = ...
                
                LDA NLOW                        ; FP1 is BIGGER
                STA PLOW2
                LDA NHIGH
                STA PHIGH2                      ; UPDATE INDEX OF MAX ELT
                
                JSR FP1TOSTORE          ; SAVE CURRENT MAX ELT
LTE
                JMP BUMP1
                JSR NEQNM1
                JMP LOOPMAX
FINMAX
                SEC                                     ; INDEX := N+1-PTR2
                LDA PLOW3
                SBC PLOW2
                TAY
                LDA PHIGH3
                SBC PHIGH2
                
                JSR INTFLP                      ; CONVERT RESULT TO
                JMP FINSA                       ; FL-PT. % GOTO SASUM

; -------------------------------------------
; SUBROUTINE SSCAL (N,SA,SX)
;
; SCALE VECTOR BY A CONSTANT: SX():= SA*SX()
;
; SYS SCAL,N,SA,SX()
;
                
                
SSCAL
                JSR GETN
                JSR GET2                        ; (PTR2) -> SA
                JSR GET1                        ; (PTR1) -> SX()
                
LOOPSC
                LDA NLOW                        ; N=0?
                ORA NHIGH
                BEQ FINSC
                
                LDA PLOW1
                LDY PHIGH1
                JSR LOADFP1                     ; FP1 := SX()
                
                LDA PLOW2
                LDY PHIGH2
                JSR MULTMEM                     ; FP1 := FP1*SA
                
                LDX PLOW1
                LDY PHIGH1
                JSR SAVEFP1                     ; SX() := FP1
                
                JSR BUMP1
                JSR NEQNM1
                JMP LOOPSC
                
FINSC
                RTS
                

                
; -------------------------------------------
; ROUTINE TO EVALUATE THE PARAMETER N AND STORE THE RESULT
; AS A 16-BIT INTEGER IN (NLOW, NHIGH).
;
                
GETN
        JSR COMMA
        JSR EVAL
        JSR INTEGER
        STY NLOW
        STA NHIGH
        RTS

; -------------------------------------------
; SET TO ZERO FP3 AND FP1, THE LATTER SO THAT SASUM, SDOT AND
; SNRM2 RETURN 0 WHEN N=0.
;

ZEROSTORE
        LDX #4                          ; 5 ELTS TO ZEROSTORE
        LDA #0

LOOPZ1
        STA STORE,X
        STA FP1,X
        DEX
        BPL LOOPZ1                      ; BRANCH IF X>=0
        STA FP1+5
        RTS

; -------------------------------------------
; THE FOLLOWING ROUTINES MOVE A POINTER ONTO THE NEXT ARRAY ELEMENT
;

BUMP1
        CLC                                     ; BUMP PTR1 BY 5
        LDA PLOW1
        ADC #5
        STA PLOW1
        BCC FIN1
        INC PHIGH1
FIN1   
                RTS

BUMP2
        CLC                                     ; BUMP PTR2 BY 5
        LDA PLOW2
        ADC #5
        STA PLOW2
        BCC FIN2
        INC PHIGH2
FIN2   
                RTS

BUMP3
        CLC                                     ; BUMP PTR3 BY 5
        LDA PLOW3
        ADC #5
        STA PLOW3
        BCC FIN3
        INC PHIGH3
FIN3   
                RTS

; -------------------------------------------
; THE FOLLOWING ROUTINES SEARCH FOR A
; NUMERIC VARIABLE (SIMPLE VAR, OF ARRAY ELEMENT) AND
; STORE A POINTER TO THE FIRST BYTE OF THE
; FLOATING POINT NUMBER IN (PTR1), (PTR2) OR (PTR3).
;

GET1
        JSR COMMA
        JSR PTRGET
        STA PLOW1
        STY PHIGH1
        RTS

GET2
        JSR COMMA
        JSR PTRGET
        STA PLOW2
        STY PHIGH2
        RTS

GET3
        JSR COMMA
        JSR PTRGET
        STA PLOW3
        STY PHIGH3
        RTS

; -------------------------------------------
        
NEQNM1
        LDA NLOW                        ; N := N-1
        BNE NM1
        DEC NHIGH
NM1     
                DEC NLOW        
        RTS

; -------------------------------------------
User avatar
Mike
Herr VC
Posts: 4841
Joined: Wed Dec 01, 2004 1:57 pm
Location: Munich, Germany
Occupation: electrical engineer

Re: Assembly BLAS routines for C64 / VIC-20

Post by Mike »

You should perhaps enlighten the people here about the purpose of those routines.
User avatar
chysn
Vic 20 Scientist
Posts: 1205
Joined: Tue Oct 22, 2019 12:36 pm
Website: http://www.beigemaze.com
Location: Michigan, USA
Occupation: Software Dev Manager

Re: Assembly BLAS routines for C64 / VIC-20

Post by chysn »

Mike wrote: Wed Sep 20, 2023 11:39 am You should perhaps enlighten the people here about the purpose of those routines.
I read the code, the website, and the associated PDF, and I still know exactly nothing about the project. I used to read the daily article about playing bridge hands in my local newspaper to get the same feeling.
VIC-20 Projects: wAx Assembler, TRBo: Turtle RescueBot, Helix Colony, Sub Med, Trolley Problem, Dungeon of Dance, ZEPTOPOLIS, MIDI KERNAL, The Archivist, Ed for Prophet-5

WIP: MIDIcast BASIC extension

he/him/his
User avatar
srowe
Vic 20 Scientist
Posts: 1340
Joined: Mon Jun 16, 2014 3:19 pm

Re: Assembly BLAS routines for C64 / VIC-20

Post by srowe »

User avatar
MrSterlingBS
Vic 20 Enthusiast
Posts: 174
Joined: Tue Jan 31, 2023 2:56 am
Location: Germany,Braunschweig

Re: Assembly BLAS routines for C64 / VIC-20

Post by MrSterlingBS »

Please excuse my problem not being described in detail. Tomorrow I will describe my plan better.
User avatar
MrSterlingBS
Vic 20 Enthusiast
Posts: 174
Joined: Tue Jan 31, 2023 2:56 am
Location: Germany,Braunschweig

Re: Assembly BLAS routines for C64 / VIC-20

Post by MrSterlingBS »

Hello,

one question!
why doesn't the routine FINSA return the calculated variable?

Best regards
Sven
User avatar
MrSterlingBS
Vic 20 Enthusiast
Posts: 174
Joined: Tue Jan 31, 2023 2:56 am
Location: Germany,Braunschweig

Re: Assembly BLAS routines for C64 / VIC-20

Post by MrSterlingBS »

My intention with the code is to integrate the Gauss algorithm into my FEM calculation. Equations can be solved using the Gauss method. The solution can be used, for example, to examine a 3D frame for strength. I think it's a very interesting application.

At the moment my basic program takes 8 minutes with 18 nodes. Using the Gauss method in Assembler the time would be shortened to about 2 minutes.

I hope this help…
Attachments
IMG_5885.gif
IMG_5885.gif (13.43 KiB) Viewed 2012 times
Frame
Frame
IMG_5884.png (8 KiB) Viewed 2012 times
Post Reply