How to use floating point routines?
- MrSterlingBS
- Vic 20 Enthusiast
- Posts: 174
- Joined: Tue Jan 31, 2023 2:56 am
- Location: Germany,Braunschweig
How to use floating point routines?
Hello,
In BASIC we write PRINT SQR(3) and get the result 1.73205081.
How can I print the square root to the screen using ROM calls?
BR
Sven
Have Fun!
- MrSterlingBS
Re: How to use floating point routines?
Sorry for the post,
I found the answer.
BR SVEN
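Code:
ldy #$03    ; argument 3 as an unsigned byte in Y
jsr $d3a2   ; convert the byte in Y to a float in FAC
jsr $df71   ; SQR: FAC = square root of FAC
jmp $ddd7   ; print the contents of FAC and return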
- MrSterlingBS
Re: How to use floating point routines?
Hello all,
how is it possible to store some values in the FAC ($61 to $66) and print them with $ddd7?
I load some values into this area, but I can't interpret the result.
BR Sven
VIC-20 (6502) coding is fun!
- chysn
- Vic 20 Scientist
- Posts: 1205
- Joined: Tue Oct 22, 2019 12:36 pm
- Website: http://www.beigemaze.com
- Location: Michigan, USA
- Occupation: Software Dev Manager
Re: How to use floating point routines?
MrSterlingBS wrote: ↑Thu Mar 02, 2023 4:50 am "how is it possible to store some values in FAC ($61 to $66) and print them out with $ddd7?"
It sounds like you might benefit from this thread (http://sleepingelephant.com/ipw-web/bul ... p?p=107429). There's a link to a book there that has really useful information about all kinds of FAC/integer/string conversions. Basically, to store a string value from the utility pointer into a FAC, you leverage part of VAL(). I have no idea what the FAC format is, but I don't need to know because BASIC has all the conversion stuff I'd need.
VIC-20 Projects: wAx Assembler, TRBo: Turtle RescueBot, Helix Colony, Sub Med, Trolley Problem, Dungeon of Dance, ZEPTOPOLIS, MIDI KERNAL, The Archivist, Ed for Prophet-5
WIP: MIDIcast BASIC extension
he/him/his
- MrSterlingBS
Re: How to use floating point routines?
Thank you for the book tip. Exactly what I was missing.
BR Sven
- Mike
- Herr VC
- Posts: 4841
- Joined: Wed Dec 01, 2004 1:57 pm
- Location: Munich, Germany
- Occupation: electrical engineer
Re: How to use floating point routines?
chysn wrote: "I have no idea what the FAC format is, [...]"
Also @MrSterlingBS: The FAC contains an extended version of the memory storage format for float numbers:
- The memory storage format uses one exponent byte and four mantissa bytes. The uppermost bit of the mantissa holds the sign.
- The FAC format contains an extra rounding byte and an extra sign byte, and the uppermost bit of the mantissa exposes the most significant digit of the mantissa, which in base-2 notation is always 1 (unless there's a zero in the FAC).
One way to 'construct' a specific float constant in memory storage format is to assign a value to a variable and then inspect BASIC memory with a monitor.
In the monitor dump for this example, there is no BASIC program in memory, so the variable XY is created directly after the program end marker (the two 0 bytes in $1001 and $1002). Each variable entry has a length of 7 bytes: $58 $59 represent the name of the variable, and $82 $49 $0F $DB $C1 are the five bytes of the float value in memory storage format, here the calculation result of 355/113.
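The format described above can be checked on a PC. The following is a minimal Python sketch, not taken from the thread or from the linked C program. The function names are made up for illustration, and the assumptions are that the exponent byte is biased by 128 with a mantissa in the range 0.5 to 1, and that the FAC sign byte only needs bit 7 set for negative values. The rounding byte is left out.

```python
import math

def cbm_to_float(b):
    """Decode a 5-byte CBM float in memory storage format."""
    exp, m1, m2, m3, m4 = b
    if exp == 0:                      # exponent byte 0 means the value is zero
        return 0.0
    sign = -1.0 if m1 & 0x80 else 1.0
    # Restore the leading 1 bit that the sign bit replaced in memory.
    mantissa = ((m1 | 0x80) << 24) | (m2 << 16) | (m3 << 8) | m4
    return sign * math.ldexp(mantissa, exp - 128 - 32)

def float_to_cbm(x):
    """Encode a Python float as 5 bytes (no range check on the exponent)."""
    if x == 0:
        return [0, 0, 0, 0, 0]
    m, e = math.frexp(abs(x))         # abs(x) = m * 2**e with 0.5 <= m < 1
    mantissa = round(m * 2**32)
    if mantissa == 2**32:             # rounding overflowed into the next power of two
        mantissa >>= 1
        e += 1
    m1 = (mantissa >> 24) & 0x7F      # drop the leading 1 ...
    if x < 0:
        m1 |= 0x80                    # ... and store the sign there instead
    return [e + 128, m1, (mantissa >> 16) & 0xFF, (mantissa >> 8) & 0xFF, mantissa & 0xFF]

def cbm_to_fac(b):
    """Unpack to the FAC layout $61-$66: exponent, mantissa with explicit top bit, sign byte."""
    exp, m1, m2, m3, m4 = b
    if exp == 0:
        return [0, 0, 0, 0, 0, 0]
    return [exp, m1 | 0x80, m2, m3, m4, m1 & 0x80]
```

With these definitions, float_to_cbm(355/113) yields $82 $49 $0F $DB $C1, the same five bytes as in the variable entry described above.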
Alternatively, a small program in C can serve the same purpose, see here: CBM floating point data type.
The USR() function expects a value in FAC and returns a value in FAC and can thus serve as user defined function. As an example, you find a faster implementation of the square root function here: Basic Speedup Tip.
Another example of mine speeds up the calculation of the inner loop of a Mandelbrot fractal by a factor of 3, see here (the accompanying archive contains the source of the machine code routine): fast Mandelbrot fractal generator. This example also shows how to transfer multiple, variable float values to machine code as parameters of a SYS call.
- MrSterlingBS
Re: How to use floating point routines?
http://sleepingelephant.com/ipw-web/bul ... 89&start=5
I have read the book VC-20 intern carefully.
After typing in the square root (SQR) routine, I began to realize that there are two errors in the book.
This helped me to understand a lot of the code.
Is it the same routine as yours and Wim's, or is this another one?
(mod: link to original replied-to post added for reference)
- Mike
Re: How to use floating point routines?
MrSterlingBS wrote: "I have read the book VC-20 intern carefully."
Fine! Just to note - this book serves me very well as reference. In that sense for myself I couldn't say I had ever 'completed' the book.
MrSterlingBS wrote: "After typing in the square root SQR routine I began to realize that there are two errors in the book."
Where exactly? In the routine or somewhere else? Machine code is very typo-prone. Are you sure it wasn't a transferring error?
Or do you mean a conceptual error? In my view, VC-20 intern tells the relevant things without too much fluff or superfluous extras around, which may make it a difficult read for people that need further explanations.
Actually, I did not try out the square root routine in VC-20 intern myself ...
MrSterlingBS wrote: "Is it the same routine as yours and Wim's? Or is this another?"
... as my routine works the same, i.e. it uses the same algorithm (Heron's method) but a different implementation. Wim adapted my version for WimBASIC.
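For readers who do not know Heron's method: it repeatedly replaces an estimate x with the average of x and a/x. The following Python sketch only illustrates the iteration. It is not a port of the routine in VC-20 intern or of Mike's version. The starting value (halving the binary exponent) and the fixed iteration count are assumptions chosen for this example.

```python
import math

def heron_sqrt(a, iterations=6):
    """Approximate sqrt(a) with Heron's method: x <- (x + a / x) / 2."""
    if a < 0:
        raise ValueError("square root of a negative number")
    if a == 0:
        return 0.0
    _, e = math.frexp(a)              # a = m * 2**e with 0.5 <= m < 1
    x = math.ldexp(1.0, e // 2)       # first estimate: halve the exponent
    for _ in range(iterations):
        x = 0.5 * (x + a / x)         # average of the estimate and a / estimate
    return x
```

Because the first estimate is already within a factor of about 1.5 of the true root, the number of correct digits roughly doubles with each pass, so a handful of iterations is enough.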
- MrSterlingBS
Re: How to use floating point routines?
I mean in the SQR routine.
In line 260 they use A1TOA3 ; AKKU1 -> AKKU3
instead of A1TOA4 ; AKKU1 -> AKKU4
and in line 450 they use the LDA opcode instead of LDY.
Greetings
Sven
- Mike
Re: How to use floating point routines?
MrSterlingBS wrote: "I mean in the SQR routine."
(for reference: here are the pages 17, 18 and 19 of "VC-20 intern".)
MrSterlingBS wrote: "In line 260 they use A1TOA3 ; AKKU1 -> AKKU3 instead of A1TOA4 ; AKKU1 -> AKKU4 and in line 450 they use the LDA opcode instead of LDY."
These errors are actually of the "transfer" kind. The authors tried to make a list file by hand and wrongly substituted that one mnemonic and that one symbol. The hex data, however, is correct and produces a functioning routine.
I first do an "empty" memory dump with the M 033C 036D command to avoid typing in the running addresses in the left-most column.
Having typed in the hex data of the top part, I enter M 036E 0375 to continue the type-in process with the remainder of the routine.
Entering >00 4C 3C 03 (which gets "expanded" to >0000 4C 3C 03 AA D1 as seen above) then diverts the USR() vector to the new routine. PRINTUSR(10) does the smoke test and gives the expected result, 3.16227766 ...
... here's the routine for download: (db.sqr.prg), do POKE1,60:POKE2,3 to activate the routine after LOAD"DB.SQR.PRG",8,1 (and a NEW to correct some BASIC pointers).
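The monitor entry and the two POKEs above write the same bytes: the USR() vector at $0000 to $0002 holds a JMP opcode ($4C) followed by the low and high byte of the target address. A small Python sketch of that arithmetic, with a made-up helper name, for the routine address $033C used in this post:

```python
def usr_vector_bytes(target):
    """Return the three bytes of a 6502 JMP to `target`: opcode, low byte, high byte."""
    return [0x4C, target & 0xFF, (target >> 8) & 0xFF]

opcode, low, high = usr_vector_bytes(0x033C)
# low and high are the values for POKE1,... and POKE2,...
```

For $033C this gives low = 60 and high = 3, matching POKE1,60:POKE2,3.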
- MrSterlingBS
Re: How to use floating point routines?
Dear Mike,
thanks a lot for your feedback and help.
I tested and tried the file.
You made a small typo in the POKEs. It should of course be POKE 2,03.
Are you testing me?
Another question: I use VICMON at $6000 (SYS 24576).
Why do you use your monitor at $9800 (SYS 38912)?
Best regards
Sven
- Mike
Re: How to use floating point routines?
MrSterlingBS wrote: "You made a small typo in the POKEs. [...]"
Thanks for pointing that out. I've corrected it - it wasn't intentional.
MrSterlingBS wrote: "Another question. I use the VICMON @ $6000 or SYS 24576. Why do you use your Monitor @ $9800 or SYS38912"
Because I can.
- MrSterlingBS
SQR (X) from Michael J. Mahon
Hello,
after some reading and comparisons about the floating-point SQR on the VIC and C64, I found something interesting for the Apple II.
The code is from Michael J. Mahon, Copyright 2009. http://michaeljmahon.com/USR.SQR.pdf
I have converted the Apple code to VIC code. The cycle counts labelled CYCLES-VIC are from the routine that uses Heron's method.
That routine is from the German book VC-20 intern.
BR
Sven
Code:
SIGN equ $DC2B
ILLEGAL equ $D248
EXP equ $61
AKKU3 equ $57
AKKU4 equ $5C
COUNT equ $67
A1TOA3 equ $DBCA
A1TOA4 equ $DBC7
MEMDIV equ $DB0F
MEMPLUS equ $D867
FLPASC equ $DDDD
FPADD equ $D86A
FPSUB equ $D853
FPMULT equ $DA30
FPDIV equ $DB12
FPEXP equ $DF7B
FPABS equ $DC58 ; LSR $66
FPSIN equ $E268
FPCOS equ $E261
FPSQR equ $DF71
FLPINT equ $D1BF ; FACC = INT(FACC)
INTFLP equ $D391 ;
MVAFAC equ $DA8C
MVFACC equ $DBA2
STACK equ $0100
STROUT equ $CB1E
CLRSCR equ $E55F
WRTF equ $E742
PRTFIX equ $DDCD ; values from 0 to 65000; X = low byte, A = high byte
;The first of the two most used locations is called the Floating Point Accumulator.
;The second is called the Floating Point Argument. Depending on the source,
;they are called either “FAC1” and “FAC2”, or “FAC” and “ARG.”
;After an operation, FAC1 will hold the result.
;FAC1:
;$61 holds the exponent
;$62-$65 holds the mantissa
;$66 holds the sign in bit 7
;FAC2:
;$69 holds the exponent
;$6a-$6d holds the mantissa
;$6e holds the sign in bit 7
TEMP1 equ $fb
ARG equ $69
FAC equ $61
MOVAF equ $dc0f
EXTRAFAC equ $70
* = $1001
; BASIC program to boot the machine language code
db $0b, $10, $0a, $00, $9e, $34, $31, $30, $39, $00, $00, $00
sei ; stop interrupts
jsr CLRSCR ; clear screen
jsr Primm ; print on screen
db "SQUAREROOT: ",$00
ldy #>NUM419
lda #<NUM419
jsr MVFACC
jsr FLPASC
LDY #>STACK
LDA #<STACK
JSR STROUT ; print squareroot to calculate
ldy #>NUM419
lda #<NUM419
jsr MVFACC ; move floatpoint number to FAC again
; code from the book VIC-20 MACHINE LANGUAGE GUIDE Abacus Software
lda #$FF
sta $9129 ; decrements every 256 µs ; 4 cycles
lda #$00
sta $9128 ; decrements every µs ; 4 cycles
cld ; ensure binary mode ; 2 cycles
clc ; clear carry flag ; 2 cycles
; ************** Start Counter with 10 Cycles ********************
; lda FAC ; Is FAC zero?
; beq out1 ; -Yes, sqrt(0) = 0.
; bit SIGN ; -No, is argument negative?
; bpl ok1 ; -No, go ahead.
;out1:
; jmp ILLEGAL ; -Yes, Illegal Quantity!
sqr:
ok1: lda FAC ; Divide exponent by 2
clc ; for initial approximation.
adc #$80 ; Move hi bit to C & clear it.
ror ; Divide by 2, restore hi bit.
sta FAC
bcc even1 ; If exp even, no adjust,
inc FAC ; If odd, inc exponent
lsr FAC+1 ; and shift it right.
ror FAC+2
ror FAC+3
ror FAC+4
ror EXTRAFAC ; (save low bit)
even1: jsr MOVAF ; Move rounded FAC to ARG
ldx #3
lda #0 ; Zero FAC & TEMP1 mants
clear1: sta FAC+1,x
sta TEMP1+1,x
dex
bpl clear1
ldx #32 ; Gen 32 bits of sqrt
bne start1 ; (always)
loopa: lda TEMP1+1 ; Compare 40 bits of X
cmp FAC+1 ; with current sqrt.
bne check
lda TEMP1+2 ; Compare next byte
cmp FAC+2
bne check
lda TEMP1+3 ; Compare next byte
cmp FAC+3
bne check
lda TEMP1+4 ; Compare next byte
cmp FAC+4
bne check
start1: lda ARG+1 ; Compare final byte
cmp #%01000000
check: bcc shift0 ; X > sqrt
lda ARG+1 ; X <= sqrt, so subtract
sbc #%01000000
sta ARG+1
lda TEMP1+4
sbc FAC+4
sta TEMP1+4
lda TEMP1+3
sbc FAC+3
sta TEMP1+3
lda TEMP1+2
sbc FAC+2
sta TEMP1+2
lda TEMP1+1
sbc FAC+1
sta TEMP1+1
shift0: ;rol32 FAC+1 ; Rotate C into sqrt
rol FAC+4
rol FAC+3
rol FAC+2
rol FAC+1
;asl32 ARG+1 ; Left shift X
asl ARG+4
rol ARG+3
rol ARG+2
rol ARG+1
;rol32 TEMP1+1
rol TEMP1+4
rol TEMP1+3
rol TEMP1+2
rol TEMP1+1
;asl32 ARG+1 ; by 2 bits.
asl ARG+4
rol ARG+3
rol ARG+2
rol ARG+1
;rol32 TEMP1+1
rol TEMP1+4
rol TEMP1+3
rol TEMP1+2
rol TEMP1+1
dex ; Done?
bne loopa ; -No, keep looping.
lda TEMP1+1 ; -Yes, compute final
cmp FAC+1 ; bit of sqrt
ror EXTRAFAC ; in FAC extension.
; rts
; ************** Stop Counter ********************
lda $9128 ; low
ldx $9129 ; high
sta $FB ; low
stx $FC ; high
jsr Primm
db $0d,$0d,"SOLUTION:",$00
jsr FLPASC ;FLPASC
ldy #>STACK
lda #<STACK
jsr STROUT
lda #$FF
sbc $FC
sta $FC
lda #$FF
sbc $FB
sta $FB
jsr Primm
db $0d,$0d,$0d,"CYCLES: ",$00
ldx $FB ; low
lda $FC ; high
jsr PRTFIX
jmp * ; endless loop
NUM1:
db $8A,$7A,$00,$00,$00 ; Decimal 1000 SQR 31.6227766 CYCLES-VIC 10874 CYCLES-Apple 5168
NUM10:
db $84,$20,$00,$00,$00 ; Decimal 10 SQR 3.16227766 CYCLES-VIC 10467 CYCLES-Apple 5187
NUM66:
db $87,$04,$00,$00,$00 ; Decimal 66 SQR 8.12403841 CYCLES-VIC 10792 CYCLES-Apple 5283
NUM419:
db $89,$51,$80,$00,$00 ; Decimal 419 SQR 20.4694895 CYCLES-VIC 11021 CYCLES-Apple 5189
NUM3781:
db $8C,$6C,$50,$00,$00 ; Decimal 3781 SQR 61.4898366 CYCLES-VIC 11036 CYCLES-Apple 5246
Primm:
pla
sta $03
pla
sta $04
X10D6:
inc $03
bne X10DC
inc $04
X10DC:
ldy #$00
lda ($03),y
beq X10E7
jsr WRTF
bcc X10D6
X10E7:
lda $04
pha
lda $03
pha
rts
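The main loop of the listing above produces one bit of the root per pass: it shifts two bits of the argument into a remainder, compares, and subtracts where possible. The following Python sketch shows that general shift-and-subtract technique for plain integers. It is an illustration under that reading of the listing, not a line-by-line translation of the 6502 code, and it ignores the exponent handling and the extension byte.

```python
def isqrt_bitwise(n, bits=16):
    """Integer square root by shift-and-subtract, one result bit per pass.

    `bits` is the number of result bits, so n must be below 2 ** (2 * bits).
    """
    root = 0
    rem = 0
    for i in range(bits):
        # Bring the next two bits of n (from the top down) into the remainder.
        shift = 2 * (bits - 1 - i)
        rem = (rem << 2) | ((n >> shift) & 3)
        trial = (root << 2) | 1       # 4 * root + 1: cost of appending a 1 bit
        root <<= 1                    # make room for the new result bit
        if rem >= trial:
            rem -= trial
            root |= 1                 # the new bit is 1
    return root
```

A call such as isqrt_bitwise(800) returns 28, the integer part of the square root of 800.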
- MrSterlingBS
integer sqrt15
Hi,
and here is an integer sqrt from TobyLobster's comparison, for your comparison.
https://github.com/TobyLobster/sqrt_test
I used SQRT15, but I don't reach the fast execution times.
BR
Sven
Code:
; input in X (high byte) and A (low byte)
; output in A
; bytes: 476 (or maybe 421 if you can find something else to do with the XXXs)
; file      memory (bytes)  worst case cycles  average cycle count
; sqrt15.a  512             120                35.7
CLRSCR equ $E55F
WRTF equ $E742
PRTFIX equ $DDCD ; values from 0 to 65000; X = low byte, A = high byte
mlo equ $30 ; input
mhi equ $32 ; input
tmp equ $34 ; temp
* = $1001
; BASIC program to boot the machine language code
db $0b, $10, $0a, $00, $9e, $34, $31, $30, $39, $00, $00, $00
sei ; stop interrupts
jsr CLRSCR ; clear screen
jsr Primm
db "SQUAREROOT:",$00
ldx #32 ; low
stx mlo
lda #3
sta mhi
jsr PRTFIX
; code from the book VIC-20 MACHINE LANGUAGE GUIDE Abacus Software
lda #$FF
sta $9129 ; decrements every 256 µs ; 4 cycles
lda #$00
sta $9128 ; decrements every µs ; 4 cycles
cld ; ensure binary mode ; 2 cycles
clc ; clear carry flag ; 2 cycles
; ************** Start Counter with 10 Cycles ********************
; ***************************************************************************************
ldx mhi
lda mlo
sqrt15:
cpx #$40
bcc no
ldy sqtab-$40,X
bmi skip
cmp stab-$41,X
tya
adc #$80
;rts
jmp sqrtend
skip:
tya
jmp sqrtend
;rts
no:
cpx #0
beq lo
ldy #done-2-bran
sta tmp
txa
retry:
dey
asl tmp
rol
asl tmp
rol
cmp #$40
bcc retry
sty bran+1
tax
lda sqtab-$40,X
bmi bran
tay
lda tmp
cmp stab-$41,X
tya
adc #$80
bran:
bmi bran
lo:
cmp #$40
bcc nolo
tax
lda sqtab-$40,X
ora #$80
div16:
lsr
lsr
lsr
lsr
done:
;rts
jmp sqrtend
nolo:
asl
cmp #$20
bcc nolo2
asl
tax
lda sqtab-$40,X
lsr
ora #$40
bne div16
nolo2:
tax
lda smalltab,X
sqrtend:
;rts ;
sta mlo
ldx #0
stx mhi
; ************** Stop Counter ********************
lda $9128 ; low
ldx $9129 ; high
sta $FB ; low
stx $FC ; high
jsr Primm
db $0d,$0d,"SOLUTION:",$00
ldx mlo
lda mhi
jsr PRTFIX
lda #$FF
sbc $FC
sta $FC
lda #$FF
sbc $FB
sta $FB
jsr Primm
db $0d,$0d,$0d,"CYCLES:",$00
ldx $FB ; low
lda $FC ; high
jsr PRTFIX
jmp * ; endless loop
Primm:
pla
sta $03
pla
sta $04
X10D6:
inc $03
bne X10DC
inc $04
X10DC:
ldy #$00
lda ($03),y
beq X10E7
jsr WRTF
bcc X10D6
X10E7:
lda $04
pha
lda $03
pha
rts
sqtab:
db $80,$00,$01,$02,$03,$04,$05,$06
db $07,$08,$09,$0a,$0b,$0c,$0d,$0e
db $8f,$90,$10,$11,$12,$13,$14,$15
db $96,$16,$17,$18,$19,$1a,$9b,$1b
db $1c,$1d,$1e,$9f,$a0,$20,$21,$22
db $a3,$23,$24,$25,$26,$a7,$27,$28
db $29,$aa,$2a,$2b,$2c,$ad,$2d,$2e
db $af,$b0,$30,$31,$b2,$32,$33,$34
db $b5,$35,$36,$b7,$37,$38,$b9,$39
db $3a,$bb,$3b,$3c,$bd,$3d,$3e,$bf
db $c0,$40,$c1,$41,$42,$c3,$43,$44
db $c5,$45,$46,$c7,$47,$48,$c9,$49
db $4a,$cb,$4b,$cc,$4c,$4d,$ce,$4e
db $cf,$d0,$50,$d1,$51,$52,$d3,$53
db $d4,$54,$55,$d6,$56,$d7,$57,$58
db $d9,$59,$da,$5a,$db,$5b,$5c,$dd
db $5d,$de,$5e,$df,$e0,$60,$e1,$61
db $e2,$62,$e3,$63,$64,$e5,$65,$e6
db $66,$e7,$67,$e8,$68,$69,$ea,$6a
db $eb,$6b,$ec,$6c,$ed,$6d,$ee,$6e
db $ef,$f0,$70,$f1,$71,$f2,$72,$f3
db $73,$f4,$74,$f5,$75,$f6,$76,$f7
db $77,$f8,$78,$f9,$79,$fa,$7a,$fb
db $7b,$fc,$7c,$fd,$7d,$fe,$7e,$ff
;XXX=$ee ; unused
stab:
db $01,$04,$09,$10,$19,$24,$31
db $40,$51,$64,$79,$90,$a9,$c4,$e1
db $ee,$ee,$21,$44,$69,$90,$b9,$e4
db $ee,$11,$40,$71,$a4,$d9,$ee,$10
db $49,$84,$c1,$ee,$ee,$41,$84,$c9
db $ee,$10,$59,$a4,$f1,$ee,$40,$91
db $e4,$ee,$39,$90,$e9,$ee,$44,$a1
db $ee,$ee,$61,$c4,$ee,$29,$90,$f9
db $ee,$64,$d1,$ee,$40,$b1,$ee,$24
db $99,$ee,$10,$89,$ee,$04,$81,$ee
db $ee,$81,$ee,$04,$89,$ee,$10,$99
db $ee,$24,$b1,$ee,$40,$d1,$ee,$64
db $f9,$ee,$90,$ee,$29,$c4,$ee,$61
db $ee,$ee,$a1,$ee,$44,$e9,$ee,$90
db $ee,$39,$e4,$ee,$91,$ee,$40,$f1
db $ee,$a4,$ee,$59,$ee,$10,$c9,$ee
db $84,$ee,$41,$ee,$ee,$c1,$ee,$84
db $ee,$49,$ee,$10,$d9,$ee,$a4,$ee
db $71,$ee,$40,$ee,$11,$e4,$ee,$b9
db $ee,$90,$ee,$69,$ee,$44,$ee,$21
db $ee
smalltab:
db $00,$e1,$01,$c4,$01,$a9,$01
db $90,$02,$79,$02,$64,$02,$51,$02
db $40,$02,$31,$03,$24,$03,$19,$03
db $10,$03,$09,$03,$04,$03,$01,$03
- Mike
Re: How to use floating point routines?
The timing code you use here has several issues.
1. First of all, the code makes an inconsistent read of the 16-bit timer value. After reading the low byte with LDA $9128, the high byte is not latched by the VIA, and when the high byte is read 4 cycles later by LDX $9129, this can result in an incorrect reading, as in this example:
Code:
LDA $9128 ; timer value is $7002 - CPU reads low byte as $02
LDX $9129 ; timer value now is $6FFE (4 cycles later) - CPU reads high byte as $6F
... which results in a combined "reading" of $6F02. This is wrong.
This can be remedied by subtracting 4 from the accumulator value while leaving X uncorrected (the possible underflow has already been taken care of by the VIA chip itself!). This then gives the timer value at the time LDX $9129 reads its high byte:
Code:
LDA $9128 ; read low byte
LDX $9129 ; read high byte, 4 cycles later
SEC ; correct ...
SBC #$04 ; ... low byte
This now gives $6FFE in X/A as expected, and it also works when there was no underflow. Another important aspect is that the correction itself needs only a constant number of cycles, which helps when eliminating a measuring offset (see point 3 below).
2. The bench framework does not disable interrupts while running the routine under test. When the standard interrupt kicks in, you will get a reading that is several hundred cycles off.
3. You should not need to count cycles for the bench framework's own infrastructure. It is much simpler to set up a free-running timer with maximum period (this needs the latch value $FFFE, as the timer underflows to $FFFF before being restarted with the latch value), read the timer value twice, subtract those values to get a time difference, and eliminate any remaining offset by empirically timing a suitable "empty" routine.
4. Embedding the prompts into machine code makes for a nice exercise but is not strictly necessary. You could just as well use a surrounding BASIC program to output the results (using defined memory locations to transfer the two 16-bit timer values).
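The arithmetic behind points 1 and 3 can be tried out in a few lines of Python. This is only a model of a 16-bit down-counter, not code for the VIC-20. The function names and the `offset` parameter are made up for this sketch, and a timer period of 65536 cycles is assumed, as it follows from the latch value $FFFE mentioned in point 3.

```python
def naive_read(timer, gap=4):
    """Low byte read at `timer`, high byte read `gap` cycles later, combined as-is."""
    low = timer & 0xFF
    high = ((timer - gap) & 0xFFFF) >> 8
    return (high << 8) | low

def corrected_read(timer, gap=4):
    """Same two reads, then subtract `gap` from the low byte only (SEC / SBC #$04)."""
    low = timer & 0xFF
    high = ((timer - gap) & 0xFFFF) >> 8
    low = (low - gap) & 0xFF          # 8-bit result; the borrow is deliberately ignored
    return (high << 8) | low

def elapsed_cycles(start, end, offset=0):
    """Difference of two readings of a free-running down-counter, minus a fixed offset."""
    return ((start - end) & 0xFFFF) - offset
```

With the timer at $7002, naive_read returns $6F02 and corrected_read returns $6FFE, the two values from point 1. The `offset` argument stands for the constant overhead found by timing an "empty" routine.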
...
That being said, with regard to the two routines you present here ("Apple II square root" and "integer 15 square root"), any Copyright claims of those people are questionable at best. IIRC there is prior art for that Apple II SQR routine within Acorn BBC BASIC. Furthermore, several variants of an integer square root routine for different CPU architectures (65xx, Z80, 68K, ARM, Intel - mostly found on Usenet) are lingering on the HDD of my PC for roughly three decades now, so there ...
P.S. I followed up with my own implementation of a stop-watch timer routine in the thread "Micro Measurement Timer"