Posted: Fri Jan 13, 2012 11:12 am
by Kweepa
Kananga wrote:The Martin Galway Parallax uses 30 ZP locations, but only 24 are free. Not easy to resolve that clash, at least not without recompiling the SID.
Mike said earlier that 32 locations ($30-$4F) are free...?

Posted: Fri Jan 13, 2012 1:44 pm
by Kananga
Kweepa wrote:
Kananga wrote:The Martin Galway Parallax uses 30 ZP locations, but only 24 are free. Not easy to resolve that clash, at least not without recompiling the SID.
Mike said earlier that 32 locations ($30-$4F) are free...?
Not in the version I got, and its ZP area should be almost identical to the original (plus two bytes at label "excol" to preserve the extra color in $900e).
But see for yourself: Source

Posted: Fri Jan 13, 2012 6:04 pm
by Kweepa
Wow, that's it? Just 700 lines of code? Amazing!

I don't see why the NMI routine is jammed into the zero page. Seems like overkill. Surely it's not self-modifying to such a degree that moving it out would slow down the emulation...

At the very least you could OR the excol directly into the cliptab on startup.
Then you could have this shell outside the zero page:

Code:

NMI:
        pha
        txa
        pha
        jsr     nmi_on_zeropage
        sta     $900e
        pla
        tax
        pla
        bit     $9114
        rti
Which would save 14 more bytes.
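
For what it's worth, folding the extra color into the table at startup could look something like the sketch below (ca65-style). Only cliptab is a name from the thread; init_cliptab, excol_value and CLIPTAB_LEN are hypothetical, and the loop assumes the table has at most 128 entries and that its entries only use the low (volume) nibble.

```asm
; Hypothetical one-time init: OR the extra color bits into every
; clip table entry so the NMI no longer needs its own ora.
; excol_value and CLIPTAB_LEN are assumed names, not from the source.
init_cliptab:
        ldx     #CLIPTAB_LEN-1          ; last index (table <= 128 entries)
@loop:
        lda     cliptab,x
        ora     excol_value             ; color bits for the high nibble of $900e
        sta     cliptab,x
        dex
        bpl     @loop                   ; falls through once x wraps to $ff
        rts
```

Called once before the NMI is enabled, this would let the interrupt routine store the table value to $900e unchanged.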

Posted: Sun Jan 15, 2012 4:32 am
by andym00
It's rather massively self modifying really, but I think there are enough cycles to be saved that, with enough thinking, you could pull it out of the ZP without losing any speed..

Just looking quickly, looks like you can save 3 bytes on the

Code:

;-      oscillator switch

o1wav_:
        jmp     o1off
since this is always preceded by the and #$1f which will set N appropriately, so this could be changed to:

Code:

;-      oscillator switch

o1wav_:
        bpl     o1off
unless I'm missing something..

Also, there's 1 cycle per voice to be saved in the low byte handling of the phase accumulator..

Code:

;-----  voice 1

o1acc_:
        lda     #$00                    ; o1acc
        clc
o1add_:
        adc     #$00                    ; o1add
        sta     o1acc_+1
o1acch_:
        lda     #$00                    ; o1acch
o1addh_:
        adc     #$00                    ; o1addh
        sta     o1acch_+1
By storing o1acc in a separate ZP location you can do:

Code:

;-----  voice 1

o1acc_:
        lax     o1acc_value             ; o1acc, now in its own ZP byte
o1add_:
        sbx     #$00                    ; negated o1add (sbx subtracts)
        stx     o1acc_value
o1acch_:
        lda     #$00                    ; o1acch
o1addh_:
        adc     #$00                    ; o1addh
        sta     o1acch_+1
That means a lda zp instead of a lda imm (one cycle more), but it saves the 2 cycles required for the clc, as long as you're prepared to use the illegals. So a net saving of one cycle per voice, with no change in ZP cost..
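
One detail worth spelling out: sbx computes X = (A AND X) - operand and sets carry like a compare, so the byte patched into o1add_+1 has to be the negated increment. A minimal sketch of the patching side, where set_o1add and freq_lo are hypothetical names and the real code may write this operand quite differently:

```asm
; Sketch only: patch the low-byte increment for the sbx variant.
; freq_lo is an assumed name for the new low-byte increment.
set_o1add:
        sec
        lda     #$00
        sbc     freq_lo                 ; a = 0 - freq_lo (mod 256)
        sta     o1add_+1                ; operand byte of the sbx at o1add_
        rts
```

With a non-zero increment the carry left by sbx matches the carry the clc/adc pair would have produced. With an increment of 0 it does not: sbx #$00 always leaves carry set, so the following adc would add one to the high byte, and that case probably needs special handling.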

So those 2 combined mean you can save 6 cycles each update, and 3 zeropage bytes..

I doubt lax #imm works correctly on the Vic, does it? If it did, you could save a further 1 byte and 1 cycle per voice as well.. But I don't know what happens with this instruction on the Vic-20.. On the 64 it's unstable, on the 128 it's stable..

I shall apply more thinking on my return from my sunday morning taxi duties to small human beings :)

Posted: Sun Jan 15, 2012 4:39 am
by Kananga
Kweepa wrote:Which would save 14 more bytes.
And use 12 additional cycles for an NMI routine that is called every 177 cycles.

Posted: Sun Jan 15, 2012 4:51 am
by Kananga
andym00 wrote:It's rather massively self modifying really
Hold that thought. ;)
andym00 wrote: Just looking quickly, looks like you can save 3 bytes on the

Code:

;-      oscillator switch

o1wav_:
        jmp     o1off
since this is always preceded by the and #$1f which will set N appropriately, so this could be changed to:

Code:

;-      oscillator switch

o1wav_:
        bpl     o1off
unless I'm missing something..
Um, look at that:

Code:

;-----  table of selfmod locations

selfmodloc:
        .byte o1add_,o1addh_,o1wav_,o1pw_,o1ple_,o1phe_,o1lev_
        .byte o2add_,o2addh_,o2wav_,o2pw_,o2ple_,o2phe_,o2lev_
        .byte o3add_,o3addh_,o3wav_,o3pw_,o3ple_,o3phe_,o3lev_
So unless you are prepared to change the way ALL of these switches work in a consistent way, just do not touch any of them.

Posted: Sun Jan 15, 2012 5:36 am
by andym00
Kananga wrote: Um, look at that:

Code:

;-----  table of selfmod locations

selfmodloc:
        .byte o1add_,o1addh_,o1wav_,o1pw_,o1ple_,o1phe_,o1lev_
        .byte o2add_,o2addh_,o2wav_,o2pw_,o2ple_,o2phe_,o2lev_
        .byte o3add_,o3addh_,o3wav_,o3pw_,o3ple_,o3phe_,o3lev_
So unless you are prepared to change the way ALL of these switches work in a consistent way, just do not touch any of them.
You don't need to change the switches, merely the values stored through selfmodloc+2 which is only the waveform selector..

Code:

selfmodloc:
        .byte o1add_,o1addh_,o1wav_,o1pw_,o1ple_,o1phe_,o1lev_
        .byte o2add_,o2addh_,o2wav_,o2pw_,o2ple_,o2phe_,o2lev_
        .byte o3add_,o3addh_,o3wav_,o3pw_,o3ple_,o3phe_,o3lev_


;-----  table of wavetypes per channel

wavetypes:
        .byte o1off,o1tri,o1saw,o1pul,o1noi,0,0
        .byte o2off,o2tri,o2saw,o2pul,o2noi,0,0
        .byte o3off,o3tri,o3saw,o3pul,o3noi,0,0
And the value stored through the selfmodloc+2 offset is one of the above in the wavetypes table..

It's only storing the low byte of the destination routine's address, and there's already one entry per waveform for each voice anyway. That table merely needs modifying to hold the relative distance of the branch for each waveform type rather than the absolute ZP location to jump to..

Although there are 2 additional 0 bytes per row in the table, I don't see how they can get referenced. In fact it looks like it's just handy to have the table that size, because the Y register that indexes the self-mod table (selfmodloc:) is also used as the index into wavetypes:
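
A rough sketch of what that relative-offset table might look like in ca65 syntax. It assumes every target lies within -128..+127 bytes of the end of its branch, which would need checking against the real ZP layout:

```asm
; Sketch: wavetypes holding branch displacements instead of ZP addresses.
; A branch operand is relative to the address after the 2-byte branch,
; hence the (oNwav_+2) base. The < operator keeps the low byte so that
; backward (negative) displacements assemble as $80-$ff.
wavetypes:
        .byte <(o1off-(o1wav_+2)),<(o1tri-(o1wav_+2)),<(o1saw-(o1wav_+2))
        .byte <(o1pul-(o1wav_+2)),<(o1noi-(o1wav_+2)),0,0
        .byte <(o2off-(o2wav_+2)),<(o2tri-(o2wav_+2)),<(o2saw-(o2wav_+2))
        .byte <(o2pul-(o2wav_+2)),<(o2noi-(o2wav_+2)),0,0
        .byte <(o3off-(o3wav_+2)),<(o3tri-(o3wav_+2)),<(o3saw-(o3wav_+2))
        .byte <(o3pul-(o3wav_+2)),<(o3noi-(o3wav_+2)),0,0
```

The row layout (five entries plus two padding bytes) is kept, so the same Y index should still work for both tables.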

Posted: Sun Jan 15, 2012 11:34 am
by Kweepa
Sorry I jumped in half-cocked.
Ok, the NMI routine is about 160 cycles at most, so there's a tiny bit of wiggle room.

Code:

NMI:
        sta     storea                  ; store a & x
        stx     storex
        jmp     nmi_on_zeropage
And at the end, jmp back to do:

Code:

return_from_nmi:
        sta     $900e                   ; play sample

;-      return from NMI

        lda     storea                  ; restore a & x
        ldx     storex

        bit     $9114                   ; NMI clear
        rti
That's (just?) 4 extra cycles, since we can eliminate the excol ora. It also saves 12 bytes on the ZP (including the ora).
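
Roughly how that tally works out, assuming the excol ora is a 2-cycle immediate ora (a guess based on it taking two bytes):

```asm
; Cycle tally for the jmp-based shell (sketch, not measured on hardware).
NMI:
        sta     storea                  ; unchanged from the original
        stx     storex                  ; unchanged from the original
        jmp     nmi_on_zeropage         ; +3 cycles: jump into the ZP routine
; ... ZP routine runs, minus the 2-cycle immediate ora for excol ...
;       jmp     return_from_nmi         ; +3 cycles: jump back out
; net cost: 3 + 3 - 2 = 4 extra cycles per NMI
```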

andym, if LAX is unstable on the c64 I'm pretty sure it will be unstable on the VIC.

Posted: Sun Jan 15, 2012 1:09 pm
by FD22
I'm not aware that LAX is unstable on the revision of the 6502 used by the VIC. I've run code that uses that opcode on many VICs and never seen it misbehave. In fact, LAX is typically one of the more stable illegal opcodes in most 6502 variants, and is rarely troublesome. The only place I'd expect to see it fail is on the various CMOS 6502 families, where all illegal opcodes are mapped to NOP.

Can you tell us the specific board and CPU revision of the C64 on which you've observed that LAX is flakey?

EDIT: just noticed you're talking about the specific LAX #immediate addressing mode - this IS flakey on most 6502s and generally not included in lists of LAX addressing mode documentation. So the question flips around - what's the board and CPU of the 128 where it DOES work? ;)

Posted: Mon Jan 16, 2012 6:55 am
by Kananga
andym00 wrote: You don't need to change the switches, merely the values stored through selfmodloc+2 which is only the waveform selector..
You are right, it could be done. 3 well saved bytes. :)

Posted: Mon Jan 16, 2012 7:55 am
by andym00
FD22 wrote:just noticed you're talking about the specific LAX #immediate addressing mode - this IS flakey on most 6502s and generally not included in lists of LAX addressing mode documentation. So the question flips around - what's the board and CPU of the 128 where it DOES work? ;)
I've no idea :) It's merely mentioned on the Oxyron opcode charts as being stable on the 128..

Apart from lax #$00 which I'm beginning to use more now as it seems stable on the 64 and Ataris, I've never dared use any other value for it, although it does seem there's some good value you can get away with..

It was just a wild hope that it might work on the Vic, not that I'm in a position to try it on real hardware right now, but I will get around to it..

Okay, a short google later..
This is the place where lax #imm is mentioned as being stable on the 128.. http://oxyron.de/html/opcodes02.html

Here, it's disputed :) http://www.lemon64.com/forum/viewtopic. ... b4eeb9ea79

Posted: Mon Jan 16, 2012 11:30 am
by rhurst
46'50" into the "Reverse Engineering the MOS 6502 CPU" video, the talk deals specifically with LAX / SAX, so it sounds like it's stable on the MOS 6502 but probably not in other implementations.

Posted: Mon Jan 16, 2012 1:20 pm
by tlr
andym00 wrote:Apart from lax #$00 which I'm beginning to use more now as it seems stable on the 64 and Ataris, I've never dared use any other value for it, although it does seem there's some good value you can get away with..
This may be of interest: vice-emu/testprogs/general/ane-lax/

IIRC lax #$00 is indeed stable on the machines we tested.

Posted: Tue Jan 17, 2012 7:32 am
by andym00
tlr wrote:This may be of interest: vice-emu/testprogs/general/ane-lax/

IIRC lax #$00 is indeed stable on the machines we tested.
Oh, now that is a lovely analysis of the whole ANE & LAX situation.. I had no idea that that page existed before..
Thanks for all the hard work you must have put into that analysis of its behaviour.. Very useful indeed and, for me at least, puts a lid on a lot of the ambiguity of that instruction.. :)

I also found this yesterday, might have been from a link in one of the files in your ane-lax test programs, or somewhere else entirely I really can't remember this morning, but it's most useful..
http://visual6502.org/wiki/index.php?ti ... (XAA,_ANE)

Posted: Tue Jan 17, 2012 11:55 am
by SparkyNZ
I just had a very quick look at the code.. How on earth is pulse width modulation achieved? Being rusty with 6502 assembly, I'd have to sit down for hours to understand everything before I can figure that one part out.

If anyone understands how pulse width modulation is achieved with the Vic's sound registers, I'd love to hear from you, please. I understand how the registers are "normally" used but I'd love to learn the trick of pulse width mod.