6502 asm - Swapping pairs of bits
I would appreciate any help with swapping pairs of bits within a byte. This would be for on-the-fly modification of multicolor graphics, based on a table mapping the values 0-3 to different values.
I've come up with the following as a first stab, but it looks woefully slow and inefficient. Any ideas how to make it smaller and, more importantly, faster?
; Swap tables values for 0-3
SwapTable: .byte %00 ; 00 ==> 00
SwapTable1: .byte %10 ; 01 ==> 10 (binary; a plain 10 would be decimal ten)
SwapTable2: .byte %01 ; 10 ==> 01
SwapTable3: .byte %11 ; 11 ==> 11
; Bit pairs to process
mask1 = %00000011
mask2 = %00001100
mask3 = %00110000
mask4 = %11000000
; Resultant value
Output: .byte 0
lda bits
tay
and #mask1
tax
lda SwapTable, x
sta Output
tya
and #mask2
lsr
lsr
tax
lda SwapTable , x
asl
asl
ora Output
sta Output
tya
and #mask3
lsr
lsr
lsr
lsr
tax
lda SwapTable , x
asl
asl
asl
asl
ora Output
sta Output
tya
and #mask4
lsr
lsr
lsr
lsr
lsr
lsr
tax
lda SwapTable, x
asl
asl
asl
asl
asl
asl
ora Output
sta Output
Last edited by beamrider on Wed Aug 28, 2013 4:00 am, edited 1 time in total.
- Mike
- Herr VC
- Posts: 4832
- Joined: Wed Dec 01, 2004 1:57 pm
- Location: Munich, Germany
- Occupation: electrical engineer
If it has to be fast, don't bother with saving memory. Use a table with all 256 byte values.
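For illustration, the lookup itself could be as small as the sketch below. Swap256 is a hypothetical, page-aligned 256-byte table that already holds the converted value for every input byte; bits and Output are the names from the first post.

```asm
        ldx bits            ; the input byte is the table index
        lda Swap256,x       ; Swap256 = hypothetical precomputed 256-byte table
        sta Output          ; converted byte
```

That is roughly 10 to 12 cycles, depending on whether bits and Output live in zero page, at the cost of 256 bytes per mapping.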
Otherwise, try the routine below. Ideally, 'src' and 'dst' should be located in the zeropage.
Code: Select all
LDX #4
LDA #0
.loop
ASL src
ROL A
ASL src
ROL A
TAY
LDA table,Y
ASL A
ROL dst
ASL A
ROL dst
DEX
BNE loop
.table
EQUB &00:EQUB &80:EQUB &40:EQUB &C0
- wimoos
- Vic 20 Afficionado
- Posts: 346
- Joined: Tue Apr 14, 2009 8:15 am
- Website: http://wimbasic.webs.com
- Location: Netherlands
- Occupation: farmer
You could do without the table. And there is no need to load A with 0 four times:
Code: Select all
LDX #4
.loop
ASL src
ROL A
ASL src
ROL dst
LSR A
ROL dst
DEX
BNE .loop
It could go shorter with src in A:
Code: Select all
LDX #4
.loop
ASL A
PHP
ASL A
ROL dst
PLP
ROL dst
DEX
BNE .loop
Regards,
Wim.
Last edited by wimoos on Tue Aug 27, 2013 12:42 pm, edited 1 time in total.
VICE; selfwritten 65asmgen; tasm; maintainer of WimBasic
- Mike
- Herr VC
- Posts: 4832
- Joined: Wed Dec 01, 2004 1:57 pm
- Location: Munich, Germany
- Occupation: electrical engineer
That's a nice solution indeed. It uses the Accumulator as temporary storage for the carry flag (ROL A and LSR A could be replaced by PHP and PLP to illustrate that better, but that would of course cost some cycles).
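To put rough numbers on that, here is the ROL A/LSR A routine again with the standard NMOS 6502 cycle counts as comments. This assumes src and dst are in zero page and that the branch does not cross a page boundary; neither is stated in the thread. The syntax is ca65-style.

```asm
        ldx #4          ; 2
loop:   asl src         ; 5  C = high bit of the pair
        rol a           ; 2  park it in bit 0 of A
        asl src         ; 5  C = low bit of the pair
        rol dst         ; 5  low bit goes into dst first
        lsr a           ; 2  C = the parked high bit
        rol dst         ; 5  high bit follows, so the pair is swapped
        dex             ; 2
        bne loop        ; 3 when taken, 2 on exit
```

That is 29 cycles per pass, or 2 + 4*29 - 1 = 117 cycles in total. PHP (3) plus PLP (4) in place of ROL A (2) plus LSR A (2) would add 3 cycles per pass, 12 overall.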
My variant is slightly more flexible though, in case beamrider wants another mapping of bit pairs.
wimoos wrote: And no need to load A with 0 four times
And I moved LDA #0 out of the loop just before you posted.
- wimoos
- Vic 20 Afficionado
- Posts: 346
- Joined: Tue Apr 14, 2009 8:15 am
- Website: http://wimbasic.webs.com
- Location: Netherlands
- Occupation: farmer
How about this one? Source and destination are both kept in A:
Code: Select all
ldx #4
.loop
asl
php
asl
adc #$00
plp
adc #$00
dex
bne .loop
Wim.
VICE; selfwritten 65asmgen; tasm; maintainer of WimBasic
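A hand trace of the first pass suggests why this version does not swap correctly: the second ADC adds the saved bit into bit 0 instead of shifting it in. The sketch below uses A = %11000000 as an example input and assumes decimal mode is off.

```asm
        lda #%11000000  ; top pair is 11, so the swapped pair should still be 11
        asl a           ; C = 1 (high bit), A = %10000000
        php             ; save C = 1
        asl a           ; C = 1 (low bit),  A = %00000000
        adc #$00        ; A = %00000001, C = 0
        plp             ; C = 1 again (the saved high bit)
        adc #$00        ; A = %00000010: the pair now reads 10, not 11
```

Mike's PHP/ADC/PLP/ROL variant further down shifts the saved bit in with ROL instead of adding it.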
- wimoos
- Vic 20 Afficionado
- Posts: 346
- Joined: Tue Apr 14, 2009 8:15 am
- Website: http://wimbasic.webs.com
- Location: Netherlands
- Occupation: farmer
I stand corrected. I should have tested it better.
But, I got it going:
Code: Select all
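LDX #$04
.loop
ASL ; C = high bit of the pair, bit 7 = low bit of the pair
BPL .bit7clr ; low bit clear: leave bit 0 at 0
ORA #$01 ; low bit set: copy it into bit 0
.bit7clr
ROL ; bit 0 moves up to bit 1, C (the high bit) enters bit 0
DEX
BNE .loop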
Wim.
VICE; selfwritten 65asmgen; tasm; maintainer of WimBasic
- Mike
- Herr VC
- Posts: 4832
- Joined: Wed Dec 01, 2004 1:57 pm
- Location: Munich, Germany
- Occupation: electrical engineer
@wimoos: How about these two?
Code: Select all
LDX #4
ASL A
.loop
PHP
ASL A
ADC #0
PLP
ROL A
DEX
BNE loop
Code: Select all
TAY
AND #$55
ASL A
STA temp
TYA
AND #$AA
LSR A
ORA temp
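As a worked example of the mask-and-shift version, here is a trace for one input byte. The value $1B and the zero-page address for temp are assumptions chosen for illustration; the syntax is ca65-style.

```asm
temp    = $fb           ; assumed free zero-page location

        lda #$1b        ; A = %00011011, pairs 00 01 10 11
        tay             ; Y keeps a copy of the input
        and #$55        ; A = %00010001 ($11): low bit of each pair
        asl a           ; A = %00100010 ($22): moved to the high position
        sta temp        ; temp = $22
        tya             ; A = $1B again
        and #$aa        ; A = %00001010 ($0A): high bit of each pair
        lsr a           ; A = %00000101 ($05): moved to the low position
        ora temp        ; A = %00100111 ($27), pairs 00 10 01 11
```

Without the LDA, the eight instructions come to 12 bytes and about 18 cycles when temp is in zero page, with no loop at all.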
beamrider wrote: This would be time critical so perhaps a look up table might be a good idea, although I would need a separate table per required set of mappings I guess.
It might still be feasible to use my first given example to precalculate such a table as needed.
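One way to read that suggestion, as a sketch only: run the four-pair loop once for every byte value at start-up and store the results. BuildTable, Swap256, PairMap and value are hypothetical names, the zero-page addresses are assumptions, and the syntax is ca65-style rather than the BBC BASIC assembler style of the original routine.

```asm
src     = $fb           ; assumed free zero-page locations
dst     = $fc
value   = $fd

BuildTable:
        lda #0
        sta value       ; value = byte currently being converted
next:   lda value
        sta src         ; working copy, shifted out below
        ldx #4          ; four bit pairs per byte
        lda #0
pair:   asl src
        rol a
        asl src
        rol a           ; A = current pair (0-3)
        tay
        lda PairMap,y   ; mapped pair, held in bits 7-6
        asl a
        rol dst
        asl a           ; A is back to 0 here
        rol dst
        dex
        bne pair
        ldy value
        lda dst
        sta Swap256,y   ; store the converted byte
        inc value
        bne next        ; stop when value wraps from 255 to 0
        rts

PairMap: .byte $00, $80, $40, $C0   ; 00->00, 01->10, 10->01, 11->11
Swap256: .res 256                   ; room for the finished table
```

Changing the four PairMap bytes gives a table for a different mapping, so one builder can serve every set of mappings that is needed.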
- Mike
- Herr VC
- Posts: 4832
- Joined: Wed Dec 01, 2004 1:57 pm
- Location: Munich, Germany
- Occupation: electrical engineer
Hmm..., this one employs an EOR mask, but it's longer and needs two temporaries (code below).
If you don't mind undocumented opcodes, you can replace the instruction pair ASL A/ORA temp2 with SLO temp2. The instruction pair LSR A/EOR temp also looked like a good candidate to be replaced by SRE temp; unfortunately, SRE also changes the contents of temp, which are still needed in the final EOR temp instruction.
Code: Select all
STA temp
LSR A
EOR temp
AND #$55
STA temp2
ASL A
ORA temp2
EOR temp
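For illustration, here is the same routine with the SLO substitution applied and a hand trace for one input byte. It assumes an assembler that accepts the SLO mnemonic; the input value $1B and the zero-page addresses are assumptions as well.

```asm
temp    = $fb           ; assumed free zero-page locations
temp2   = $fc

        lda #$1b        ; A = %00011011, pairs 00 01 10 11
        sta temp        ; temp  = $1B
        lsr a           ; A = %00001101 ($0D)
        eor temp        ; A = %00010110 ($16): even bits = XOR of each pair
        and #$55        ; A = %00010100 ($14): 1 where a pair's bits differ
        sta temp2       ; temp2 = $14
        slo temp2       ; undocumented: temp2 = $28, then A = $14 OR $28 = $3C
        eor temp        ; A = %00100111 ($27): differing pairs flipped, i.e. swapped
```

With temp2 in zero page, SLO takes the same 5 cycles as ASL A plus ORA temp2 but is one byte shorter.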
P.S. @beamrider: Would you mind changing the thread title to '6502 asm - Swapping pairs of bits'? You can do that by editing the top post.
Kweepa wrote: I'm convinced this can be done more efficiently with an EOR trick. I'll try to find some time to work on it
Yep, my thoughts too - I wrote a sample last night, but couldn't get it below 31 bytes and 40-odd cycles before I had to quit, so I didn't post it, as wimoos' routine was shorter and faster. I might have another stab today if I get time.
@Mike, I made good use of LAX and SAX in my code...