Thinking of a SuperCPU VIC

eslapion · Post by **eslapion** » Thu Nov 03, 2011 2:40 am

Mike wrote:In the broad picture, yes.

Excellent.

The 6502 is removed, and replaced by a small logic which provides A14, A15 and Phi1, Phi2 from the clock input. The new CPU sits on an external cartridge. However, the extra signals do not necessarily need to be provided on the cartridge port, IMO they could just use a signal cable.

Although deciding whether we want the new CPU inside or on the cart port seems secondary to me at this point, your suggestion seems excellent.

I think it is important, that the new CPU is able to have its memory access stretched. It will put the required memory address on the bus, and the CPLD needs enough time to decide, whether the access can be satisfied by fast memory or it has to stall the CPU during the memory access for at least a full non-VIC cycle on the mainboard.

Yes.

I am a bit confused. I expected the CPU to be stalled for at least one complete standard VIC cycle upon access by the new CPU to the old bus. Am I missing something?

Perhaps, on writes to the old bus, we could get the CPLD to latch the written address and value and simply make the required access when the old bus becomes available, freeing the new CPU to operate at high speed for subsequent cycles thereby creating a one byte write-back cache. Of course, this cannot be done for reads to the 6522 and VIC registers.

If this one byte cache is already loaded with data to write on the old bus and the new CPU performs another read or write destined for the old bus then we have no choice but to stall the CPU until the CPLD can complete the required accesses at slow speed.
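To make the buffer idea concrete, here is a minimal Python sketch of the latch-or-stall decision described above. The class and method names are invented for illustration, and it models only whether the CPU can keep running, not real bus timing or the read case.

```python
class OneByteWriteBuffer:
    """Hypothetical model of the CPLD's one-byte write buffer."""

    def __init__(self):
        # (addr, value) waiting for the old bus, or None when empty
        self.pending = None

    def cpu_write(self, addr, value):
        """Return True if the CPU may continue at full speed,
        False if it has to be stalled because the buffer is full."""
        if self.pending is None:
            self.pending = (addr, value)  # latch address and data
            return True
        return False

    def old_bus_slot(self):
        """Called when the old bus becomes available.
        Returns the write that was carried out, or None."""
        done, self.pending = self.pending, None
        return done
```

A first write to the old bus is absorbed at full speed; a second one before the buffer has drained reports a stall, which is the situation the paragraph above describes.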

I suggest we define a policy concerning accesses to the old bus.

1. All accesses to areas other than $1000-$1FFF or $9000-$9FFF go directly to the fast bus (with a possible temporary exception on the BLK5 area at startup to copy the data from the slow bus to ram on the fast bus)

2. All writes to $1000-$1FFF are "write-back" cached by the CPLD. The data is written to the fast bus's RAM and also copied to chip RAM on the next available VIC cycle.

3. All reads to $1000-$1FFF are performed on the fast bus at high speed, just as if they met the criteria for policy 1

4. All writes to $9000-$9FFF are identical to policy 2 except there is nothing to write to at this address range on the fast bus. The old bus receives the written data on the next available VIC cycle.

5. All reads to $9000-$9FFF require a complete stall of the CPU for at least one complete VIC cycle. The CPLD stalls the CPU until the beginning of the next available VIC cycle and the subsequent half cycle, applies to the old bus the requested address, waits a little less than half a VIC cycle to retrieve the data, passes the information to the CPU as soon as it has been properly read from the old bus. Only once the operation is complete can the CPU be unstalled and returned to high speed operation.

Policy 5 is, of course, the most technically demanding and speed taxing.
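The address decode behind these five policies could be sketched in Python roughly as follows. This is only an illustration: the function and constant names are made up, and the temporary BLK5 exception at startup mentioned in policy 1 is ignored.

```python
# Hypothetical sketch of the five access policies; names are illustrative.
VRAM = range(0x1000, 0x2000)  # $1000-$1FFF
IO = range(0x9000, 0xA000)    # $9000-$9FFF


def classify_access(addr: int, is_write: bool) -> int:
    """Return the number (1-5) of the policy that applies to a CPU access."""
    if addr in VRAM:
        return 2 if is_write else 3  # buffered write / fast read
    if addr in IO:
        return 4 if is_write else 5  # buffered write / stalled read
    return 1                         # everything else: fast bus only
```

For example, `classify_access(0x9110, False)` returns 5, the stalled register read, while any access outside the two ranges returns 1.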

I await your feedback on this.

TLovskog · Post by **TLovskog** » Thu Nov 03, 2011 4:47 am

I do not want to complicate things when the discussion is ongoing. However, since we will in any case introduce incompatibilities in the hardware, another idea would be to go full monty.

Create a new motherboard altogether with a physical compatibility with ...
# User port as usual, but with a real hardware based RS232.
# Datasette port with its tape drive support ... or ... use the "hole" in the casing for an HDMI connector and SD connector.
# Cartridge as it is
# IEC port
# Audio / Video output with composite and S-Video.
# Original Keyboard

Hardware would then be ...
# FPGA with functions for VIC, CPU (including 1 MHz to ??? MHz dynamics), VIA and Glue
# RAM/FLASH (SDRAM?)
# Power management
# SD Disk
# HDMI (and Composite/S-Video) encoder

Then we could also add slightly better VIC performance (color depth, resolution, real raster interrupt, HDMI output), RTC, WiFi, etc

The trick is of course to enhance without losing the VIC-20's "soul"

PhilRanger · Post by **PhilRanger** » Thu Nov 03, 2011 5:13 am

A switch to go "non turbo" sounds like a good idea to me if you don't want to have to remove the cart to play regular games and other time critical stuff.

Mike · Post by **Mike** » Sat Nov 05, 2011 8:58 am

eslapion wrote:I am a bit confused. I expected the CPU to be stalled for at least one complete standard VIC cycle upon access by the new CPU to the old bus. Am I missing something?

There are three distinct cases:

1. the CPU accesses slow memory somewhere during the idle phase of VIC,
2. the CPU accesses slow memory just when VIC has released the slow bus,
3. the CPU accesses slow memory, while VIC has the slow bus

In any case, the CPLD has to check the memory address in a fraction of the CPU access cycle.

Then the three cases above result in the following time diagrams (warning: ASCII art):


** CPU accesses slow memory, while VIC idle

     +--+  +--+  +--------------------------------------------------------------------------------------+
     |  |  |  |  |                                                                                      |
CPU  |  |  |  |  |                                                                                      |
     |  |  |  |  |                                                                                      |
...--+  +--+  +--+                                                                                      +--...

     1  2  3  4  5                    6                                7                                8

...--+                                +--------------------------------+                                +--...
     |                                |                                |                                |
VIC  |                                |                                |                                |
     |                                |                                |                                |
     +--------------------------------+                                +--------------------------------+

1. VIC releases bus, CPU accesses fast memory, CPLD places non-destructive read cycle on slow bus
2. CPU ends memory cycle
3. CPU accesses fast memory
4. CPU ends memory cycle
5. CPU accesses slow memory, is caught by CPLD, remaining CPU-VIC cycle too short for VRAM, I/O => stretch
6. VIC takes over slow bus
7. VIC releases bus, CPLD places CPU address on bus (read or write), waits for slow CPU cycle to end
8. VIC takes over slow bus again, next accesses of CPU might again be slow or fast


** CPU accesses slow memory just when VIC has released bus
** CPU accesses slow memory, while VIC has the slow bus

     +--------------------------------+  +--+  +--+  +--+  +--------------------------------------------+
     |                                |  |  |  |  |  |  |  |                                            |
CPU  |                                |  |  |  |  |  |  |  |                                            |
     |                                |  |  |  |  |  |  |  |                                            |
...--+                                +--+  +--+  +--+  +--+                                            +--...

     1                                2  3  4  5  6  7  8  9          10                               11 

...--+                                +--------------------------------+                                +--...
     |                                |                                |                                |
VIC  |                                |                                |                                |
     |                                |                                |                                |
     +--------------------------------+                                +--------------------------------+

1. CPU accesses slow memory right after VIC had released the bus, cycle is stretched.
2. slow memory has been accessed, VIC takes over the bus.
3. 5. 7. CPU accesses fast memory during VIC cycle, access granted.
4. 6. 8. CPU fast memory cycle ends.
9. CPU accesses slow memory during VIC cycle, cycle is stretched.
10. VIC releases bus, slow memory address is put on bus (read or write), waits for slow CPU cycle to end
11. VIC takes over slow bus again, next accesses of CPU might again be slow or fast
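
The three cases in the diagrams can also be written down as a small Python sketch. It assumes a slow cycle made of a CPU half followed by a VIC half, with time 0 being an instant at which VIC releases the bus. The 500 ns half-cycle and the `margin` parameter (how late after the release the CPLD can still start the access) are assumptions for illustration, not measured values.

```python
def slow_access_end(t: int, half: int = 500, margin: int = 0) -> int:
    """Return the time at which a slow-bus access requested at time t ends.

    Each slow cycle is a CPU half [0, half) followed by a VIC half
    [half, 2 * half). Units are arbitrary (nanoseconds here).
    """
    period = 2 * half
    phase = t % period
    cycle_start = t - phase
    if phase <= margin:
        # VIC has just released the bus: use this CPU half directly.
        return cycle_start + half
    # Too late in the CPU half, or VIC owns the bus: stretch the access
    # until the end of the next CPU half.
    return cycle_start + period + half
```

With the defaults, a request at t = 0 ends at 500, while requests at t = 100 (idle phase) and t = 700 (VIC phase) both end at 1500.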

A write buffer, as you suggested, would simply provide another fast access for writes of the CPU to VRAM or I/O, until yet another slow access happens while the buffer is still full.

TLovskog wrote:another idea would be to go full monty [...] The trick is of course to enhance without losing the VIC-20's "soul"

My general opinion about that is, that the hardware identity of a computer is mainly defined by its I/O and chipset, secondary to that, the video (and sound) circuitry, and only weakly dependent on CPU speed and RAM size.

If the hardware can be configured in a way, so the standard BASIC and KERNAL ROMs still run on it, and the display uses the VIC chip, we're home free. That does not exclude enhancements like my VFLI mod or your RAM-link like external SD-card storage solution.

BLK0, the colour RAM and the character RAM (mirrored from ROM and then write-protected) could be realised with dual ported RAM. This way they could be accessed by the CPU even when VIC does its own access. The only slow accesses that would remain then would be reads and writes to the VIC and VIA registers (and possibly to I/O2 and I/O3, if the cartridge port is kept). The remaining memory would be fast SRAM, eventually mirroring the contents of BASIC and KERNAL ROM in BLK6 and BLK7.

Ah, daydreaming on weekends ...

Greetings,

Michael

eslapion · Post by **eslapion** » Sat Nov 05, 2011 3:22 pm

TLovskog wrote:Hardware would then be ...
# FPGA with functions for VIC, CPU (including 1 MHz to ??? MHz dynamics), VIA and Glue
...

The trick is of course to enhance without losing the VIC-20's "soul"

MikeJ made the FPGA VIC PAL only.

To me, the VIC-20's soul is NTSC and he wanted nothing of it so that ends it right there. Yet adapting his work to NTSC parameters would require very little work.

eslapion · Post by **eslapion** » Sat Nov 05, 2011 3:38 pm

Mike wrote:There are three distinct cases:

1. the CPU accesses slow memory somewhere during the idle phase of VIC,
2. the CPU accesses slow memory just when VIC has released the slow bus,
3. the CPU accesses slow memory, while VIC has the slow bus

In any case, the CPLD has to check the memory address in a fraction of the CPU access cycle.

Then the three cases above result in the following time diagrams (warning: ASCII art):
[timing diagrams]
A write buffer, as you suggested, would simply provide another fast access for writes of the CPU to VRAM or I/O, until yet another slow access happens while the buffer is still full.

I agree with this but I think this should only apply to reads or writes that occur when the one-byte write buffer already has something in it.

Also, this should be subject to the access policies I defined above:

1. All accesses to areas other than $1000-$1FFF or $9000-$9FFF go directly to the fast bus (with a possible temporary exception on the BLK5 area at startup to copy the data from the slow bus to ram on the fast bus)

2. All writes to $1000-$1FFF are "write-back" cached by the CPLD. The data is written to the fast bus's RAM and also copied to chip RAM on the next available VIC cycle.

3. All reads to $1000-$1FFF are performed on the fast bus at high speed, just as if they met the criteria for policy 1

4. All writes to $9000-$9FFF are identical to policy 2 except there is nothing to write to at this address range on the fast bus. The old bus receives the written data on the next available VIC cycle.

5. All reads to $9000-$9FFF require a complete stall of the CPU for at least one complete VIC cycle. The CPLD stalls the CPU until the beginning of the next available VIC cycle and the subsequent half cycle, applies to the old bus the requested address, waits a little less than half a VIC cycle to retrieve the data, passes the information to the CPU as soon as it has been properly read from the old bus. Only once the operation is complete can the CPU be unstalled and returned to high speed operation.

As such, do you think you could "adjust" these policies to the timing requirements you just defined?

TLovskog · Post by **TLovskog** » Sat Nov 05, 2011 4:01 pm

eslapion wrote:MikeJ made the FPGA VIC PAL only.

Who is MikeJ and how did he get into the discussion about a VIC 20 implementation in FPGA?

Mike · Post by **Mike** » Sat Nov 05, 2011 6:09 pm

To just ponder about a complete redesign of the mainboard:

eslapion wrote:I agree with this but I think this should only apply to reads or writes that occur when the one-byte write buffer already has something in it.

A dual-ported RAM for BLK0 would provide an equally fast access, not only for one write each µs, but on every write cycle. It would really improve upon a single write buffer:

Also, this should be subject to the access policies I defined above: [...]
1. All accesses to areas other than $1000-$1FFF or $9000-$9FFF go directly to the fast bus (with a possible temporary exception on the BLK5 area at startup to copy the data from the slow bus to ram on the fast bus)

With the dual-ported RAM, accesses in BLK0 won't collide anymore. The whole access arbitration is being taken care of by the memory chip. As the VIC only does reads, there are also no write collisions. Access to RAM is thus always with full speed, regardless whether it is accessible by VIC or not.

Furthermore, expanding the VRAM to the entire BLK0 is one necessary detail to display VFLI images. The colour RAM should also be dual ported, with 16 times capacity to provide the 8x1 pixel colour resolution.

2. All writes to $1000-$1FFF are "write-back" cached by the CPLD. The data is written to the fast bus's RAM and also copied to chip RAM on the next available VIC cycle.

3. All reads to $1000-$1FFF are performed on the fast bus at high speed, just as if they met the criteria for policy 1

I presume you meant CPU half-cycle. This is automatically being taken care of by the dual-port RAM. Except there is no necessity to copy the data at all.

4. All writes to $9000-$9FFF are identical to policy 2 except there is nothing to write to at this address range on the fast bus. The old bus receives the written data on the next available VIC cycle.

With the redesign of the mainboard, register reads and writes to the VIC and VIA registers (and, eventually, I/O 2/3) are the only slow accesses. I would prefer to handle them exactly as in my time diagrams above, with no re-ordering and no write-buffer, so software still could depend on a deterministic timing.

5. All reads to $9000-$9FFF require a complete stall of the CPU for at least one complete VIC cycle. The CPLD stalls the CPU until the beginning of the next available VIC cycle and the subsequent half cycle, applies to the old bus the requested address, waits a little less than half a VIC cycle to retrieve the data, passes the information to the CPU as soon as it has been properly read from the old bus. Only once the operation is complete can the CPU be unstalled and returned to high speed operation.

When the VIC has just released the bus, accesses to its registers or the VIA registers can commence immediately (case 2 in my timing diagrams), no need to wait the whole CPU half-cycle, the next VIC half-cycle, and then yet another CPU half-cycle to finish the access. The CPLD only needs to be fast enough to decide, whether the CPU accesses the registers. Again, the dual-port RAM would do the rest for non I/O accesses.

TLovskog wrote:Who is MikeJ and how did he get into the discussion about a VIC 20 implementation in FPGA?

He designed a complete VIC-20 inside the FPGA. That was some years ago, the implementation only provided unexpanded RAM, and it was generally agreed upon, that the VIC chip simulation was functional (agreeing upon fundamentals of the data sheet), but not completely accurate (regarding cycle exact behaviour).

To me, the VIC-20's soul is NTSC and he wanted nothing of it so that ends it right there. Yet adapting his work to NTSC parameters would require very little work.

Erm, sorry, let's not get into any TV system animosities here. Both VIC chip variants, PAL and NTSC, have equal claims to be used in a redesign.

Also, Francois, could you please shorten your quote of the timing diagram - it is already big enough to be shown once, on the same page.

I'm mainly bringing the mainboard redesign into play, as the external cartridge solution would - in a certain sense - anyway 'degrade' the original mainboard into a graphics adapter card for the external computer.

Mike · Post by **Mike** » Sun Nov 06, 2011 3:31 pm

Here's a candidate for the job as dual-port SRAM in BLK0, 5V operating voltage, TTL-compatible, 15, 25, or 55 ns cycle time, in production:

Cypress CY7C144E 8K×8 Dual-port Static RAM with SEM, INT, BUSY

eslapion · Post by **eslapion** » Mon Nov 07, 2011 1:43 am

TLovskog wrote:
eslapion wrote:MikeJ made the FPGA VIC PAL only.
Who is MikeJ and how did he get into the discussion about a VIC 20 implementation in FPGA?

See link:
http://www.fpgaarcade.com/vic20_main.htm

@Mike:
Using dual ported SRAM would solve a lot of access questions with regards to BLK0 but this would require disabling or removing the SRAM already present in the VIC. This is a major undertaking.

This is no longer a SuperCPU for the VIC-20 but rather a VIC-20 modding to give it a faster CPU.

I thought we could implement the timing and access policies discussed previously either with a CPLD or, if we're conservative enough using only a couple of 74HCTXXX chips.

This project is taking a direction that exceeds my technical know-how.

One possible suggestion; if you want to go the dual port SRAM way then perhaps you want a product that will require removing the 6502 and plugging in some DIP connector and cable there as well as removing the 656X and piggybacking the VIC chip on top of the PCB of the accelerator on that socket.

This way, all the BLK0 RAM is on the accelerator anyways so all of it is VRAM. There is no need to modify the mainboard of the VIC-20. You could even declare all BLK 1, 2 and 3 as VRAM as well if you feel like it.

TLovskog · Post by **TLovskog** » Mon Nov 07, 2011 2:13 am

http://www.fpgaarcade.com/vic20_main.htm

I see. That was cool. Then we are half way there ...

Using dual ported SRAM would solve a lot of access questions with regards to BLK0 but this would require disabling or removing the SRAM already present in the VIC. This is a major undertaking.

This was a sidestep where the discussion was about creating a totally new motherboard. In that case a dualport SRAM would be the first choice.

Mike · Post by **Mike** » Mon Nov 07, 2011 5:29 pm

eslapion wrote:... perhaps you want a product that will require removing the 6502 and plugging in some DIP connector and cable there as well as removing the 656X and piggybacking the VIC chip on top of the PCB of the accelerator on that socket.

The only 'active' elements remaining on the mainboard then would be the VIA chips. Most of the glue logic and the old SRAM chips wouldn't serve any use anymore. Then you could put the VIAs onto the cartridge as well, ditch the old mainboard, and put the 'cartridge' as new mainboard into the case of the VIC-20.

This way, all the BLK0 RAM is on the accelerator anyways so all of it is VRAM. [...] You could even declare all BLK 1, 2 and 3 as VRAM as well if you feel like it.

Unless you introduce yet another banking scheme, the VIC chip has only 14 address lines, and can only access 16K memory. It sees BLK4 as '$0000 .. $1FFF', and BLK0 as '$2000 .. $3FFF' in its own address range.
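
As a small illustration of that mapping, here is a Python sketch that translates a 14-bit VIC address into the CPU address it refers to, using the usual VIC-20 block layout (BLK4 at $8000-$9FFF, BLK0 at $0000-$1FFF). The function name is made up for this example.

```python
def vic_to_cpu(vic_addr: int) -> int:
    """Map a 14-bit VIC address to the corresponding CPU address."""
    if not 0 <= vic_addr <= 0x3FFF:
        raise ValueError("VIC has only 14 address lines")
    if vic_addr < 0x2000:
        return 0x8000 + vic_addr  # VIC $0000-$1FFF -> BLK4 ($8000-$9FFF)
    return vic_addr - 0x2000      # VIC $2000-$3FFF -> BLK0 ($0000-$1FFF)
```

So VIC address $3E00 lands on CPU address $1E00, and nothing in BLK1 to BLK3 is reachable without an extra banking scheme.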

eslapion · Post by **eslapion** » Mon Nov 07, 2011 7:17 pm

Mike wrote:
eslapion wrote:... perhaps you want a product that will require removing the 6502 and plugging in some DIP connector and cable there as well as removing the 656X and piggybacking the VIC chip on top of the PCB of the accelerator on that socket.
The only 'active' elements remaining on the mainboard then would be the VIA chips. Most of the glue logic, and the old SRAM chips wouldn't serve anymore any use. Then you could put the VIAs onto the cartridge as well, ditch the old mainboard, and put the 'cartridge' as new mainboard into the case of the VIC-20.

I would have to disagree a little...

If you keep the original mainboard, you still have the power management architecture. The 2-Prong VIC and the VIC-20 Cr have completely different power supplies and connectors. They also have a completely different way of dealing with the power management of the datasette port.

There is also all the "glue electronics" (including analog electronics) around the 656X which takes care of clocking signals and converting what comes out of the 656X into manageable video and audio signals. These are very different from PAL to NTSC systems as well as on the 2-prong VIC vs the VIC-20 Cr.

If you keep the mainboard, you don't have to make 4 different accelerators. You can make one product that will fit in all 4 variants of the VIC-20 (PAL 2-Prong, NTSC 2-Prong, PAL VIC-20Cr, NTSC VIC-20 Cr)

This way, all the BLK0 RAM is on the accelerator anyways so all of it is VRAM. [...] You could even declare all BLK 1, 2 and 3 as VRAM as well if you feel like it.
Unless you introduce yet another banking scheme, the VIC chip has only 14 address lines, and can only access 16K memory. It sees BLK4 as '$0000 .. $1FFF', and BLK0 as '$2000 .. $3FFF' in its own address range.

Was just an idea. I guess having all the BLK0 RAM available to the VIC is already a great bonus.

TLovskog · Post by **TLovskog** » Tue Nov 08, 2011 12:28 am

Well, some more thoughts ...

If you keep the original mainboard, you still have the power management architecture. The 2-Prong VIC and the VIC-20 Cr have completely different power supplies and connectors. They also have a completely different way of dealing with the power management of the datasette port.

There is also all the "glue electronics" (including analog electronics) around the 656X which takes care of clocking signals and converting what comes out of the 656X into manageable video and audio signals. These are very different from PAL to NTSC systems as well as on the 2-prong VIC vs the VIC-20 Cr.

That I would actually consider an opportunity. Since all connectors are mechanically arranged on an aluminum faceplate, we can change that also. Let's fit a standard barrel connector for a standard power supply. Then make a modern power management circuitry with a switched power supply. Granted, I have not checked all schematics (only rev. B (PAL), D, E and N (NTSC)), but I can't see any big differences in the power to the datasette.

If we continue the route on creating a totally new mainboard where we house all electronics (the digital parts, including video/audio) in an FPGA, then we can also place HDMI output, SD Disk and perhaps a USB input for modern joysticks (converted into VIC internal stuff) on the side.

If we use the old VIC/VIA etc parts on a new PCB then we have to deal with the analog parts. But that is not that different. Can probably be solved without too much grief. Perhaps different component placements, but same PCB.

One thing to consider if we go the full monty FPGA way is that it would probably require a 4-layer PCB. It would be cost prohibitive to do the whole main board like that. So I guess a module with the fine pitch stuff is needed, which is then connected to a 2-layer PCB for the mechanical size.

Mike · Post by **Mike** » Tue Nov 08, 2011 2:11 am

eslapion wrote:If you keep the original mainboard, you still have the power management architecture. [...] The 2-Prong VIC and the VIC-20 Cr have completely different power supplies and connectors. They also have a completely different way of dealing with the power management of the datasette port. [...] There is also all the "glue electronics" (including analog electronics) around the 656X which takes care of clocking signals and converting what comes out of the 656X into manageable video and audio signals. These are very different from PAL to NTSC systems as well as on the 2-prong VIC vs the VIC-20 Cr.

Keeping an old PSU would be of the least concern to me. You can fit a redesigned mainboard into any case of the VIC-20, regardless whether it had been a 2-prong or CR before. Only the vertical room requirements should be met in any case. It is absolutely clear, that the audio/video part of the old mainboard would be incorporated into the new mainboard. There are just so many variants of it, because CBM tinkered around a lot there. This is of no significance, when the mainboard is replaced.

If you keep the mainboard, you don't have to make 4 different accelerators. You can make one product that will fit in all 4 variants of the VIC-20 (PAL 2-Prong, NTSC 2-Prong, PAL VIC-20Cr, NTSC VIC-20 Cr)

Simply put, I don't want to keep the old mainboard. One single new design, adaptable to a NTSC or PAL VIC chip.

TLovskog wrote:If we continue the route on creating a totally new mainboard where we house all electronics (the digital parts, including video/audio) in an FPGA, then we can also place HDMI output, SD Disk and perhaps a USB input for modern joysticks (converted into VIC internal stuff) on the side.

A second joystick port and SD-card support, o.k., but I don't think HDMI is necessary. I've converted my VIC-20 to S-Video, and that is about all you need to get the video in good quality out of the base unit.

If we use the old VIC/VIA etc parts on a new PCB then we have to deal with the analog parts. But that is not that different. Can probably be solved without too much grief. Perhaps different component placements, but same PCB.

As long as there is no FPGA re-implementation of the VIC which is confirmed to have exactly the same behaviour down to cycle level, I'd want to keep the original.

eslapion wrote:I guess having all the BLK0 RAM available to the VIC is already a great bonus.

It sure is.